Retain As-Received Data
urn:js:virtue:aspire:principle:9.1
TL;DR
The data feeds received from the source systems must be stored in an “as-received” state – i.e. before any filtering, transformation, aggregation, etc. is applied.
Rationale
The key drivers are:
- Minimise source system impact – In the event of an error, it should be possible to roll back and reload the data into downstream layers without requiring the source systems to recreate the feeds.
- Historic view of received data – Provide the capability to retrospectively view the data as it was received. This is particularly useful because operational systems often maintain only the current view.
- Operational data archive – Provide a read-only data archiving service for decommissioned operational systems.
- Aid data reconciliation – Having a copy of the data in its received state will facilitate data reconciliation activities.
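As an illustration only, the following Python sketch shows one way a feed could be landed byte-for-byte before any downstream processing, with a checksum recorded to support later reload and reconciliation. The function names (land_feed, verify_feed), the directory layout, and the sidecar metadata file are assumptions made for this example and are not prescribed by the principle.

```python
import hashlib
import json
from datetime import datetime
from pathlib import Path


def land_feed(raw: bytes, source: str, archive_root: Path, received_at: datetime) -> Path:
    """Store the feed bytes exactly as received, plus a metadata sidecar.

    Hypothetical layout: <archive_root>/<source>/YYYY/MM/DD/<time>_<hash>.feed
    """
    target_dir = archive_root / source / received_at.strftime("%Y/%m/%d")
    target_dir.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256(raw).hexdigest()
    payload = target_dir / f"{received_at.strftime('%H%M%S')}_{digest[:12]}.feed"

    # No parsing, filtering, or re-encoding: the bytes are written unchanged.
    payload.write_bytes(raw)

    # The sidecar records what was received and when, for reconciliation.
    sidecar = payload.with_suffix(".meta.json")
    sidecar.write_text(json.dumps({
        "source": source,
        "received_at": received_at.isoformat(),
        "size_bytes": len(raw),
        "sha256": digest,
    }))
    return payload


def verify_feed(payload: Path) -> bool:
    """Check that the stored bytes still match the checksum taken on receipt."""
    meta = json.loads(payload.with_suffix(".meta.json").read_text())
    return hashlib.sha256(payload.read_bytes()).hexdigest() == meta["sha256"]
```

With a layout like this, a downstream layer can be rebuilt by re-reading the stored files for the affected dates, and the recorded size and checksum give reconciliation a fixed reference point.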
Implications
The potential implications are:
- Higher storage costs – Keeping a copy of all the feeds for the appropriate retention periods will require significant storage. However, this can be mitigated by opting for a cost-effective storage solution.
- Retention policy – As-received records should be stored for the maximum length of time permitted by the data retention policy. Once that period has expired, the records must be deleted automatically.
- Change management – Effort is required to ensure that changes to the data structure of source feeds can be accommodated, and that retention of “as-received” data can continue.
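The automatic deletion described under the retention policy could be sketched as follows. This is a minimal Python example under stated assumptions: the as-received files live under a single archive directory, and the file modification time is used as a stand-in for the receipt time. The function name purge_expired and its parameters are hypothetical; a real implementation would take the retention period from the applicable data retention policy.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path


def purge_expired(archive_root: Path, retention: timedelta, now: datetime) -> list[Path]:
    """Delete archived files older than the retention period; return what was removed.

    Assumption: file modification time approximates the time of receipt.
    """
    cutoff = now - retention
    removed = []
    for path in sorted(archive_root.rglob("*")):
        if not path.is_file():
            continue
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if modified < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Returning the list of removed paths allows the deletions to be logged, which helps demonstrate that the retention policy is being enforced.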