
Data Store Ingestion Flow

urn:js:virtue:aspire:principle:31.1

TL;DR

The data store ingestion flow is: Raw Data Layer -> Data Vault -> Transform Layer -> Presentation Layer.
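As a rough illustration of that ordering, the sketch below models the four layers as an ordered enumeration and checks that data only moves one step forward at a time. The names `Layer`, `next_layer`, and `is_valid_hop` are hypothetical and are not part of ASPIRE.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The four layers, numbered in ingestion order (illustrative names)."""
    RAW_DATA = 1
    DATA_VAULT = 2
    TRANSFORM = 3
    PRESENTATION = 4


def next_layer(layer):
    """Return the layer that follows `layer`, or None at the end of the flow."""
    if layer is Layer.PRESENTATION:
        return None
    return Layer(layer + 1)


def is_valid_hop(source, target):
    """A hop is valid only if it moves exactly one layer forward."""
    return target == source + 1
```

Under this assumption, a load that tries to go straight from the Raw Data Layer to the Presentation Layer would be rejected by `is_valid_hop`.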

Rationale

Raw Data Layer

A landing area for all files containing data to be made available through the ASPIRE ecosystem. The data will not be stored in domain-specific structures, and it will not be retained for as long as data in the other data stores.

The files in the Raw Data Layer will be unadulterated and in their original published form.
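One way to check that a landed file is still in its original published form is to record a checksum at landing time and compare it later. This is a minimal sketch under that assumption. `land_file`, `is_unaltered`, and the directory layout are illustrative, and the source does not say how ASPIRE enforces this.

```python
import hashlib
from pathlib import Path


def land_file(content: bytes, landing_dir: Path, name: str) -> str:
    """Write the published bytes as-is and return their SHA-256 digest.

    Refuses to overwrite an existing file, so a landed file is never
    silently replaced.
    """
    landing_dir.mkdir(parents=True, exist_ok=True)
    target = landing_dir / name
    if target.exists():
        raise FileExistsError(f"{target} has already been landed")
    target.write_bytes(content)
    return hashlib.sha256(content).hexdigest()


def is_unaltered(path: Path, expected_digest: str) -> bool:
    """Return True if the file on disk still matches the landing digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest() == expected_digest
```

Storing the digest alongside the file (or in a load log) would let later stages confirm they are reading exactly what was published.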

Data Vault

Holds all information that must be retained in a formal, time-variant, non-volatile, persistent data structure for reporting and integrated analysis. Data in this area is held so that all available data relationships can be published to the presentation databases as necessary.
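To make "time-variant" and "non-volatile" concrete, the sketch below keeps an append-only history of attribute values per business key, loosely in the style of a Data Vault satellite. It is an assumption-laden illustration: the `Satellite` class, its method names, and the sample attributes are hypothetical and do not describe the actual ASPIRE vault model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SatelliteRow:
    hub_key: str          # business key of the parent hub (hypothetical)
    load_ts: datetime     # when this version was loaded
    attributes: tuple     # sorted (name, value) pairs, immutable


class Satellite:
    """Append-only attribute history: rows are added, never updated or deleted."""

    def __init__(self):
        self.rows = []

    def as_of(self, hub_key, ts):
        """Return the latest row for `hub_key` loaded at or before `ts`, or None."""
        candidates = [
            r for r in self.rows if r.hub_key == hub_key and r.load_ts <= ts
        ]
        return max(candidates, key=lambda r: r.load_ts, default=None)

    def load(self, hub_key, load_ts, attributes):
        """Append a new version only if the attributes changed. Returns True if added."""
        attrs = tuple(sorted(attributes.items()))
        current = self.as_of(hub_key, load_ts)
        if current is not None and current.attributes == attrs:
            return False
        self.rows.append(SatelliteRow(hub_key, load_ts, attrs))
        return True
```

Because earlier rows are never modified, `as_of` can reconstruct the state of a record at any past point in time, which is what allows relationships to be republished to the presentation databases later.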

Transform Layer

This layer transforms the data into the Presentation Layer data model, applying the business rules and data quality checks needed to do so.
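A transform step of this kind might look like the following sketch, which runs data quality checks first and applies a business rule only to records that pass. The field names (`customer_id`, `gross_amount`, `discount`) and the net-amount rule are invented for illustration and are not taken from ASPIRE.

```python
def quality_errors(record):
    """Return a list of data quality problems found in one record."""
    errors = []
    if not record.get("customer_id"):
        errors.append("missing customer_id")
    amount = record.get("gross_amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        errors.append("gross_amount must be a non-negative number")
    return errors


def apply_business_rules(record):
    """Shape a clean record into the (hypothetical) presentation model."""
    discount = record.get("discount", 0)
    return {
        "customer_id": record["customer_id"],
        "net_amount": round(record["gross_amount"] - discount, 2),
    }


def transform(records):
    """Split records into transformed rows and rejects with their reasons."""
    accepted, rejected = [], []
    for record in records:
        errors = quality_errors(record)
        if errors:
            rejected.append((record, errors))
        else:
            accepted.append(apply_business_rules(record))
    return accepted, rejected
```

Keeping rejects together with their reasons, rather than dropping them, is a design choice made here so that quality failures can be reported back to the data owner.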

Presentation Layer

The presentation databases hold physical or dynamic views of the data necessary to support SQL-based use cases, with conformed business rules applied. Data will be structured according to the anticipated usage types, with some metrics and KPIs repeated between data sets. For example, the MicroStrategy reporting tool works optimally when connected to snowflake-schema data structures (the modelling pattern, not the Snowflake technology), whereas analytical users may find it faster to query flattened data tables containing multiple attributes.
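The difference between the two shapes can be sketched with a few in-memory tables: a snowflake-style structure keeps dimensions normalised (product pointing to category), while the flattened form repeats those attributes on every row. The table contents and the `flatten` helper below are made-up sample data, not ASPIRE structures.

```python
# Snowflake-style: a fact table plus normalised dimension tables.
categories = {10: {"category_name": "Beverages"}}
products = {1: {"product_name": "Tea", "category_id": 10}}
sales = [{"product_id": 1, "units": 3}]


def flatten(sales, products, categories):
    """Join the fact rows to their dimensions, producing one wide row per sale."""
    flat = []
    for sale in sales:
        product = products[sale["product_id"]]
        category = categories[product["category_id"]]
        flat.append(
            {
                "product_name": product["product_name"],
                "category_name": category["category_name"],
                "units": sale["units"],
            }
        )
    return flat
```

The flattened rows cost extra storage because dimension attributes are repeated, but they spare analytical users from writing the joins themselves.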

Implications

None.

Appendix

Data Vault Modelling