
Data Integration

Batch ETL

Principle TL;DR
Read VARIANT through VIEW (...) Principle for reading data held in VARIANT through a VIEW
S3 to Staging Pipeline (...) A pattern for managing data from S3 to Snowflake staging

batch ETL

Principle TL;DR
Data Store Ingestion Flow (...) Data Store Ingestion Flow

Data Development

Principle TL;DR
Load data as far as possible as soon as possible (...) Data should not be left in files or source systems until a set time of day; it should be loaded as soon as it is available.
One Way In (...) Data feeds can arrive via any one of the agreed strategic routes.
One Way Out (...) Access for consumers must only be provided in a controlled manner via the data access layer.
Process All Data (...) Where the appropriate metadata has been provided, all incoming data must be loaded, processed and made available for provisioning via the data access layer.
Retain As-Received Data (...) The data feeds received from the source systems must be stored in an “as-received” state – i.e. before any filtering, transformation, aggregation, etc. is applied.
Source Once, Use Many Times (...) The data feed should include all records, entities and attributes at the lowest level of granularity.
Store Record Level Data (...) Data must be stored at the lowest level of granularity available.
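
The "Retain As-Received Data", "Store Record Level Data" and "Source Once, Use Many Times" principles above can be illustrated with a minimal sketch. It is not taken from the linked principle pages: the in-memory `raw_store`, the `land` and `total_by_customer` helpers, and the `orders` feed with its `customer` and `amount` fields are all hypothetical names chosen for illustration. The idea shown is only that each payload is stored unchanged at record level, and that aggregation happens when a consumer reads, so other consumers can reuse the same landed records.

```python
import json
from collections import defaultdict

# Hypothetical in-memory "as-received" store: every payload is kept verbatim.
raw_store = []


def land(feed_name, payload):
    """Append the payload exactly as received, before any parsing or filtering."""
    raw_store.append({"feed": feed_name, "payload": payload})


def total_by_customer(feed_name):
    """One possible consumer: aggregate on read, leaving the raw records intact."""
    totals = defaultdict(float)
    for row in raw_store:
        if row["feed"] != feed_name:
            continue
        record = json.loads(row["payload"])
        totals[record["customer"]] += record["amount"]
    return dict(totals)


# Illustrative record-level payloads (assumed shape, not from the source).
land("orders", '{"customer": "a", "amount": 10.0, "note": "gift"}')
land("orders", '{"customer": "a", "amount": 5.0}')
land("orders", '{"customer": "b", "amount": 2.5}')
```

Because the aggregate is derived rather than stored in place of the records, a second consumer that needs, say, the `note` attribute can still read it from the same landed data without a new feed.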

Data Federation

Principle TL;DR
Access Controls (...) Access Controls
Data Security (...) Data Security
Data Source (...) Data Source
Data Utilisation (...) Data Utilisation
Descriptive Info (...) Descriptive Info
Fact Structures (...) Fact Structures
Multi Brand Multi Channel (...) Multi Brand Multi Channel
Naming Standards (...) Naming Standards
No Data Quality Metrics (...) No Data Quality Metrics
No Re-Engineering (...) No Re-Engineering
Ownership (...) Ownership
Synonymous With the Presentation Layer (...) Synonymous With the Presentation Layer