Ingestion Standards for Graph Database
urn:js:virtue:aspire:standard:1.1
TL;DR
Standards for ingesting data into a graph database.
Definition
- Data captured for Graph DB ingestion (through any of the capture processes) must be landed in the Aspire Data Lake Raw Bucket before any filtering, cleansing, or aggregation logic is applied.
- Data from the Aspire Data Lake Raw Bucket should then be loaded into the Snowflake database, where common business entities and Data Modelling Standards are enforced.
- Cleansing, data quality (DQ), and any other transformation rules that define the nodes and edges should be applied in Snowflake, outside the Graph Database.
- Data for the Graph Database should be sourced from Snowflake objects, chosen according to the use case:
  - (4a) RDV, when no transformation or business alignment is required;
  - (4b) BDV, when transformed and business-aligned data is required;
  - (4c) PL, when the data must also be visualised with a traditional method (e.g. MicroStrategy) in addition to the Graph Database use case.
- The Graph Database should be used only for traversal. No transformation logic or rules should reside within it.
- Aspire Principles/Standards must be observed for any PII or commercially sensitive data.
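
The choice between sources (4a), (4b) and (4c) described above can be sketched as a small decision function. This is an illustration only and not part of the standard: the names `SourceLayer` and `choose_source_layer`, the two boolean flags, and the precedence applied when both flags are set (PL first, then BDV, then RDV) are assumptions made for this example.

```python
from enum import Enum


class SourceLayer(Enum):
    """Snowflake layers a graph load may read from (labels as used in this standard)."""
    RDV = "4a"  # no transformation or business alignment required
    BDV = "4b"  # transformed, business-aligned data required
    PL = "4c"   # data is also visualised with a traditional BI tool


def choose_source_layer(needs_business_alignment: bool,
                        needs_traditional_bi: bool) -> SourceLayer:
    """Pick the Snowflake layer for a graph use case.

    Assumed precedence (hypothetical, not stated in the standard):
    a traditional-BI requirement wins, then business alignment,
    otherwise the RDV is used as-is.
    """
    if needs_traditional_bi:
        return SourceLayer.PL
    if needs_business_alignment:
        return SourceLayer.BDV
    return SourceLayer.RDV


if __name__ == "__main__":
    # A use case that needs business-aligned data but no BI reporting.
    layer = choose_source_layer(needs_business_alignment=True,
                                needs_traditional_bi=False)
    print(layer.name, layer.value)
```

Keeping this decision outside the Graph Database is consistent with the rule that the graph holds no transformation logic: the function only selects where to read from, and all shaping of nodes and edges remains in Snowflake.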