Sharing ETL & DQ Events
urn:js:virtue:aspire:proposal:24.1
TL;DR
All squads responsible for providing data into ASPIRE should publish a common message indicating the status of their ETL or DQ processes.
Rationale
As ASPIRE becomes both more critical to the business and more complex, the interdependencies and reuse of data between squads are growing. We need to automate the monitoring of failures and identify where development changes impact these processes.
- Multiple Squads – data reuse across squads is growing, and we do not want to tightly couple ETL DAGs, which would create cross-squad friction and development dependencies.
- Event Driven v Time Driven – decoupling from time-driven methods lets us deliver data to outbound products as soon as it is available. In the event of failure, a reload or restart of an ETL DAG can automatically trigger outbound refreshes, e.g. MicroStrategy Cubes.
- Lineage & Impact – Required for integration into Valkyrie & Stardew support
- Analyst triggers – other applications / tools / ML etc. can easily trigger their own processes based off a recognised event schema.
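The event-driven point above can be sketched as a small in-process dispatcher. Everything here is hypothetical: the `EventBus` class, the `refresh_cube` handler and the in-memory delivery are illustrative stand-ins, and the field names are assumed to follow the structures proposed later in this document. A real deployment would sit on whatever messaging service ASPIRE adopts.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class EventBus:
    """Hypothetical in-memory stand-in for a real message broker."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, event_name: str, handler: Handler) -> None:
        # Register a handler against the name of the upstream event.
        self._handlers.setdefault(event_name, []).append(handler)

    def publish(self, event: Event) -> None:
        # Deliver the event to every handler subscribed to its name.
        for handler in self._handlers.get(str(event["Event Name"]), []):
            handler(event)


refreshed: List[str] = []


def refresh_cube(event: Event) -> None:
    # Refresh only once the upstream load has completed successfully;
    # this list append stands in for a real outbound refresh call.
    if event["Event Status"] == "Completed":
        refreshed.append(f"refresh after {event['Event Name']}")


bus = EventBus()
bus.subscribe("Bload1", refresh_cube)
bus.publish({"Event Name": "Bload1", "Event Status": "Started"})
bus.publish({"Event Name": "Bload1", "Event Status": "Completed"})
print(refreshed)  # prints ['refresh after Bload1']
```

The downstream refresh depends only on the event name and status, not on the DAG that produced them, which is the decoupling the bullets above argue for.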
Proposal
- All squads responsible for providing data into ASPIRE should publish a common message indicating the status of their ETL or DQ processes.
Options
- A single event to cover both ETL & DQ with a common / flexible structure
- Two event types with different structures
Structure for separate event types
ETL
- Event Name ()
- Event Sainsburys Tech Owner (ASPIRE owning squad)
- Event Status (Started, Running, Completed, Failed)
- Event DateTime
- Event Source To Target (Table or Bucket - Can be an Array)
- Next Event
DQ
- Event Name ()
- Event Sainsburys Tech Owner (ASPIRE owning squad)
- Event Status (Started, Running, Completed, Failed)
- Event Outcome (Pass, Fail, Warning)
- Event DateTime
- Event Target (Table or Bucket)
Structure for single event
- Event Name ()
- Event Sainsburys Tech Owner (ASPIRE owning squad)
- Event Type (ETL, DQ)
- Event Status (Started, Running, Completed, Failed)
- Event Outcome (Pass, Fail, Warning - only for DQ)
- Event DateTime
- Event Source To Target (Table or Bucket - Can be an Array)
- Next Event
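As a rough illustration of the single-event option, the field list above could be modelled as one Python dataclass. This is a sketch under assumptions, not an agreed schema: the class and attribute names are invented for the example, the datetime is kept as a plain string because the proposal does not fix a format, and the only rule enforced is the one stated in the list (Event Outcome applies to DQ only).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class EventType(Enum):
    ETL = "ETL"
    DQ = "DQ"


class EventStatus(Enum):
    STARTED = "Started"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"


class EventOutcome(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    WARNING = "Warning"


@dataclass
class AspireEvent:
    """Hypothetical model of the single common event structure."""

    name: str
    owner: str  # ASPIRE owning squad
    event_type: EventType
    status: EventStatus
    event_datetime: str  # format not fixed by the proposal
    source_to_target: List[str] = field(default_factory=list)
    outcome: Optional[EventOutcome] = None  # DQ events only
    next_event: Optional[str] = None

    def __post_init__(self) -> None:
        # The field list marks Event Outcome as "only for DQ".
        if self.event_type is EventType.ETL and self.outcome is not None:
            raise ValueError("Event Outcome applies to DQ events only")


etl = AspireEvent(
    name="Bload1",
    owner="Banana Squad",
    event_type=EventType.ETL,
    status=EventStatus.STARTED,
    event_datetime="03/11/2021:05:15",
    source_to_target=["ADW_XXX_XXX:ADW_AAA_BBB"],
    next_event="Bload2",
)
```

The trade-off is visible in the optional fields: one structure is simpler to route and subscribe to, but some fields are meaningful for only one event type and need a validation rule like the one in `__post_init__`.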
Examples for separate event types
- ETL Event
{
  "Event Name": "Bload1",
  "Event Sainsburys Tech Owner": "Banana Squad",
  "Event Status": "Started",
  "Event DateTime": "03/11/2021:05:15",
  "Event Source To Target": [
    "ADW_XXX_XXX:ADW_AAA_BBB",
    "ADW_CCC_CCC:ADW_DDD_DDD"
  ],
  "Next Event": "Bload2"
}
- DQ Event
{
  "Event Name": "Qtest1",
  "Event Sainsburys Tech Owner": "Banana Squad",
  "Event Status": "Started",
  "Event Outcome": "Warning",
  "Event DateTime": "03/11/2021:05:15",
  "Event Target": "ADW_EEE_EEE"
}
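A consumer could check an incoming DQ message before acting on it. The sketch below assumes the message is JSON whose keys match the DQ field list under "Structure for separate event types". The function name `parse_dq_event` and the decision to treat every listed field as required are assumptions made for illustration, not part of the proposal.

```python
import json
from typing import Any, Dict

# Keys taken from the DQ field list; treating all as required is an assumption.
REQUIRED_DQ_KEYS = {
    "Event Name",
    "Event Sainsburys Tech Owner",
    "Event Status",
    "Event Outcome",
    "Event DateTime",
    "Event Target",
}
ALLOWED_STATUS = {"Started", "Running", "Completed", "Failed"}
ALLOWED_OUTCOME = {"Pass", "Fail", "Warning"}


def parse_dq_event(raw: str) -> Dict[str, Any]:
    """Parse a DQ event message and reject malformed or unknown values."""
    event = json.loads(raw)
    if not isinstance(event, dict):
        raise ValueError("event must be a JSON object")
    missing = REQUIRED_DQ_KEYS - set(event)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if event["Event Status"] not in ALLOWED_STATUS:
        raise ValueError(f"unknown status: {event['Event Status']}")
    if event["Event Outcome"] not in ALLOWED_OUTCOME:
        raise ValueError(f"unknown outcome: {event['Event Outcome']}")
    return event


raw = json.dumps({
    "Event Name": "Qtest1",
    "Event Sainsburys Tech Owner": "Banana Squad",
    "Event Status": "Started",
    "Event Outcome": "Warning",
    "Event DateTime": "03/11/2021:05:15",
    "Event Target": "ADW_EEE_EEE",
})
event = parse_dq_event(raw)
print(event["Event Outcome"])  # prints Warning
```

Rejecting unknown statuses and outcomes at the boundary keeps downstream triggers from firing on messages that do not follow the shared schema.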
Implications
None.