Load data as far as possible as soon as possible
urn:js:virtue:aspire:principle:13.1
TL;DR
Data should not be left in files or source systems until a set time of day; it should be loaded as soon as it is available.
Rationale
Where a final presentation layer (PL) table has a dependency on another source, the available data should still be loaded into the RDV/BDV as soon as it arrives. The pipeline then waits for the dependent table to load and continues the load to the PL as soon as possible afterwards.
Data pipelines should also be designed to support multiple loads per day, even where historically they were fed only once a day. This reduces data latency.
Early loading also enables new use cases that can take advantage of the earlier arriving data.
It also gives more time to react to process or quality issues, because a source is loaded immediately rather than at a set time of day that is often outside normal business hours.
Implications
The early loading principle creates a dependency on the orchestration solution: it must be able to suspend a pipeline, wait for dependent data to load, and then resume.
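The suspend-and-resume behaviour can be illustrated with a minimal sketch. It is not tied to any particular orchestration product: the `Pipeline` class, its method names, and the source names `orders` and `customers` are all hypothetical, and the sketch covers a single load cycle only. Each source is loaded to the RDV/BDV the moment it arrives, while the PL load is held back until every dependency is present.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Hypothetical pipeline: sources load on arrival, the PL load waits."""

    pl_dependencies: frozenset                       # sources the PL table needs
    loaded: set = field(default_factory=set)         # sources already in the RDV/BDV
    pl_loaded: bool = False
    log: list = field(default_factory=list)          # ordered record of load steps

    def on_source_arrival(self, source: str) -> None:
        # Load to the RDV/BDV immediately, regardless of other sources.
        self.loaded.add(source)
        self.log.append(f"rdv:{source}")
        self._resume_if_ready()

    def _resume_if_ready(self) -> None:
        # The PL step stays suspended until all dependencies have loaded.
        if not self.pl_loaded and self.pl_dependencies <= self.loaded:
            self.pl_loaded = True
            self.log.append("pl:loaded")


pipeline = Pipeline(pl_dependencies=frozenset({"orders", "customers"}))
pipeline.on_source_arrival("orders")      # loaded early, PL still waiting
pipeline.on_source_arrival("customers")   # last dependency arrives, PL resumes
print(pipeline.log)
```

In a real orchestrator the wait would typically be an event trigger or sensor rather than an in-memory check, but the ordering of steps is the same.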
Where end users ask not to see data change until a set time, provide them with historic or point-in-time views, so that other users can still access the most up-to-date data.
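One possible shape for a point-in-time view is sketched below. It assumes each row carries a load timestamp; the `as_of` function, the `loaded_at` field, and the sample rows are illustrative assumptions, not part of any specific platform. Users who want stable data query as of an agreed cutoff, while everyone else queries as of now.

```python
from datetime import datetime


def as_of(rows, cutoff):
    """Return the latest version of each key loaded at or before `cutoff`."""
    latest = {}
    # Process rows in load order so later versions overwrite earlier ones.
    for row in sorted(rows, key=lambda r: r["loaded_at"]):
        if row["loaded_at"] <= cutoff:
            latest[row["key"]] = row
    return latest


# Illustrative data: key A is loaded twice in one day, key B once.
rows = [
    {"key": "A", "value": 10, "loaded_at": datetime(2024, 1, 1, 6, 0)},
    {"key": "A", "value": 12, "loaded_at": datetime(2024, 1, 1, 11, 0)},
    {"key": "B", "value": 7, "loaded_at": datetime(2024, 1, 1, 9, 30)},
]

frozen = as_of(rows, datetime(2024, 1, 1, 9, 0))     # view fixed at 09:00
current = as_of(rows, datetime(2024, 1, 1, 23, 59))  # most up-to-date view
```

Here `frozen` contains only the 06:00 version of `A`, while `current` contains the 11:00 version of `A` together with `B`. In a database the same idea would usually be expressed as a view filtered on the load timestamp.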