
S3 to Staging Pipeline

urn:js:virtue:aspire:pattern:.

TL;DR

A pattern for loading S3 Raw landing data into Snowflake staging tables.

Instructions

This pattern shows the flow of data from a producer through to ASPIRE staging. It is not intended to enforce specific technologies, but rather to show the key components and data flows.

Architecture diagram


Key components:

  • Objects are landed in S3 Raw through an assumed role in the landing account, which ensures the objects are owned by the landing account
  • Data should flow from S3 to Snowflake staging directly, using a Snowflake COPY INTO statement. Engineering patterns for how the COPY INTO is invoked will be documented separately
  • Data in S3 is accessed in Snowflake through a Snowflake External Stage over PrivateLink
  • Data flows and services will be orchestrated outside of Snowflake
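
As an illustration of the components above, the following is a minimal sketch of how an orchestrator running outside Snowflake might assemble the COPY INTO statement that loads from the external stage into a staging table. It is not the engineering pattern for invoking COPY INTO, which is documented separately. The function name `build_copy_into`, the table `STAGING.ORDERS`, the stage `RAW_LANDING_STAGE`, the `orders/2024/` prefix and the Parquet file type are all hypothetical and chosen only for the example.

```python
import re

# Hypothetical helper: none of these names come from the pattern itself.
# Accepts plain or dotted identifiers (up to database.schema.object).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")
_PREFIX = re.compile(r"^[A-Za-z0-9_\-/=.]+$")
_FILE_TYPES = {"CSV", "JSON", "AVRO", "ORC", "PARQUET", "XML"}


def build_copy_into(table: str, stage: str, prefix: str = "", file_type: str = "PARQUET") -> str:
    """Build a COPY INTO statement that loads from an external stage into a staging table."""
    # Reject anything that is not a simple identifier, since the values are
    # interpolated directly into the SQL text.
    for name in (table, stage):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if prefix and not _PREFIX.match(prefix):
        raise ValueError(f"unsafe stage prefix: {prefix!r}")
    file_type = file_type.upper()
    if file_type not in _FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type!r}")

    # The external stage (assumed to already exist over the S3 Raw bucket)
    # is referenced with the @ prefix, optionally narrowed to a key prefix.
    location = f"@{stage}/{prefix}" if prefix else f"@{stage}"
    return (
        f"COPY INTO {table}\n"
        f"FROM {location}\n"
        f"FILE_FORMAT = (TYPE = {file_type})"
    )


if __name__ == "__main__":
    print(build_copy_into("STAGING.ORDERS", "RAW_LANDING_STAGE", "orders/2024/"))
```

The orchestrator would then submit the resulting statement to Snowflake through whichever client it uses. The sketch keeps statement construction separate from execution so that the orchestration technology remains a free choice, in line with the intent of this pattern.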