Apply S3 Intelligent Tiering By Default

urn:js:virtue:aspire:proposal:14.1

TL;DR

This decision proposes a standard: apply S3 Intelligent Tiering to all Data Lake buckets by default.

Rationale

The aim is to agree a standard of applying S3 Intelligent Tiering to all Data Lake buckets by default. The default should include the archive tiers and will apply to both the Raw and Curated Data Lake accounts. A bucket can override it with a more specific lifecycle policy where necessary, but we will at least have a default position going forward.

At present, no Terraform configuration exists for the archive tiers, but Banana are looking to contribute this to Terraform so that we can avoid manual intervention.
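As an illustration only, the default described above could be expressed directly against the S3 API. The sketch below builds the two configuration payloads and applies them through a client that is assumed to be a boto3 S3 client. The rule ID, configuration ID, and day thresholds are hypothetical placeholders rather than agreed values; 90 and 180 days are the minimums AWS accepts for the two archive tiers.

```python
from __future__ import annotations

from typing import Any

# Hypothetical defaults; the thresholds are illustrative, not agreed values.
ARCHIVE_AFTER_DAYS = 90        # AWS minimum for the Archive Access tier
DEEP_ARCHIVE_AFTER_DAYS = 180  # AWS minimum for the Deep Archive Access tier


def default_lifecycle_rules() -> dict[str, Any]:
    """Lifecycle configuration that moves every object into Intelligent Tiering."""
    return {
        "Rules": [
            {
                "ID": "default-intelligent-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches the whole bucket
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    }


def default_archive_tiering(config_id: str = "default-archive-tiers") -> dict[str, Any]:
    """Intelligent-Tiering configuration that opts in to both archive tiers."""
    return {
        "Id": config_id,
        "Status": "Enabled",
        "Tierings": [
            {"Days": ARCHIVE_AFTER_DAYS, "AccessTier": "ARCHIVE_ACCESS"},
            {"Days": DEEP_ARCHIVE_AFTER_DAYS, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
        ],
    }


def apply_defaults(s3_client: Any, bucket: str) -> None:
    """Apply both defaults; s3_client is assumed to be a boto3 S3 client."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=default_lifecycle_rules(),
    )
    tiering = default_archive_tiering()
    s3_client.put_bucket_intelligent_tiering_configuration(
        Bucket=bucket,
        Id=tiering["Id"],
        IntelligentTieringConfiguration=tiering,
    )
```

Keeping the payloads in plain functions means a bucket that needs a more specific policy can simply skip `apply_defaults` or pass its own rules instead.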

Analysis

Intelligent Tiering (IT) has been applied by Banana following some analysis (https://sainsburys-confluence.valiantys.net/pages/viewpage.action?spaceKey=GDB&title=Bucket+Lifecycle+Policies).

Key features

  • For a small management fee, AWS monitors object access and moves an object to a lower tier once it has not been accessed for a set number of consecutive days:
    • Frequent Access -> Infrequent Access: 30 days (saves ~45% per GB)
    • Infrequent Access -> Archive Access: minimum 90 days, configurable (saves ~85% per GB)
    • Archive Access -> Deep Archive: configurable (saves ~95% per GB)
  • There are no retrieval fees, although any object that is accessed moves back into Frequent Access and the cycle starts again.
  • Monitoring costs $0.0025 per 1,000 objects per month.
  • Only objects over 128 KB are affected; smaller objects are not monitored or moved.

Full details here: https://aws.amazon.com/blogs/aws/s3-intelligent-tiering-adds-archive-access-tiers/
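To make the trade-off between the monitoring fee and the tier savings concrete, here is a rough estimator built only from the figures in the list above. The function and constant names are hypothetical, the savings percentages are the approximate values quoted in this document, and real bills depend on region and current AWS pricing.

```python
# Figures taken from the key features list; treat them as approximations.
MONITORING_FEE_PER_1000_OBJECTS = 0.0025  # USD per month
MIN_MONITORED_SIZE_BYTES = 128 * 1024     # smaller objects are not monitored

# Approximate saving per GB relative to Frequent Access.
TIER_SAVINGS = {
    "frequent": 0.0,
    "infrequent": 0.45,
    "archive": 0.85,
    "deep_archive": 0.95,
}


def monthly_monitoring_fee(object_sizes_bytes):
    """Monitoring fee in USD for the objects large enough to be monitored."""
    monitored = sum(
        1 for size in object_sizes_bytes if size >= MIN_MONITORED_SIZE_BYTES
    )
    return monitored / 1000 * MONITORING_FEE_PER_1000_OBJECTS


def relative_storage_cost(gb_by_tier):
    """Storage cost as a fraction of keeping everything in Frequent Access."""
    total_gb = sum(gb_by_tier.values())
    if total_gb == 0:
        return 1.0
    weighted = sum(
        gb * (1 - TIER_SAVINGS[tier]) for tier, gb in gb_by_tier.items()
    )
    return weighted / total_gb
```

For example, a bucket with half its data in Frequent Access and half in Infrequent Access would cost roughly 77.5% of the all-Frequent baseline under these assumptions, before the monitoring fee is added.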

Recommendation

  • We update our common modules to ensure Intelligent Tiering is applied to all new Data Lake buckets.
  • We apply Intelligent Tiering to all existing buckets that do not already have a lifecycle policy.
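One way to find the existing buckets covered by the second recommendation is sketched below. It assumes a boto3 S3 client, whose `get_bucket_lifecycle_configuration` call raises an error with code `NoSuchLifecycleConfiguration` when a bucket has no policy. The helper names are hypothetical, and the error is inspected by its `response` attribute so the sketch does not need to import botocore.

```python
def has_lifecycle_policy(s3_client, bucket):
    """Return True if the bucket already has a lifecycle configuration."""
    try:
        s3_client.get_bucket_lifecycle_configuration(Bucket=bucket)
        return True
    except Exception as exc:
        # boto3 raises ClientError, which carries the AWS error code in .response
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "NoSuchLifecycleConfiguration":
            return False
        raise  # anything else (e.g. access denied) should not be hidden


def buckets_without_policy(s3_client, buckets):
    """Filter a list of bucket names down to those with no lifecycle policy."""
    return [b for b in buckets if not has_lifecycle_policy(s3_client, b)]
```

Buckets that already carry a policy are left alone, which matches the intent that a more specific lifecycle policy overrides the default.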

Implications

None.