Advanced Amazon S3 Flashcards
Lifecycle Rules - Transition Actions
Move objects to Standard-IA 60 days after creation
Move objects to Glacier for archiving after 6 months, etc.
Lifecycle Rules - Expiration Actions
Configure objects to expire (be deleted) after some time
e.g. access log files can be deleted after 365 days
Can be used to delete old versions of files (if versioning is enabled)
Can delete incomplete multi-part uploads
Can you create Lifecycle rules for a certain prefix? (e.g. s3://mybucket/mp3/*)
Yes
Can you create Lifecycle rules for certain object tags? (e.g. Department: Finance)
Yes
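The transition, expiration, prefix, and tag cards above can be combined into one configuration. Below is a sketch in the shape expected by boto3's put_bucket_lifecycle_configuration; the bucket name, rule IDs, and day counts are hypothetical examples, not the only valid values.

```python
# Sketch of a lifecycle configuration (rule IDs and day counts are
# example values). The first rule filters on the mp3/ prefix; the
# second filters on the Department: Finance tag.
lifecycle_config = {
    "Rules": [
        {
            "ID": "mp3-archive-rule",
            "Filter": {"Prefix": "mp3/"},  # rule applies only to this prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 60, "StorageClass": "STANDARD_IA"},
                {"Days": 180, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},  # delete objects after one year
            # clean up incomplete multi-part uploads after 7 days
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        },
        {
            "ID": "finance-tag-rule",
            "Filter": {"Tag": {"Key": "Department", "Value": "Finance"}},
            "Status": "Enabled",
            # delete old (non-current) versions after 30 days
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        },
    ]
}

# Applying it would look like this (requires boto3 + AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="mybucket", LifecycleConfiguration=lifecycle_config)
```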
What can Amazon S3 Analytics do with Storage classes?
Helps you decide when to transition objects to the right storage class
Which Storage classes does Amazon S3 Analytics work with?
Only Standard and Standard-IA
S3 - Requester Pays
Option to make the requester pay the cost of the request and data download instead of the bucket owner (the owner still pays for storage)
S3 general charging model?
Bucket owners pay for all S3 storage and data transfer costs associated with their bucket
When would you use Requester Pays?
When you have large objects to share and you don't want to pay the data transfer costs yourself
Use cases for Requester Pays?
Sharing large datasets with other accounts
What must the requester be in order to download objects from a bucket?
Authenticated in AWS (cannot be anonymous)
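A sketch of what the download request looks like from the requester's side. The bucket and key names are hypothetical; the RequestPayer parameter is the real boto3 parameter by which the authenticated requester acknowledges the charges.

```python
# Sketch: downloading from a Requester Pays bucket (bucket/key are
# hypothetical). The requester must be an authenticated AWS principal
# and must explicitly acknowledge the charge.
request_params = {
    "Bucket": "shared-datasets",
    "Key": "large/dataset.parquet",
    "RequestPayer": "requester",  # omitting this makes S3 reject the request
}

# With boto3 and credentials configured:
# import boto3
# obj = boto3.client("s3").get_object(**request_params)
```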
S3 Event Notification use cases
Generate thumbnails of images uploaded to S3
What are the 3 main destinations for S3 event notifications?
SNS
SQS
Lambda Function
S3 Event Notifications with Amazon EventBridge (architecture)
All events from the S3 bucket are sent to Amazon EventBridge.
Based on the rules you set, EventBridge can then route the events to over 18 AWS services as destinations
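The thumbnail use case and the EventBridge option above can both be expressed in one notification configuration. Below is a sketch in the shape expected by boto3's put_bucket_notification_configuration; the Lambda ARN is a hypothetical placeholder.

```python
# Sketch of a bucket notification configuration (ARN is hypothetical).
# It wires new .jpg uploads to a thumbnail-generating Lambda and turns
# on delivery of all bucket events to Amazon EventBridge.
notification_config = {
    "LambdaFunctionConfigurations": [
        {
            "Id": "generate-thumbnails",
            "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:make-thumbnail",
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {"Key": {"FilterRules": [
                {"Name": "suffix", "Value": ".jpg"}]}},
        }
    ],
    # An empty EventBridgeConfiguration enables sending all events
    # to EventBridge.
    "EventBridgeConfiguration": {},
}

# With boto3 and credentials configured:
# boto3.client("s3").put_bucket_notification_configuration(
#     Bucket="mybucket", NotificationConfiguration=notification_config)
```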
When is Multi-Part upload recommended?
For files greater than 100 MB
Mandatory for files over 5 GB
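How a large upload gets split into parts can be sketched with plain arithmetic. The 100 MB part size below is a sketch tied to the recommendation above, not a required value.

```python
# Illustration: splitting a large upload into numbered parts, as a
# multi-part upload does. 100 MB parts are a sketch choice here.
MB = 1024 * 1024
PART_SIZE = 100 * MB

def split_into_parts(total_size, part_size=PART_SIZE):
    """Return (part_number, offset, length) tuples covering total_size bytes."""
    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts

parts = split_into_parts(250 * MB)  # a 250 MB file -> 3 parts of 100/100/50 MB
```

In practice boto3's managed transfers do this splitting automatically once a file crosses the configured multipart threshold.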
S3 Transfer Acceleration (upload/download)
Increases transfer speed by transferring the file to an AWS edge location, which forwards the data to the S3 bucket in the target region
How does S3 Transfer Acceleration work?
If your file is in the USA, you upload it over the public internet (www) to an edge location in the USA, which then sends it to the S3 bucket in Australia over the private AWS network.
This minimises the amount of public internet the data travels through
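In practice, acceleration means uploads go to a dedicated accelerated endpoint instead of the regional one. A sketch with a hypothetical bucket name:

```python
# Sketch: the endpoint change behind Transfer Acceleration
# (bucket name and region are hypothetical).
bucket = "mybucket"
regional_endpoint = f"{bucket}.s3.us-east-1.amazonaws.com"
accelerated_endpoint = f"{bucket}.s3-accelerate.amazonaws.com"

# With boto3 and credentials configured, enabling acceleration on the
# bucket looks like:
# boto3.client("s3").put_bucket_accelerate_configuration(
#     Bucket=bucket, AccelerateConfiguration={"Status": "Enabled"})
# and a client opts in with
# botocore.config.Config(s3={"use_accelerate_endpoint": True}).
```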
S3 Byte-Range Fetches
Parallelise GETs by requesting specific byte ranges
Better resilience in case of failures
S3 Byte-Range Fetches architecture
You have a file in S3; you break it into parts and request those parts in parallel, which speeds up the download
What are the 2 use cases for Byte-Range Fetches?
Speed up downloads
Retrieve only partial data (e.g. the head of a file)
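Both use cases can be sketched locally. The code below simulates the S3 object with an in-memory byte string; with boto3, the same range strings would be passed to get_object(Bucket=..., Key=..., Range=...).

```python
# Sketch of parallel byte-range fetches against a local stand-in for an
# S3 object (no AWS calls are made).
from concurrent.futures import ThreadPoolExecutor

data = bytes(range(256)) * 4096   # 1 MiB stand-in for the S3 object
CHUNK = 256 * 1024                # fetch 256 KiB per request

def ranges(size, chunk=CHUNK):
    """HTTP Range headers covering the whole object, e.g. 'bytes=0-262143'."""
    return [f"bytes={start}-{min(start + chunk, size) - 1}"
            for start in range(0, size, chunk)]

def fetch(range_header):
    # Locally simulates what get_object(Range=range_header) would return.
    start, end = map(int, range_header.removeprefix("bytes=").split("-"))
    return data[start:end + 1]

# Use case 1: request all parts in parallel, then reassemble in order
with ThreadPoolExecutor(max_workers=4) as pool:
    chunks = list(pool.map(fetch, ranges(len(data))))
reassembled = b"".join(chunks)

# Use case 2: retrieve only the head of the file
head = fetch("bytes=0-1023")
```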
S3 Select & Glacier Select
Retrieve less data using SQL - server side filtering
Filter rows and columns
Less network transfer, less CPU cost client-side
How does S3 Select & Glacier Select work?
Instead of receiving the whole file and filtering it client-side, S3 Select performs the filtering server-side on S3 and sends you exactly the data you need.
SQL is used to express the filter
S3 Select & Glacier Select architecture
You ask for a CSV with S3 Select. Amazon S3 does server-side filtering, then sends the filtered dataset to you
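The server-side filtering above can be illustrated locally. The executed code simulates it with the csv module; the commented-out call shows the same filter expressed in SQL via boto3's select_object_content (column and file names are hypothetical).

```python
# Sketch: what an S3 Select query does, simulated locally.
import csv
import io

csv_data = """name,department,salary
alice,Finance,90000
bob,Engineering,95000
carol,Finance,85000
"""

# Server-side equivalent (requires boto3 + credentials; names hypothetical):
# boto3.client("s3").select_object_content(
#     Bucket="mybucket", Key="employees.csv",
#     Expression="SELECT s.name FROM s3object s WHERE s.department = 'Finance'",
#     ExpressionType="SQL",
#     InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
#     OutputSerialization={"CSV": {}})

# Local simulation of the same row/column filtering:
rows = csv.DictReader(io.StringIO(csv_data))
finance_names = [r["name"] for r in rows if r["department"] == "Finance"]
```

Only the filtered names would cross the network, instead of the full CSV.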
S3 Batch Operations
Perform bulk operations on existing S3 objects with a single request
S3 Batch Operations Examples
Modify object metadata & properties
Copy objects between S3 buckets
Encrypt un-encrypted objects
Modify ACLs, tags
Restore objects from S3 Glacier
Invoke Lambda Function to perform custom action on each object
What do S3 Batch Operations manage?
Retries, progress tracking, completion notifications, and report generation
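A job combines a manifest (the object list), an operation, and a report. Below is a sketch in the shape expected by boto3's s3control create_job API; the account ID, ARNs, and ETag are hypothetical placeholders.

```python
# Sketch of an S3 Batch Operations job request (all IDs/ARNs are
# hypothetical). The manifest lists the objects to act on; the
# operation here invokes a Lambda function per object.
batch_job = {
    "AccountId": "123456789012",
    "ConfirmationRequired": False,
    "Operation": {
        "LambdaInvoke": {
            "FunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-object"
        }
    },
    "Manifest": {
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {
            "ObjectArn": "arn:aws:s3:::mybucket/manifest.csv",
            "ETag": "example-etag",
        },
    },
    "Report": {  # completion report: per-object success/failure
        "Bucket": "arn:aws:s3:::mybucket",
        "Enabled": True,
        "Format": "Report_CSV_20180820",
        "ReportScope": "AllTasks",
        "Prefix": "batch-reports",
    },
    "Priority": 10,
    "RoleArn": "arn:aws:iam::123456789012:role/batch-ops-role",
}

# With boto3 and credentials configured:
# boto3.client("s3control").create_job(**batch_job)
```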
How can you use S3 inventory?
To get an object list, then use S3 Select to filter your objects
S3 Inventory use (architecture)
S3 Inventory sends an object list report to S3 Select, and S3 Select sends the filtered list to S3 Batch Operations.
The user also supplies the operations & parameters, and S3 Batch Operations then produces the processed objects
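The inventory report that feeds this pipeline is set up with a bucket-level configuration. Below is a sketch in the shape expected by boto3's put_bucket_inventory_configuration; the bucket names and report ID are hypothetical.

```python
# Sketch of an S3 Inventory configuration (bucket names and ID are
# hypothetical). It produces a daily CSV object list that S3 Select
# can filter and feed into S3 Batch Operations.
inventory_config = {
    "Id": "daily-object-list",
    "IsEnabled": True,
    "IncludedObjectVersions": "Current",
    "Schedule": {"Frequency": "Daily"},
    "Destination": {
        "S3BucketDestination": {
            "Bucket": "arn:aws:s3:::inventory-reports-bucket",
            "Format": "CSV",
            "Prefix": "inventory",
        }
    },
    "OptionalFields": ["Size", "LastModifiedDate", "StorageClass"],
}

# With boto3 and credentials configured:
# boto3.client("s3").put_bucket_inventory_configuration(
#     Bucket="mybucket", Id=inventory_config["Id"],
#     InventoryConfiguration=inventory_config)
```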