S3/Glacier Flashcards
What is S3?
Simple Storage Service. It provides object-based (blob) storage and was one of the first AWS services, introduced in 2006.
Why would I use S3?
Analytics: data lakes (Athena, Redshift Spectrum, QuickSight), IoT streaming data repository (Kinesis Firehose), AI/ML storage (Rekognition, Lex, MXNet), storage class analysis (S3 management analytics)
Static Web Hosting: simple and massively scalable static website hosting
BitTorrent: use the BitTorrent protocol to retrieve any publicly available object by automatically generating a .torrent file
What is the largest S3 object size?
5TB
What is the largest object in a single PUT?
5GB
When is it recommended to use multi-part uploads?
If your file is larger than 100 MB
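The size limits in the cards above can be sketched as a small decision helper. This is only an illustration: the function name and the returned labels are hypothetical, and treating MB/GB/TB as binary units is an assumption.

```python
# Size limits from the flashcards, interpreted as binary units (an assumption).
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TB          # largest S3 object
MAX_SINGLE_PUT = 5 * GB           # largest object in a single PUT
MULTIPART_RECOMMENDED = 100 * MB  # above this, multipart upload is recommended


def choose_upload_strategy(size_bytes: int) -> str:
    """Return a hypothetical label describing how to upload an object of this size."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes > MAX_OBJECT_SIZE:
        return "too large"
    if size_bytes > MAX_SINGLE_PUT:
        return "multipart required"
    if size_bytes > MULTIPART_RECOMMENDED:
        return "multipart recommended"
    return "single PUT"
```

For example, `choose_upload_strategy(200 * MB)` falls between the 100 MB recommendation and the 5 GB single-PUT limit, so multipart is recommended but not required.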
What is a key?
The unique name of an object within a bucket. It is NOT a file path, even though it looks like one: S3 is an object store with a flat namespace, and the slashes in a key are just part of the name.
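To illustrate why a key only looks like a path, here is a minimal sketch that mimics a prefix-and-delimiter listing over a flat list of keys. The function `list_by_prefix` and the sample keys are hypothetical and do not call S3; they only show that "folders" are derived from key names rather than stored as directories.

```python
def list_by_prefix(keys, prefix="", delimiter="/"):
    """Mimic a prefix + delimiter listing over a flat collection of keys.

    Returns (objects, common_prefixes): keys directly under the prefix, and
    the folder-like prefixes one level below it.
    """
    objects = []
    common_prefixes = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Group everything up to and including the first delimiter.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


# Hypothetical keys; each one is a single flat name, not a nested path.
keys = [
    "photos/2023/cat.jpg",
    "photos/2023/dog.jpg",
    "photos/readme.txt",
    "notes.txt",
]
```

Calling `list_by_prefix(keys, "photos/")` groups the two 2023 keys under one folder-like prefix, even though no directory exists anywhere in the data.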
What are S3 storage classes and what is the purpose of each?
- Standard: frequently accessed data
- Standard-IA: long-lived, infrequently accessed data
- One Zone-IA: long-lived, infrequently accessed, non-critical data
- Reduced redundancy: frequently accessed, non-critical data
- Intelligent-Tiering: long-lived data with changing or unknown access patterns
- Glacier: long-term data archiving with retrieval times ranging from minutes to hours
- Glacier Deep Archive: long-term data archiving with retrieval times within 12 hours
What is intelligent tiering?
Automatically moves objects between access tiers based on how often they are accessed (usage), not on the type of data
Does intelligent tiering add to the cost?
Yes, there is a small per-object monitoring and automation charge, but it is usually offset by the savings from objects moving to lower-cost tiers
What is intelligent tiering archive?
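An optional part of Intelligent-Tiering that automatically moves data to archive tiers (comparable to Glacier and Glacier Deep Archive) after it has not been accessed for a certain period of time. This is NOT lifecycle management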
What is S3 Lifecycle Management?
It transitions objects to other storage classes, or expires (deletes) them, at a set time.
What are the benefits of S3 Lifecycle Management?
- Optimize storage costs
- Adhere to data retention policies using automation
- Helps keep S3 buckets well-maintained
- Data destruction is one of the more difficult tasks and this helps provide clarity and enforcement
What are the Lifecycle rules based on?
- Prefixes
- Tags
- Current vs previous versions
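A lifecycle rule combining these three criteria might look like the sketch below. The helper `build_lifecycle_rule`, the rule ID, the prefix, the tag, and all day counts are hypothetical values chosen for illustration. The dictionary layout follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts as `LifecycleConfiguration`, but nothing is sent to AWS here.

```python
def build_lifecycle_rule(rule_id, prefix, tags, ia_after_days,
                         glacier_after_days, expire_after_days,
                         noncurrent_expire_days):
    """Build one lifecycle rule filtered by prefix AND tags (illustrative values)."""
    return {
        "ID": rule_id,
        "Status": "Enabled",
        # Filter: the rule applies only to objects matching the prefix and all tags.
        "Filter": {
            "And": {
                "Prefix": prefix,
                "Tags": [{"Key": k, "Value": v} for k, v in tags.items()],
            }
        },
        # Current versions: transition to cheaper classes as they age.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Current versions: expire after this many days.
        "Expiration": {"Days": expire_after_days},
        # Previous (noncurrent) versions: expire sooner.
        "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_expire_days},
    }


lifecycle_config = {
    "Rules": [
        build_lifecycle_rule(
            rule_id="archive-logs",        # hypothetical rule name
            prefix="logs/",                # hypothetical prefix
            tags={"team": "analytics"},    # hypothetical tag
            ia_after_days=30,
            glacier_after_days=90,
            expire_after_days=365,
            noncurrent_expire_days=30,
        )
    ]
}

# With boto3 this could then be applied roughly as (not executed here):
# s3.put_bucket_lifecycle_configuration(
#     Bucket="example-bucket", LifecycleConfiguration=lifecycle_config)
```

The rule treats current and previous versions separately, which is how a versioned bucket can keep recent data while old versions are cleaned up on a shorter schedule.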
What is storage class analysis?
A useful tool within S3 that helps you ensure you are using your storage in the most cost-effective manner. You can run reports to view the frequency at which data is accessed and then potentially change storage type.
What is the “Requester Pays” cost option?
The requester rather than the bucket owner pays for requests AND data transfer
Does S3 support tagging?
Yes. You can assign tags to objects for use in cost allocation, billing, security, etc.
What is an S3 Event?
An event occurs when an action is taken on a bucket or object. It can trigger notifications to SNS, SQS, or Lambda
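As a rough sketch of the Lambda case, the handler below pulls the event name, bucket, and key out of an S3 notification. The handler name, the sample bucket, and the sample key are hypothetical, and the sample event is trimmed to the few fields the code reads; real notifications carry more fields.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Collect (event name, bucket, key) tuples from an S3 event notification."""
    results = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in notifications, so decode them.
        key = unquote_plus(s3_info["object"]["key"])
        results.append((record["eventName"], bucket, key))
    return results


# Trimmed, hypothetical sample of a notification for an uploaded object.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/my+report.pdf", "size": 1024},
            },
        }
    ]
}
```

Decoding the key matters because a space in the object name is delivered as `+`, so using the raw value to fetch the object would fail.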