S3 Flashcards
What does S3 stand for?
Simple Storage Service
What is an S3 bucket?
It is a folder/container of objects/files
Why does an S3 bucket name need to be globally unique?
Because the bucket name becomes part of a DNS name (a CNAME is created), such as https://.s3.amazonaws.com and https://.us-east-1.amazonaws.com
What is the structure of an object within an S3 bucket?
Key, Value, Version ID, Metadata, and subresources (Access Control Lists and Torrent)
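How does S3 handle data consistency?
Since December 2020, S3 provides strong read-after-write consistency: after a successful write, overwrite, or delete, the next read reflects the change. Before that, only new objects were immediately readable; after an update or delete, a read could still return the older version.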
What is the availability of S3?
S3 Standard is designed for 99.99% availability, Amazon guarantees 99.9% in its SLA, and S3 is designed for 99.999999999% (11 nines) durability of stored data.
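To make the 11-nines figure concrete, here is a minimal arithmetic sketch. It assumes the durability figure applies per object per year, and the function names are illustrative only.

```python
from fractions import Fraction

# 99.999999999% durability (11 nines) written as an exact fraction.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_loss(object_count: int) -> Fraction:
    """Expected number of objects lost per year at the design durability."""
    return object_count * (1 - DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_loss(object_count)


print(years_per_lost_object(10_000_000))  # 10000
```

In other words, with ten million objects stored, the design target corresponds to losing about one object every 10,000 years on average.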
What are the different S3 storage classes?
S3 Standard: Data is stored redundantly across multiple devices in multiple facilities
S3 Standard-IA (Infrequent Access): Lower storage cost than S3 Standard, but a retrieval fee is charged
S3 One Zone-IA (Infrequent Access): Lower cost than S3 Standard-IA because data is stored in a single Availability Zone
S3 Intelligent-Tiering: Optimizes costs by automatically moving data to the most cost-effective access tier (e.g. infrequent access) without performance impact
S3 Glacier: Cheaper than on-premises storage. Retrieval times range from minutes to hours
S3 Glacier Deep Archive: Lowest-cost class, for data where a retrieval time of 12 hours is acceptable
How is S3 charged?
Storage (volume), requests (traffic), storage management (classes), data transfer, Transfer Acceleration, and replication
What is the maximum file size?
5 TB per object, but at most 5 GB in a single PUT. For files larger than 100 MB, multipart upload is recommended
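The size limits above can be turned into a small planning helper. This is a sketch only: `plan_upload` is a hypothetical function, sizes use binary units, and the 10,000-part and 5 MB minimum-part limits are S3 multipart limits that are not stated in the card. It always follows the 100 MB recommendation, although a single PUT remains possible up to 5 GB.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

MAX_OBJECT_SIZE = 5 * TB        # largest object S3 will store
MAX_SINGLE_PUT = 5 * GB         # largest upload in a single PUT
MULTIPART_THRESHOLD = 100 * MB  # above this, multipart upload is recommended
MAX_PARTS = 10_000              # assumption: S3 multipart part-count limit
MIN_PART_SIZE = 5 * MB          # assumption: S3 minimum part size


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up, without floating-point error."""
    return -(-a // b)


def plan_upload(size_bytes: int, part_size: int = 100 * MB) -> dict:
    """Decide between a single PUT and a multipart upload for a given size."""
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5 TB limit")
    if size_bytes <= MULTIPART_THRESHOLD:
        return {"method": "single_put", "parts": 1}
    # Grow the part size if needed so the part count stays within MAX_PARTS.
    part_size = max(part_size, MIN_PART_SIZE, ceil_div(size_bytes, MAX_PARTS))
    return {
        "method": "multipart",
        "parts": ceil_div(size_bytes, part_size),
        "part_size": part_size,
    }


print(plan_upload(50 * MB))
print(plan_upload(1 * GB))
```

A 1 GB file with the default 100 MB part size is split into 11 parts; a 5 TB object forces a larger part size so that the part count stays at 10,000.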
What is the limit for S3 storage?
Virtually unlimited
What is Transfer Acceleration?
It speeds up transfers by routing data through the nearest edge location and then over the AWS backbone network (the fastest path) to the bucket
How can objects be protected against deletion?
Enable MFA Delete on the bucket (this requires versioning to be enabled)
What is an Access Control List?
Permissions on bucket or object level
Can I have a bucket that has different objects in different storage classes?
Yes, you can have a bucket that has different objects stored in S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, and S3 One Zone-IA.
How is access controlled?
Use Bucket Policies for buckets;
Use Access Control Lists for objects;
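As an illustration of a bucket policy, the sketch below builds a policy document that grants read access to every object in a bucket. The helper name `public_read_policy`, the `Sid` value, and the bucket name `example-bucket` are hypothetical; the code only produces the JSON and does not attach it to any bucket.

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Build a bucket policy JSON document allowing anyone to read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPublicRead",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                # The trailing /* targets the objects, not the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(public_read_policy("example-bucket"))
```

A policy like this applies to the bucket as a whole, whereas an ACL would be set on individual objects.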
How does versioning work?
All writes and deletes of an object are versioned, which makes it a great backup tool. Once enabled, versioning can't be disabled, only suspended (which stops versioning for new objects only). Public visibility must be set per object version.
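The behavior described above can be modeled with a toy in-memory bucket. This is not how S3 is implemented; `VersionedBucket` and its version IDs are made up for illustration. It assumes a delete adds a delete marker on top of the version stack instead of removing data.

```python
import itertools


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (illustration only)."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, body or None)
        self._ids = itertools.count(1)

    def _append(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key, body):
        """Every write creates a new version; older versions are kept."""
        return self._append(key, body)

    def delete(self, key):
        """A delete adds a delete marker (body None) instead of erasing data."""
        return self._append(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # Latest version wins; a delete marker on top hides the object.
            if not versions or versions[-1][1] is None:
                raise KeyError(key)
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not None:
                return body
        raise KeyError((key, version_id))


bucket = VersionedBucket()
first = bucket.put("report.txt", "draft")
bucket.put("report.txt", "final")
bucket.delete("report.txt")
print(bucket.get("report.txt", first))  # draft
```

Even after the delete, the first version is still retrievable by its version ID, which is why versioning works as a backup tool.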
How can lifecycle management be used?
To transition objects (or object versions) from one storage tier to another after a given number of days
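A day-based transition rule can be sketched as follows. The dictionary mirrors the general shape of an S3 lifecycle rule, but the rule ID, the 30 and 90 day thresholds, and the `storage_class_at` helper are assumptions for illustration; the rule is evaluated locally and nothing is sent to AWS.

```python
LIFECYCLE_RULE = {
    "ID": "archive-old-objects",  # hypothetical rule name
    "Status": "Enabled",
    "Filter": {"Prefix": ""},     # empty prefix: applies to every object
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
    ],
}


def storage_class_at(age_days: int, rule: dict = LIFECYCLE_RULE) -> str:
    """Return the storage class an object would have after age_days days."""
    current = "STANDARD"
    # Apply transitions in order of age; the last one reached wins.
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (10, 45, 120):
    print(age, storage_class_at(age))
```

This simplified model ignores details such as when during the day S3 actually applies a transition.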
What kind of governance does S3 have?
With S3 Object Lock, an object can be written once but can’t be deleted or overwritten (write once, read many). It comes in two modes. Governance mode: users can’t delete or alter the object unless they have special permissions. Compliance mode: not even the root user can change the object.
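The difference between the two modes can be expressed as a small decision function. This is a toy model, not AWS code: `can_delete_version` is hypothetical, and the `has_bypass_permission` flag stands in for the special permission mentioned above (assumed here to correspond to `s3:BypassGovernanceRetention`).

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, now, has_bypass_permission=False):
    """Decide whether a locked object version may be deleted (toy model)."""
    if mode is None:
        return True   # no lock on this version
    if now >= retain_until:
        return True   # retention period has expired
    if mode == "GOVERNANCE":
        return has_bypass_permission  # only specially permitted users
    if mode == "COMPLIANCE":
        return False  # nobody, not even root, until retention expires
    raise ValueError(f"unknown Object Lock mode: {mode}")


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
until = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(can_delete_version("GOVERNANCE", until, now, has_bypass_permission=True))
print(can_delete_version("COMPLIANCE", until, now, has_bypass_permission=True))
```

The only input that changes the compliance-mode answer is time: once the retain-until date has passed, the version can be deleted.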
How can performance be increased when using S3?
Spread objects among multiple prefixes/folders. Each prefix supports at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second.
Use multipart upload: recommended for files of 100 MB or more, mandatory above 5 GB
Use S3 byte-range fetches to download faster in parallel, or to fetch just part of a file
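Byte-range fetches rely on the standard HTTP `Range` header, whose end offset is inclusive. The sketch below only computes the header values for splitting an object into chunks; `byte_ranges` is a hypothetical helper and no request is actually sent.

```python
def byte_ranges(object_size: int, chunk_size: int) -> list[str]:
    """Split an object of object_size bytes into HTTP Range header values."""
    if object_size <= 0 or chunk_size <= 0:
        raise ValueError("sizes must be positive")
    ranges = []
    for start in range(0, object_size, chunk_size):
        # Range offsets are zero-based and the end offset is inclusive.
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(10, 4))  # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each range could then be requested on a separate connection and the pieces reassembled in order, or a single range could be used to read only part of a file.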
What is S3 Select or Glacier Select?
It lets you run SQL queries against data in S3 objects, including compressed (e.g. GZIP) CSV files, and retrieve only the data you need, with a performance improvement of up to 400%.
It also allows queries to be run on data stored in Glacier (low-cost data analysis)
Are existing objects in an S3 bucket replicated when a replication rule is first created?
No. Only new object versions are replicated. Note that objects can be replicated to a different account and into a different storage class.
Replicated object permission changes are not replicated
What is the prerequisite for enabling S3 replication?
Versioning, enabled on both the source and destination buckets
What is Snowball?
A petabyte-scale data transport solution that can cost as little as 1/5 of transferring over high-speed internet. It can import data into and export data from S3
What is the difference between Snowball and Snowball Edge?
Snowball is a storage device with 50 TB or 80 TB of capacity. Snowball Edge comes with 100 TB and also provides compute capability. Edge devices can be clustered to run workloads even in places without internet access.
What is the availability of each S3 storage class?
The S3 Standard storage class is designed for 99.99% availability, the S3 Standard-IA storage class is designed for 99.9% availability, the S3 One Zone-IA storage class is designed for 99.5% availability, and the S3 Glacier and S3 Glacier Deep Archive classes are designed for 99.99% availability with an SLA of 99.9%.
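These percentages are easier to compare as the maximum downtime they allow per year. The arithmetic sketch below assumes a 365-day year; `max_downtime_hours` is an illustrative helper and the dictionary simply restates the figures from the card.

```python
from decimal import Decimal

HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years

# Design availability per storage class, as listed in the card above.
AVAILABILITY_PERCENT = {
    "S3 Standard": "99.99",
    "S3 Standard-IA": "99.9",
    "S3 One Zone-IA": "99.5",
}


def max_downtime_hours(availability_percent: str) -> Decimal:
    """Hours per year a service may be unavailable at the given availability."""
    unavailable_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable_fraction * HOURS_PER_YEAR


for storage_class, percent in AVAILABILITY_PERCENT.items():
    print(storage_class, max_downtime_hours(percent))
```

This works out to roughly 53 minutes per year for 99.99%, under 9 hours for 99.9%, and about 44 hours for 99.5%.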