Amazon S3 Flashcards
What are objects in S3?
Objects in S3 are file-like entities that contain data. They represent data, whereas S3 buckets represent infrastructure.
Objects are stored in buckets.
What is S3 (Simple Storage Service)?
S3 is an object-based storage service with virtually unlimited capacity. It is serverless, meaning the underlying infrastructure is managed by AWS. The S3 console provides an interface to upload and access data.
What does an S3 object consist of?
An S3 object may consist of:
Key: name of the object
Value: data stored
Version ID: version of the object (if versioning is enabled)
Metadata: additional information
What are ETags in S3?
ETags (entity tags) let you detect whether an object has changed without downloading it. They can also be used to check data integrity and are typically an MD5 hash of the object, although multipart uploads and some encryption settings produce ETags that are not a plain MD5 digest.
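As a rough illustration of the MD5 case, the sketch below computes the hex MD5 digest of some bytes locally, which is what a single-part upload typically reports as its ETag. The helper name `expected_single_part_etag` is hypothetical and not part of any AWS SDK, and the sketch does not cover multipart uploads.

```python
import hashlib


def expected_single_part_etag(data: bytes) -> str:
    """Return the hex MD5 digest of the data.

    Assumption: a single-part upload whose ETag is a plain MD5 digest.
    """
    return hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    print(expected_single_part_etag(b"hello"))
```

Comparing this value with the ETag from a HEAD request is one way to check that a local copy matches the stored object without downloading it.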
When are ETags returned in S3?
ETags are returned on:
PUT: upload, including multipart or copy
GET: download (in the response headers) and list (one ETag per listed object)
HEAD: fetching metadata without downloading the file
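How can ETags be used in combination with conditional requests?
ETags can be used for caching with If-None-Match (skip the download when the cached copy is still current) and for synchronization with If-Match (proceed only if the object has not changed).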
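The two conditional headers can be pictured with a small local simulation. This is only a sketch of the standard HTTP semantics (304 Not Modified, 412 Precondition Failed); the function names are hypothetical and no request is sent to S3.

```python
def get_with_if_none_match(current_etag: str, cached_etag: str) -> int:
    """Caching: return 304 when the client's cached ETag is still current, else 200."""
    return 304 if cached_etag == current_etag else 200


def request_with_if_match(current_etag: str, expected_etag: str) -> int:
    """Synchronization: return 200 only if the object still has the expected ETag, else 412."""
    return 200 if expected_etag == current_etag else 412


if __name__ == "__main__":
    print(get_with_if_none_match("abc123", "abc123"))  # cached copy is current
    print(request_with_if_match("abc123", "old999"))   # object changed since last read
```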
What is the purpose of checksums in S3?
Checksums are used to ensure the data hasn’t become corrupted in transit.
What are S3 Object prefixes?
S3 Object prefixes are part of the object key name. They help organize, group, and filter data.
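A minimal sketch of how prefixes filter and group keys, using a plain Python list in place of a bucket. The function `list_by_prefix` is hypothetical; it only imitates the general idea of listing with a prefix and a delimiter and is not the S3 API.

```python
def list_by_prefix(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) for keys that start with the prefix.

    Keys with a further delimiter after the prefix are rolled up into a
    common prefix, which is how "folders" appear in a listing.
    """
    objects, common = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common)


if __name__ == "__main__":
    keys = ["logs/2024/a.txt", "logs/2025/c.txt", "logs/readme.txt", "images/cat.png"]
    print(list_by_prefix(keys, "logs/"))
```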
What are S3 buckets?
Buckets hold objects, optionally grouped under folders (which are not true folders). Each bucket name must be globally unique, each bucket is created in a specific Region, and buckets represent infrastructure.
What are the key rules for S3 bucket naming?
- Length: 3-63 characters long
- Characters: lowercase letters, numbers, dots (.), and hyphens (-)
- Start and End: must begin and end with a letter or number
- No adjacent periods
- Cannot be formatted as IP addresses
- No uppercase letters, underscores, spaces, or special characters like “@” or “$”
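The rules above can be sketched as a small validator. The function `is_valid_bucket_name` is hypothetical and checks only the rules listed on this card; S3 enforces some further restrictions (such as reserved prefixes) that are not modeled here.

```python
import re

# First and last character must be a letter or digit; 3 to 63 characters in total.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Four dot-separated groups of digits look like an IP address.
_IP_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the listed naming rules only."""
    if not _NAME_RE.fullmatch(name):
        return False  # wrong length, bad characters, or bad first/last character
    if ".." in name:
        return False  # adjacent periods
    if _IP_RE.fullmatch(name):
        return False  # formatted like an IP address
    return True


if __name__ == "__main__":
    for candidate in ["my-bucket.logs", "My-Bucket", "192.168.1.1"]:
        print(candidate, is_valid_bucket_name(candidate))
```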
What are the S3 bucket restrictions and limitations?
- Up to 100 buckets per account by default, 1,000 after a service quota increase request
- Buckets must be empty before deletion
- No max bucket size or limit to the number of objects
- Objects must be between 0 bytes and 5 TB in size (multipart upload is recommended for objects over 100 MB)
- Specific limits for S3 on AWS Outposts
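The size limits in this list suggest a simple upload-planning rule, sketched below. Assumptions to note: sizes are treated as binary units, the default part size of 64 MiB is an arbitrary illustrative choice, and `plan_upload` is a hypothetical helper. The cap of 10,000 parts per multipart upload is an S3 limit that is not stated on this card.

```python
import math

MIB = 1024 ** 2
MULTIPART_THRESHOLD = 100 * MIB    # card: multipart recommended above 100 MB
MAX_OBJECT_SIZE = 5 * 1024 ** 4    # card: objects up to 5 TB
MAX_PARTS = 10_000                 # S3 limit on parts per multipart upload


def plan_upload(size_bytes: int, part_size: int = 64 * MIB):
    """Return ("single", 1) or ("multipart", number_of_parts)."""
    if not 0 <= size_bytes <= MAX_OBJECT_SIZE:
        raise ValueError("object size must be between 0 bytes and 5 TB")
    if size_bytes <= MULTIPART_THRESHOLD:
        return ("single", 1)
    parts = math.ceil(size_bytes / part_size)
    if parts > MAX_PARTS:
        # Grow the part size so the upload fits within the part limit.
        part_size = math.ceil(size_bytes / MAX_PARTS)
        parts = math.ceil(size_bytes / part_size)
    return ("multipart", parts)


if __name__ == "__main__":
    print(plan_upload(50 * MIB))
    print(plan_upload(1024 * MIB))
```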
What are the two types of S3 buckets, and how do their hallmarks compare?
- General Purpose: flat hierarchy, all storage classes except S3 Express One Zone, recommended for most use cases, no prefix limits, 100 per account
- Directory: folder hierarchy, only the S3 Express One Zone storage class, recommended for single-digit millisecond performance on PUT and GET, no prefix limits, 10 per account
What are the characteristics of S3 general purpose bucket folders?
- Do not have true folders
- Creating a folder creates a zero-byte object ending in a forward slash (e.g., myfolder/)
- Objects in a folder have key names that start with the folder prefix
What is unique about S3 folders?
- They are S3 objects and not independent entities
- Do not include permissions or metadata
- Can’t be empty or full
- Aren’t moved; objects with the same prefix are renamed when moved
What happens to the prefix when moving an S3 object to another folder?
Only the prefix is changed; the object itself is not physically moved.
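Seen purely at the level of key names, a move amounts to swapping one prefix for another. The sketch below shows only that string operation; `moved_key` is a hypothetical helper and does not call S3.

```python
def moved_key(key: str, old_prefix: str, new_prefix: str) -> str:
    """Return the key name after moving it from old_prefix to new_prefix."""
    if not key.startswith(old_prefix):
        raise ValueError("key is not under the given prefix")
    return new_prefix + key[len(old_prefix):]


if __name__ == "__main__":
    print(moved_key("drafts/report.pdf", "drafts/", "final/"))
```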
What is metadata in S3?
Metadata provides information about other data but not the content itself. It is useful for categorizing, organizing data, and providing context about data.
When can we attach metadata to S3 Objects?
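Metadata is attached to an S3 object when it is uploaded. To change it afterwards, the object must be copied (for example, onto itself) with the new metadata.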
What are the types of metadata in S3?
Metadata can be either system-defined or user-defined.
Who sets system-defined metadata in S3?
System-defined metadata is set by Amazon (with some exceptions).
How must user-defined metadata be formatted in S3?
User-defined metadata must begin with “x-amz-meta-”. When using the AWS CLI, it should be set as “key=value”, and the “x-amz-meta-” prefix is added automatically.
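A minimal sketch of the prefixing described above. Both helpers are hypothetical: `parse_cli_metadata` handles only a simplified comma-separated "key=value" string, and `to_metadata_headers` shows the resulting header names, lowercased because S3 stores user-defined metadata keys in lowercase.

```python
META_PREFIX = "x-amz-meta-"


def parse_cli_metadata(shorthand: str) -> dict:
    """Parse a simplified 'key=value,key2=value2' string into a dict."""
    pairs = (item.split("=", 1) for item in shorthand.split(",") if item)
    return {key.strip(): value.strip() for key, value in pairs}


def to_metadata_headers(metadata: dict) -> dict:
    """Add the user-defined metadata prefix to every key."""
    return {META_PREFIX + key.lower(): value for key, value in metadata.items()}


if __name__ == "__main__":
    print(to_metadata_headers(parse_cli_metadata("Project=alpha,owner=bob")))
```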
What does WORM stand for and what does it mean?
WORM stands for Write Once Read Many, meaning the data is immutable and cannot be modified or deleted.
What is Object Lock in S3?
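Object Lock prevents objects in a bucket from being deleted or overwritten. It requires versioning and is typically enabled when the bucket is created (newer AWS releases also allow enabling it on existing buckets). It is useful for data integrity and regulatory compliance.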
What are the two types of retention in Object Lock?
Retention period: fixed time
Legal hold: until removed
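The two retention types can be pictured with a small local model. This is a sketch only: `is_deletion_blocked` is a hypothetical function that imitates the rule "blocked while a legal hold is set or the retention period has not expired" and does not reflect every Object Lock mode.

```python
from datetime import datetime, timezone
from typing import Optional


def is_deletion_blocked(now: datetime,
                        retain_until: Optional[datetime],
                        legal_hold: bool) -> bool:
    """Return True while a legal hold or an unexpired retention period applies."""
    if legal_hold:
        return True  # a legal hold has no expiry; it lasts until removed
    return retain_until is not None and now < retain_until


if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    until = datetime(2026, 1, 1, tzinfo=timezone.utc)
    print(is_deletion_blocked(now, until, legal_hold=False))
```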
What are the two request styles in S3?
Virtual hosted-style requests: the bucket name is a subdomain on the host.
Path-style requests: the bucket name is in the request path.
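The difference between the two styles is easiest to see in the URLs themselves. The sketch below builds both forms for a regional endpoint; the function names and the sample bucket, Region, and key are illustrative assumptions.

```python
from urllib.parse import quote


def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    """Bucket name appears as a subdomain of the host."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


def path_style_url(bucket: str, region: str, key: str) -> str:
    """Bucket name appears as the first segment of the request path."""
    return f"https://s3.{region}.amazonaws.com/{bucket}/{quote(key, safe='/')}"


if __name__ == "__main__":
    print(virtual_hosted_url("my-bucket", "us-east-1", "photos/cat.png"))
    print(path_style_url("my-bucket", "us-east-1", "photos/cat.png"))
```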
What will happen to path-style requests in S3?
AWS has announced plans to discontinue path-style requests, and some features work only with virtual hosted-style requests.
What is S3 Standard storage class?
S3 Standard is the default storage class, designed for general-purpose storage for frequently accessed data.
What are the key features of S3 Standard (durability, availability, and so on)?
- High Durability: 11 9’s of durability (99.999999999%)
- High Availability: 4 9’s of availability (99.99%)
- Data Redundancy: Data stored in 3 or more AZs
- Retrieval Time: within milliseconds
- High Throughput: optimized for frequently accessed and/or real-time access data
- Scalability: easily scales to storage size and number of requests
- No retrieval fee
- No minimum storage duration charge
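To get a feel for the durability and availability figures above, the following sketch turns them into expected numbers. It assumes the simplified reading that 11 9's of durability means each object has a 0.000000001% chance of being lost in a given year; the function names are hypothetical.

```python
ANNUAL_LOSS_PROBABILITY = 1e-11   # 1 - 0.99999999999 (11 9's of durability)
HOURS_PER_YEAR = 365 * 24


def expected_annual_losses(object_count: int) -> float:
    """Expected number of objects lost per year under the simplified model."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count: int) -> float:
    """Average number of years before one object is expected to be lost."""
    return 1 / expected_annual_losses(object_count)


def max_annual_downtime_hours(availability: float = 0.9999) -> float:
    """Hours per year the service may be unavailable at the given availability."""
    return (1 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    print(years_per_expected_loss(10_000_000))
    print(max_annual_downtime_hours())
```

Under this model, storing 10 million objects gives an expected loss of about one object every 10,000 years, and 99.99% availability allows a little under one hour of downtime per year.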
What is S3 Reduced Redundancy Storage (RRS)?
S3 Reduced Redundancy Storage is a legacy storage class for non-critical, reproducible data with lower redundancy than S3 Standard. It no longer offers a cost benefit over S3 Standard but remains available for legacy customers.
What are the S3 storage classes sorted by price from highest to lowest?
S3 Standard
S3 Intelligent-Tiering
S3 Express One Zone
S3 Standard-IA (Infrequent Access)
S3 One Zone-IA
S3 Glacier Instant Retrieval
S3 Glacier Flexible Retrieval
S3 Glacier Deep Archive
What is unique about S3 Intelligent Tiering?
S3 Intelligent-Tiering automatically moves objects between access tiers based on observed access patterns and charges an extra monitoring and automation fee.
What is S3 Glacier Instant Retrieval?
S3 Glacier Instant Retrieval is designed for long-term cold storage with instant retrieval.
What are the retrieval options for S3 Glacier Flexible Retrieval?
Standard Retrieval: Typically takes 3-5 hours
Expedited Retrieval: Typically takes 1-5 minutes
Bulk Retrieval: Typically takes 5-12 hours
What is S3 Express One Zone designed for?
S3 Express One Zone is made for consistent single-digit millisecond data access, best suited for frequently accessed data and latency-sensitive applications.
What are the key features of S3 Express One Zone?
The lowest latency available
Access speed up to 10x faster than Standard
Request costs 50% lower than Standard
Data is stored in a single AZ chosen by the user
Data is stored in an S3 Directory Bucket
How are request costs structured for S3 Express One Zone?
For request sizes up to 512 KB, there is a flat per request charge. For portions of requests greater than 512 KB, there are additional per GB charges for PUT and GET operations.
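The pricing structure described above can be sketched as a small formula. The fee values in the example are made-up placeholders, not AWS prices, `request_cost` is a hypothetical helper, and binary units (512 KiB, GiB) are assumed.

```python
FLAT_LIMIT_BYTES = 512 * 1024   # portion covered by the flat per-request charge
GIB = 1024 ** 3


def request_cost(size_bytes: int, flat_fee: float, per_gb_fee: float) -> float:
    """Flat charge per request plus a per-GB charge on the portion above 512 KB."""
    extra_bytes = max(0, size_bytes - FLAT_LIMIT_BYTES)
    return flat_fee + (extra_bytes / GIB) * per_gb_fee


if __name__ == "__main__":
    # Placeholder rates purely for illustration.
    print(request_cost(100 * 1024, flat_fee=0.001, per_gb_fee=0.008))
    print(request_cost(FLAT_LIMIT_BYTES + GIB, flat_fee=0.001, per_gb_fee=0.008))
```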
What is S3 One Zone-IA designed for?
S3 One Zone-IA is designed for less frequently accessed data with reduced availability.
What are the key features of S3 One Zone-IA (durability, availability, and so on)?
- High Durability: 11 9’s of durability (99.999999999%)
- Lower Availability: 99.5% because it is in one AZ
- Cost-Effective Storage: costs 20% less than Standard-IA (which is 50% cheaper than Standard)
- Data Redundancy: Risk of data loss due to storing in only one AZ
- Retrieval Time: within milliseconds
- Use Cases: Secondary backup copies of on-premises data, or data that can be recreated if the AZ fails. Suited to infrequently accessed, non-mission-critical data.
- Pricing:
  - Storage per GB
  - Per Request
  - Minimum storage duration charge (30 days)
  - Retrieval fee
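The two discounts quoted in the cost bullet compound, which the short calculation below works through. The Standard price used in the example is a placeholder rather than a real AWS price, and `relative_prices` is a hypothetical helper.

```python
STANDARD_IA_DISCOUNT = 0.50   # Standard-IA is 50% cheaper than Standard
ONE_ZONE_IA_DISCOUNT = 0.20   # One Zone-IA costs 20% less than Standard-IA


def relative_prices(standard_price: float) -> dict:
    """Derive Standard-IA and One Zone-IA storage prices from a Standard price."""
    standard_ia = standard_price * (1 - STANDARD_IA_DISCOUNT)
    one_zone_ia = standard_ia * (1 - ONE_ZONE_IA_DISCOUNT)
    return {"standard": standard_price, "standard_ia": standard_ia, "one_zone_ia": one_zone_ia}


if __name__ == "__main__":
    print(relative_prices(1.0))
```

With these figures, One Zone-IA storage works out to roughly 40% of the Standard price.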
What is S3 Glacier Instant Retrieval designed for?
S3 Glacier Instant Retrieval is designed for rarely accessed data that needs instant access.
What are the key features of S3 Glacier Instant Retrieval (durability, availability, and so on)?
- High Durability: 11 9’s of durability (99.999999999%) like Standard
- High Availability: 3 9’s of availability (99.9%) like Standard-IA
- Cost-Effective Storage: 68% lower cost than Standard-IA if data is long-lived and accessed once per quarter
- Retrieval Time: within milliseconds
- Use Cases: Rarely accessed data that needs instant access, such as image hosting, online file-sharing apps, medical imaging and health records, news media assets, satellite and aerial imaging.
- Pricing:
  - Storage per GB
  - Per Request
  - Minimum storage duration charge (90 days)
  - Retrieval fee
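The minimum storage duration charge means an object deleted early is still billed for the full minimum period. A rough sketch of that rule follows; the price is a placeholder, a 30-day month is assumed for simplicity, and both function names are hypothetical.

```python
def billable_days(days_stored: int, minimum_days: int = 90) -> int:
    """Days billed: the actual storage time, but never less than the minimum."""
    return max(days_stored, minimum_days)


def storage_charge(size_gb: float, days_stored: int,
                   price_per_gb_month: float, minimum_days: int = 90) -> float:
    """Storage charge assuming a 30-day month (simplification)."""
    months = billable_days(days_stored, minimum_days) / 30
    return size_gb * price_per_gb_month * months


if __name__ == "__main__":
    # 100 GB deleted after 10 days is still billed for 90 days.
    print(storage_charge(100, 10, price_per_gb_month=0.004))
```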
What is S3 Glacier Flexible Retrieval?
S3 Glacier Flexible Retrieval combines S3 and Glacier into a single set of APIs, providing faster retrieval than Vault-based S3 Glacier.
What are the retrieval tiers for S3 Glacier Flexible Retrieval?
Expedited: 1-5 minutes, urgent requests, limited to 250 MB archive size
Standard: 3-5 hours, default option, no archive limit
Bulk: 5-12 hours, no archive limit
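One way to use these tiers is to pick the slowest one that still meets a deadline. The sketch below does that using the upper bound of each time range and the 250 MB limit from the card. Preferring slower tiers on the assumption that they are cheaper is an assumption of this example, and `choose_tier` is a hypothetical helper.

```python
# (name, worst-case hours, archive size limit in MB or None), slowest first.
TIERS = [
    ("Bulk", 12.0, None),
    ("Standard", 5.0, None),
    ("Expedited", 5 / 60, 250),
]


def choose_tier(deadline_hours: float, archive_mb: float):
    """Return the slowest tier that meets the deadline, or None if none fits."""
    for name, worst_case_hours, size_limit_mb in TIERS:
        fits_deadline = worst_case_hours <= deadline_hours
        fits_size = size_limit_mb is None or archive_mb <= size_limit_mb
        if fits_deadline and fits_size:
            return name
    return None


if __name__ == "__main__":
    print(choose_tier(24, 1000))
    print(choose_tier(0.5, 100))
```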
What are the additional costs associated with S3 Glacier Flexible Retrieval?
Retrieval costs are separate from storage costs and include per-GB and per-request charges. Each archived object is also expanded by an additional 40 KB (32 KB for index and metadata, 8 KB for the object name).
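The per-object overhead matters most for many small objects, as the following sketch shows. It assumes 1 KB = 1024 bytes, and `stored_bytes` is a hypothetical helper that only adds the 40 KB overhead to each object size.

```python
KB = 1024
INDEX_AND_METADATA_OVERHEAD = 32 * KB
OBJECT_NAME_OVERHEAD = 8 * KB
PER_OBJECT_OVERHEAD = INDEX_AND_METADATA_OVERHEAD + OBJECT_NAME_OVERHEAD  # 40 KB


def stored_bytes(object_sizes) -> int:
    """Total bytes accounted for, including the 40 KB overhead per archived object."""
    return sum(size + PER_OBJECT_OVERHEAD for size in object_sizes)


if __name__ == "__main__":
    small_objects = [1 * KB] * 1000   # 1,000 objects of 1 KB each
    print(stored_bytes(small_objects))
```

For 1,000 objects of 1 KB each, the overhead is 40 times larger than the data itself, which is why bundling small files before archiving is often suggested.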
What is S3 Glacier Deep Archive?
S3 Glacier Deep Archive combines Amazon S3 and Amazon S3 Glacier into a single set of APIs, offering more cost-effective storage than Glacier Flexible Retrieval but with higher retrieval costs.