S3 Flashcards
S3 Bucket
Container for objects
Stores an unlimited number of objects per bucket
S3 key, S3 value
Key: the name of the file (the object's full path within the bucket)
Value: the binary data of the file
URL Pattern to access S3 website
Depending on your Region, your Amazon S3 website endpoint follows one of these two formats.
s3-website dash (-) Region
http://bucket-name.s3-website-Region.amazonaws.com
s3-website dot (.) Region
http://bucket-name.s3-website.Region.amazonaws.com
What does an S3 object consist of?
Key
Value
Version ID
Metadata
Subresources
Access control information
S3 Gateway Endpoint
EC2 instances connect to S3 using private IP addresses
Used by EC2 instances in private subnets that need to communicate with S3
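As an illustration, a minimal sketch of creating an S3 gateway endpoint with boto3; the Region, VPC ID, and route table ID are placeholders:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Gateway endpoints add S3 routes to the VPC route table, so instances
# in private subnets can reach S3 without internet access
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",             # placeholder VPC
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],   # placeholder route table
)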
File Storage vs Object Storage
File Share
-data stored in directories
-can have a hierarchy of directories
-file systems are mounted to OS (drive name in Windows)
-functions like local storage
-network connection is maintained; don’t need to remount
Object Store
-data stored in buckets
-flat namespace, no hierarchy
-hierarchy can be mimicked with prefixes (e.g. a prefix in the object key name)
-accessed via REST API
-network connection is closed/reset after each request completes
Durability
S3 Durability offers how many 9’s
Protection against data loss and data corruption
11 9’s (99.999999999%)
Availability
S3 Availability offers how many 9’s
a measurement of the amount of time the data is available to you
expressed as percentage of time per year
4 9’s (99.99%)
What are the S3 storage classes and where are they set?
Standard
S3 Intelligent Tiering
Standard IA
One Zone IA
Glacier Instant Retrieval
Glacier Flexible Retrieval
Glacier Deep Archive
Storage class is set at the object level (it applies to individual objects, not to the bucket)
Which storage classes offer fewer 9’s for availability?
One Zone IA offers 99.5% availability;
S3 Intelligent Tiering, Standard IA, and Glacier Instant Retrieval offer 99.9%;
all others offer 4 9’s (99.99%)
Which storage classes have retrieval fees?
How is the retrieval fee measured?
Standard IA
One Zone IA
Glacier Instant Retrieval
Glacier Flexible Retrieval
Glacier Deep Archive
Per GB retrieved
Which storage classes have a minimum storage duration charge?
What is the minimum for each?
Standard IA - 30 days
One Zone IA - 30 days
Glacier Instant Retrieval - 90 days
Glacier Flexible Retrieval - 90 days
Glacier Deep Archive - 180 days
Which storage classes have a minimum capacity charge per object?
What is the minimum for each?
Standard IA - 128KB
One Zone IA - 128KB
Glacier Instant Retrieval - 128KB
Glacier Flexible Retrieval - 40KB
Glacier Deep Archive - 40KB
How many AZ’s is the data replicated in for each storage class?
At least 3 AZ’s for all classes, except One Zone IA which uses only one AZ
What is the Availability SLA for each storage class?
Standard - 99.9%
S3 Intelligent Tiering - 99%
Standard IA- 99%
One Zone IA- 99%
Glacier Instant Retrieval- 99%
Glacier Flexible Retrieval - 99.9%
Glacier Deep Archive - 99.9%
S3 Standard Storage class
Default storage class
for general purpose storage of frequently accessed data.
S3 Intelligent Tiering Storage class
Automatically moves data between tiers based on how you use the data, to optimize cost and performance
The only storage class that delivers automatic storage cost savings when data access patterns change, without performance impact or operational overhead
For data with unknown or changing access patterns; milliseconds to access
Standard IA Storage class
For infrequently accessed data; lower cost for data storage, but a fee for data retrieval plus minimum storage duration and minimum capacity charges
Milliseconds to access
One Zone IA Storage class
For infrequently accessed data, stored in only one AZ
Milliseconds to access
Glacier Instant Retrieval Storage class
Storage class with the fastest access to archival data
Milliseconds to restore/access
Glacier Flexible Retrieval Storage class
For archival data that needs to be accessed less often
Access data within minutes to hours (not seconds); lowest minimum capacity charge per object among the classes where this applies
12 hours or less to restore
Glacier Deep Archive Storage class
For archival data that rarely needs to be accessed
Access data within hours (not seconds or minutes); longest minimum storage duration (180 days)
12 hours or less to restore
Amazon Glacier Storage classes
Used for archival data so you can store it at a much lower cost for a longer time
Bucket Policy
Resource-based policies, attached only to S3 buckets
resource specifies bucket
principal specifies user, group, or role
Action is an S3 action
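For illustration, a minimal bucket policy applied with boto3; the account ID, user name, and bucket name are hypothetical:

import boto3, json

s3 = boto3.client("s3")

# Policy elements: Principal (who), Action (an S3 action), Resource (bucket/objects)
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111122223333:user/example-user"},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}
s3.put_bucket_policy(Bucket="example-bucket", Policy=json.dumps(policy))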
S3 ACL
Access Control List
Legacy access control mechanism that predates IAM
AWS recommends using S3 bucket policies or IAM policies rather than ACLs
ACLs can be attached to the bucket or an object
Limited options for grantees and permissions
When should you use IAM policies vs Bucket policies to control access to S3?
IAM policy if:
-You need to control access to AWS services other than S3
-You have numerous buckets each with different permission requirements
-Prefer IAM policy
S3 bucket policy if:
-You need simple way to grant cross account access to S3 env without IAM roles
-Your IAM policies are reaching the size limit
-Prefer bucket policies
S3 Versioning
A means of keeping multiple variants of an object in the same bucket
Use versioning to preserve, retrieve, and restore every version of every object stored in your bucket
What do version-enabled buckets allow?
recovery of objects from accidental deletion or overwrite
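A quick sketch of enabling versioning with boto3; the bucket name is a placeholder:

import boto3

s3 = boto3.client("s3")

# Versioning is a bucket-level setting; once enabled it can only be suspended
s3.put_bucket_versioning(
    Bucket="example-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)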
What are the two forms of S3 Replication?
Features, constraints?
Cross Region Replication (CRR)
Same Region Replication (SRR)
can be same or different accounts
buckets must have versioning enabled to use replication
Cross Region Replication
Any data that is written in the original region/bucket is written to the region/bucket configured within CRR
a bucket-level configuration
Same Region Replication (SRR)
Any data written to the original bucket is written to a different bucket in the same region, as configured in SRR
can be same or different accounts
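A sketch of configuring replication with boto3; the IAM role ARN and bucket names are placeholders, and both buckets must already have versioning enabled:

import boto3

s3 = boto3.client("s3")

# Replication is configured on the source bucket and requires an IAM role
# that S3 assumes to copy objects to the destination
s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
        "Rules": [{
            "ID": "replicate-all",
            "Status": "Enabled",
            "Priority": 1,
            "Filter": {},                      # empty filter = all objects
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"},
        }],
    },
)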
AWS S3 Lifecycle Management
Transition Actions - Defines when objects transition to another storage class
Expiration actions - Defines when objects expire/are deleted by S3
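A minimal sketch combining both action types in one lifecycle rule via boto3; the bucket name and "logs/" prefix are placeholders:

import boto3

s3 = boto3.client("s3")

# Transition objects to Standard-IA after 30 days, Glacier after 90,
# and expire (delete) them after a year
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={"Rules": [{
        "ID": "archive-then-expire",
        "Status": "Enabled",
        "Filter": {"Prefix": "logs/"},
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": 365},
    }]},
)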
S3 Lifecycle Transitions
Standard to any other
Any storage class to S3 Glacier or S3 Glacier Deep Archive
Standard IA to Intelligent Tiering or One Zone IA
Intelligent Tiering to One Zone IA
What transitions are not allowed?
Any storage class to Standard, Reduced Redundancy
Intelligent Tiering to Standard IA
One Zone IA to Standard IA or Intelligent Tiering
What S3 operations can you add MFA to?
Changing the versioning state of a bucket
Permanently deleting an object version
What are factors of authentication in MFA for S3?
username/password
token generated by a hardware or software MFA device
What is required for MFA to be configured on bucket?
What must be included in request for MFA operations on a bucket?
versioning
The x-amz-mfa request header must be included in the request
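As a sketch, enabling MFA Delete with boto3; the MFA device serial and token code are placeholders, and the request must be made with the bucket owner's (root) credentials:

import boto3

s3 = boto3.client("s3")

# The MFA parameter is "device-serial-number current-token-code";
# boto3 sends it as the x-amz-mfa request header
s3.put_bucket_versioning(
    Bucket="example-bucket",
    MFA="arn:aws:iam::111122223333:mfa/root-device 123456",
    VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
)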
Who can enable MFA delete?
bucket owner (root account)
SSE-S3
Server-side encryption with S3 managed keys uses:
S3 managed keys
Unique object keys
Master key
AES 256
Encryption/decryption happens on AWS side
Secured at rest via SSE-S3 and in transit via TLS
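A minimal sketch of requesting SSE-S3 explicitly on upload with boto3; the bucket and key are placeholders:

import boto3

s3 = boto3.client("s3")

# Request SSE-S3 (AES-256); S3 manages the keys and decrypts transparently on GET
s3.put_object(
    Bucket="example-bucket",
    Key="report.txt",
    Body=b"hello",
    ServerSideEncryption="AES256",
)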
What type of encryption is offered with S3?
SSE-KMS: Server side encryption with AWS KMS
SSE-S3: Server-side encryption with S3 managed keys
SSE-C: Server side encryption with client provided keys
S3 Client Side encryption
S3 Default Encryption
SSE-C
Server side encryption with client provided keys
Client managed keys
Not stored on AWS
S3 Client Side encryption
Client managed keys
Keys are not stored on AWS; you can use your own keys or KMS keys
Encryption/decryption happens on the client side, not in AWS
AWS only sees the encrypted object and cannot encrypt/decrypt it
S3 Default Encryption
All Amazon S3 buckets have encryption configured by default
All new object uploads to Amazon S3 are automatically encrypted
There is no additional cost and no impact on performance
Objects are automatically encrypted by using server side encryption with Amazon S3 managed keys (SSE-S3)
SSE-KMS
Server side encryption with AWS KMS managed keys uses
KMS managed keys
Can be AWS managed keys or customer managed KMS keys
Encryption/decryption happens on AWS side
Secured at rest via SSE-KMS and in transit via TLS
Can you encrypt unencrypted Amazon S3 objects? If so how, if not why?
To encrypt existing unencrypted S3 objects, you can use S3 batch operations
You can also encrypt existing objects using the CopyObject API operation or the copy-object AWS CLI command
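For illustration, re-encrypting a single existing object with the CopyObject approach in boto3; the bucket and key are placeholders:

import boto3

s3 = boto3.client("s3")

# Copying an object over itself rewrites it with the requested encryption
s3.copy_object(
    Bucket="example-bucket",
    Key="legacy-object.bin",
    CopySource={"Bucket": "example-bucket", "Key": "legacy-object.bin"},
    ServerSideEncryption="aws:kms",   # or "AES256" for SSE-S3
)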
How can you enforce encryption with bucket policy
You can force the type of SSE using a condition on the encryption header, e.g. denying uploads where
“s3:x-amz-server-side-encryption” does not equal the required type (AES256 or aws:kms)
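A sketch of such a policy applied with boto3, assuming you want to require SSE-KMS; the bucket name is a placeholder:

import boto3, json

s3 = boto3.client("s3")

# Deny any PutObject whose SSE header is missing or not aws:kms
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {
            "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
        },
    }],
}
s3.put_bucket_policy(Bucket="example-bucket", Policy=json.dumps(policy))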
S3 Multipart Upload
Multipart upload uploads an object in parts, independently, in parallel, and in any order
Performed using the S3 Multipart upload API
It is recommended for objects 100 MB and larger
Can be used for objects from 5 MB to 5 TB
Must be used for objects larger than 5 GB
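A minimal sketch using boto3's transfer manager, which calls the multipart upload API automatically above a size threshold; file, bucket, and key names are placeholders:

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Files above the threshold are split into parts and uploaded in parallel
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # 100 MB, per the guidance above
    max_concurrency=8,
)
s3.upload_file("large-file.bin", "example-bucket", "large-file.bin", Config=config)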
S3 Transfer Acceleration
Uses CloudFront edge locations to improve performance of transfers from client to S3 bucket
Files are uploaded to a CloudFront edge location and then traverse the AWS global infrastructure to reach the bucket
Endpoint is different
http://[bucketname].s3-accelerate.amazonaws.com
http://[bucketname].s3-accelerate.dualstack.amazonaws.com
You are only charged for acceleration if there is a performance improvement
Enable transfer acceleration at the bucket level
Once enabled it can't be disabled, only suspended
Uses anycast packets
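A sketch of enabling acceleration and then using the accelerate endpoint from boto3; the bucket and file names are placeholders:

import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# Enable transfer acceleration at the bucket level
s3.put_bucket_accelerate_configuration(
    Bucket="example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Transfers through this client use the s3-accelerate endpoint
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("large-file.bin", "example-bucket", "large-file.bin")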
S3 Select
Glacier Select
Use SQL expressions to query the contents of objects in buckets, including objects inside archives (e.g. zip)
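For illustration, an S3 Select query with boto3 against a hypothetical CSV object; the bucket, key, and column names are placeholders:

import boto3

s3 = boto3.client("s3")

# Query the CSV server-side instead of downloading the whole object
resp = s3.select_object_content(
    Bucket="example-bucket",
    Key="data.csv",
    ExpressionType="SQL",
    Expression="SELECT s.name FROM S3Object s WHERE s.city = 'Seattle'",
    InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
    OutputSerialization={"CSV": {}},
)
for event in resp["Payload"]:          # results stream back as events
    if "Records" in event:
        print(event["Records"]["Payload"].decode())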
Server Access Logging
Provides detailed records of the requests that are made to a bucket
Details include requester, bucket name, request time, request action, response status, and error code (if applicable)
Disabled by default
Must configure a separate bucket as the destination (can specify a prefix)
Need to grant write permissions to the S3 log delivery group on destination bucket
Need to enable logging to a specified bucket (should be a different bucket than the source to avoid an endless logging loop)
CORS with Amazon S3
Cross Origin Resource Sharing (CORS)
Allows requests from one origin to another origin
An origin is defined by DNS name, protocol, and port
Must add CORS configuration to bucket to allow requests from the other origin
How do you enable CORS with S3
Enable through settings:
-Access-Control-Allow-Origin
-Access-Control-Allow-Methods
-Access-Control-Allow-Headers
These settings are defined using rules
Rules are added as a JSON CORS configuration on the bucket
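A minimal sketch of one CORS rule applied with boto3; the bucket name and origin are placeholders:

import boto3

s3 = boto3.client("s3")

# One rule allowing GET requests from a single web origin
s3.put_bucket_cors(
    Bucket="example-bucket",
    CORSConfiguration={"CORSRules": [{
        "AllowedOrigins": ["https://www.example.com"],
        "AllowedMethods": ["GET"],
        "AllowedHeaders": ["*"],
        "MaxAgeSeconds": 3000,
    }]},
)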
S3 Object Lambda
S3 Object Lambda uses Lambda functions to process the output of S3 GET requests
You can use your own functions or the AWS prebuilt functions
S3 Object Lambda - Prebuilt Functions
Prebuilt lambda functions that detect PII
PII includes names, addresses, dates, credit card numbers, SSN
PII Access Control - detects PII and restricts access
PII Redaction - detects PII and returns document with the PII redacted
Decompression - decompresses objects compressed with bzip2, gzip, snappy, zlib, zstandard, and Zip
All functions have ARNs that can be referenced to use these functions
Limit to S3 object size
Up to 5 TB per object
S3 namespace
universal namespace so bucket names must be unique globally
Where are buckets created and where should you create them as a best practice?
Regions
Create buckets in the Region that is physically closest to your users to reduce latency
Limit on number of buckets per account
By default 100 buckets
S3 Event Notifications
Sends notifications when events happen in buckets
No polling needed
Destinations include:
SNS topics
SQS queues
Lambda
What are the cross account access methods for S3?
Resource-based policies (bucket policies) and IAM policies for programmatic-only access to S3 bucket objects
Resource-based ACLs and IAM policies for programmatic-only access to S3 bucket objects
Cross account IAM roles for programmatic and console access to S3 bucket objects
What are S3 Performance Optimization methods
S3 supports at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket
Increase read or write performance by parallelizing requests across prefixes
Use byte-range fetches (see the sketch after this list)
Retry requests for latency-sensitive applications
Combine S3 storage and EC2 compute in the same region
Use S3 Transfer Acceleration to minimize latency caused by distance
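A quick sketch of a byte-range fetch with boto3; the bucket, key, and range are placeholders:

import boto3

s3 = boto3.client("s3")

# Fetch only the first 1 MB of the object; issuing several ranged GETs
# in parallel is a common way to speed up large downloads
resp = s3.get_object(
    Bucket="example-bucket",
    Key="large-file.bin",
    Range="bytes=0-1048575",
)
chunk = resp["Body"].read()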
AWS S3 Event Notification
Receive notifications when certain object events happen in your bucket
You no longer need to build or maintain server-based polling infrastructure to check for object changes
Amazon S3 can send event notification messages to the following destinations:
-Amazon Simple Notification Service (Amazon SNS) topics
-Amazon Simple Queue Service (Amazon SQS) queues
-AWS Lambda functions
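A sketch of wiring one of these destinations with boto3; the SQS queue ARN is a placeholder, and its access policy must already allow S3 to send messages:

import boto3

s3 = boto3.client("s3")

# Send a message to the queue whenever any object is created in the bucket
s3.put_bucket_notification_configuration(
    Bucket="example-bucket",
    NotificationConfiguration={"QueueConfigurations": [{
        "QueueArn": "arn:aws:sqs:us-east-1:111122223333:example-queue",
        "Events": ["s3:ObjectCreated:*"],
    }]},
)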
S3 Consistency?
Strong read-after-write consistency
Amazon S3 provides strong read-after-write consistency for PUT and DELETE requests of objects in your Amazon S3 bucket in all AWS Regions
behavior applies to both writes to new objects as well as PUT requests that overwrite existing objects and DELETE requests
What you write is what you will read
Eventual Consistency
Bucket configurations have an eventual consistency model. Specifically, this means that:
If you delete a bucket and immediately list all buckets, the deleted bucket might still appear in the list.
If you enable versioning on a bucket for the first time, it might take a short amount of time for the change to be fully propagated. We recommend that you wait for 15 minutes after enabling versioning before issuing write operations (PUT or DELETE requests) on objects in the bucket.
Read operations, by contrast, are strongly consistent
pre-signed URL
the S3 object owner can optionally share objects with others by creating a pre-signed URL, using their own security credentials, to grant time-limited permission to download the objects
what do you specify to generate a pre-signed URL
When you create a pre-signed URL for your object, you must provide your security credentials and specify a bucket name, an object key, the HTTP method (GET to download the object), and an expiration date and time.
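A minimal sketch of generating a pre-signed URL with boto3; the bucket, key, and expiry are placeholders, and the credentials of the client's session are used to sign the URL:

import boto3

s3 = boto3.client("s3")

# The URL grants anyone who holds it a time-limited GET on the object
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "example-bucket", "Key": "report.pdf"},
    ExpiresIn=3600,   # seconds until the URL expires
)
print(url)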