S3 - Simple Storage Service Flashcards
What is S3?
Amazon Simple Storage Service is an object storage service that offers industry-leading scalability, data availability, security, and performance
Where are objects stored in S3?
in buckets
What is globally unique on a bucket?
the name
How are buckets scoped?
regionally
How many characters are allowed in a bucket’s name?
3-63
What can a bucket’s name not contain?
uppercase letters or underscores
How must a bucket’s name start?
with a lowercase letter or a number
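The three naming rules above can be checked in a few lines. This is a minimal sketch that encodes only the rules listed on these cards; real S3 naming has additional rules that are not covered here, and `is_valid_bucket_name` is a hypothetical helper name.

```python
import string

def is_valid_bucket_name(name: str) -> bool:
    """Check only the rules from the cards: length, forbidden characters, first character."""
    # Rule 1: between 3 and 63 characters.
    if not 3 <= len(name) <= 63:
        return False
    # Rule 2: no uppercase letters and no underscores.
    if any(ch.isupper() or ch == "_" for ch in name):
        return False
    # Rule 3: must start with a lowercase letter or a number.
    return name[0] in string.ascii_lowercase + string.digits
```

For example, `is_valid_bucket_name("my-bucket-01")` passes, while `"My-bucket"`, `"my_bucket"` and `"ab"` are rejected.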
What is the bucket object key?
the full path, starting after the bucket name
What is the key of a bucket object composed of?
prefix + object name
Are there directories within buckets?
There’s no concept of “directories” within buckets
(although the UI will trick you into thinking otherwise)
Just keys with very long names that contain slashes (“/”)
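Since a key is just prefix + object name, splitting it is plain string handling. A small sketch, where `split_key` and the example key are made up for illustration:

```python
def split_key(key: str) -> tuple[str, str]:
    """Split an object key into (prefix, object name) at the last slash."""
    prefix, slash, name = key.rpartition("/")
    # Keep the trailing slash on the prefix; a key without slashes has an empty prefix.
    return prefix + slash, name

# "my_folder/another_folder/my_file.txt" is one flat key, not a nested directory.
print(split_key("my_folder/another_folder/my_file.txt"))
```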
What is the max object size in S3?
5TB
What is the max object size you can upload to S3 in a single PUT?
5 GB
What do you need to do to upload an object greater than 5 GB to S3?
use multi-part upload
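The size limits above translate into a simple upload decision. This is a rough sketch, not how any SDK actually plans uploads: it assumes binary units (1 GB = 1024^3 bytes), a hypothetical preferred part size of 100 MB, and the S3 limit of 10,000 parts per multi-part upload, which is not stated on these cards.

```python
MAX_SINGLE_PUT = 5 * 1024**3    # 5 GB: largest object for a single PUT
MAX_OBJECT_SIZE = 5 * 1024**4   # 5 TB: largest object S3 stores
MAX_PARTS = 10_000              # parts per multi-part upload (not from the cards)

def plan_upload(size_bytes: int, preferred_part_size: int = 100 * 1024**2) -> tuple[str, int]:
    """Return ("single", 1) or ("multipart", number_of_parts) for an object size."""
    if size_bytes > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the 5 TB S3 limit")
    if size_bytes <= MAX_SINGLE_PUT:
        return ("single", 1)
    # Grow the part size when needed so the part count stays within MAX_PARTS.
    smallest_allowed = -(-size_bytes // MAX_PARTS)  # integer ceiling division
    part_size = max(preferred_part_size, smallest_allowed)
    return ("multipart", -(-size_bytes // part_size))
```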
What does an S3 object contain?
Key
Version ID
Value (the object itself)
Metadata
Subresources
Access Control Information
What are S3 object tags useful for?
security / lifecycle
How many S3 object tags can you use?
up to 10
Can you enable versioning on an S3 object?
no, it is at bucket level
How can you increment an S3 object version?
by uploading an object with the same key
What is the version ID of an object that existed before versioning was enabled?
null
What happens to previous versions when versioning is disabled (suspended)?
nothing, they are not deleted
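The versioning cards above can be pictured with a toy in-memory model. This is only an illustration of the behaviour described, not the S3 API: `VersionedBucket` and the readable IDs `v1`, `v2` are hypothetical (real version IDs are opaque strings).

```python
import itertools

class VersionedBucket:
    """Toy in-memory model of bucket-level versioning (not the real S3 API)."""

    def __init__(self):
        self.versioning = False   # versioning is a bucket-level switch
        self._versions = {}       # key -> list of (version_id, data), oldest first
        self._counter = itertools.count(1)

    def put(self, key, data):
        versions = self._versions.setdefault(key, [])
        if self.versioning:
            # Same key, new upload: a new version is added on top of the old ones.
            version_id = f"v{next(self._counter)}"
        else:
            # Unversioned writes use the "null" version and overwrite only that one.
            version_id = "null"
            versions[:] = [v for v in versions if v[0] != "null"]
        versions.append((version_id, data))
        return version_id

    def version_ids(self, key):
        return [version_id for version_id, _ in self._versions.get(key, [])]
```

Putting a key before enabling versioning yields `null`; putting it again afterwards adds `v1` next to it, and switching versioning off again leaves the existing versions in place.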
Which are the 4 methods of encrypting objects in S3?
SSE-S3
SSE-KMS
SSE-C
Client Side Encryption
What is the SSE-S3 encryption method in S3?
encrypts S3 objects using keys handled & managed by AWS
What is the SSE-KMS encryption method in S3?
leverages AWS Key Management Service to manage encryption keys
What is the SSE-C encryption method in S3?
for when you want to manage your own encryption keys
What is the Client Side Encryption method in S3?
Customer fully manages the keys and encryption cycle
What encryption type is used by SSE-S3 encryption method?
AES-256
What must you set to use the S3 SSE-S3 encryption method?
Must set header: "x-amz-server-side-encryption": "AES256"
What is used by the SSE-KMS S3 encryption method?
A KMS key (formerly called a Customer Master Key, CMK)
What must you set to use the S3 SSE-KMS encryption method?
Must set header: "x-amz-server-side-encryption": "aws:kms"
What does SSE mean in S3 encryption methods?
Server Side Encryption
What must you do to use the S3 SSE-C encryption method?
You must provide the key in the request headers, over HTTPS only
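The request headers for the three server-side methods can be sketched as plain dictionaries. The function names are hypothetical. The SSE-C header names and the optional KMS key ID header are the standard S3 ones, but they are not listed on these cards, so treat them as an addition to check against the S3 documentation.

```python
import base64
import hashlib
from typing import Optional

def sse_s3_headers() -> dict[str, str]:
    return {"x-amz-server-side-encryption": "AES256"}

def sse_kms_headers(kms_key_id: Optional[str] = None) -> dict[str, str]:
    headers = {"x-amz-server-side-encryption": "aws:kms"}
    if kms_key_id is not None:
        # Optional: name a specific KMS key instead of the default one.
        headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
    return headers

def sse_c_headers(key: bytes) -> dict[str, str]:
    """Headers carrying a customer-provided key; send them over HTTPS only."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 256-bit (32-byte) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # MD5 digest of the key, used as an integrity check on the key itself.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }
```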
What must you do to use the Client Side encryption method?
You must encrypt and decrypt the data by yourself before sending it or receiving it using a client library such as the Amazon S3 Encryption Client
What endpoints are exposed by S3?
HTTP and HTTPS (recommended)
What are the 2 base groups for S3 security?
User and Resource based
What is User Based security on S3?
IAM policies: which API calls should be allowed for a specific user, defined from the IAM console
What are the Resource Based security options on S3?
- Bucket Policies - bucket wide rules from the S3 console - allows cross account
- Object Access Control List (ACL) – finer grain
- Bucket Access Control List (ACL) – less common
How are S3 Bucket policies written?
JSON
What must you define in an S3 Bucket policy?
- Resources
- Actions
- Effect
- Principal
What do the actions mean in an S3 Bucket policy?
The set of API calls to Allow or Deny (e.g. s3:GetObject)
What does a resource mean in an S3 Bucket policy?
buckets and objects
What does an effect mean in an S3 Bucket policy?
Allow / Deny
What does a principal mean in an S3 Bucket policy?
The account or user to apply the policy to
How can you grant access to another account to your bucket?
Using a Bucket Policy
How can you grant public access to your bucket?
Using a Bucket Policy
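Putting the four policy elements together, a public-read bucket policy might look like the JSON produced below. A minimal sketch: `public_read_policy`, the `Sid` value and the bucket name `example-bucket` are placeholders chosen for illustration.

```python
import json

def public_read_policy(bucket_name: str) -> str:
    """Build a bucket policy (as JSON text) that lets anyone read every object."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicRead",                               # hypothetical statement ID
                "Effect": "Allow",                                 # Allow / Deny
                "Principal": "*",                                  # everyone, i.e. public
                "Action": ["s3:GetObject"],                        # API calls covered
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],     # all objects in the bucket
            }
        ],
    }
    return json.dumps(policy, indent=2)

print(public_read_policy("example-bucket"))
```

For cross-account access the same shape applies, with the `Principal` naming the other account instead of `*`.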
What can you use to block public access to your bucket, whatever permissions are allowed on it?
Use Bucket settings for Block Public Access
At what level does the Block Public Access setting work?
At bucket and account level
How can you access S3 privately, without going through the internet?
S3 supports VPC endpoints
Where can you store S3 access logs?
In another S3 bucket
Where can S3 API calls be logged?
CloudTrail
What can you use in order to prevent the deletion of any versioned S3 objects?
Use MFA Delete on your bucket; versioning must be enabled
How can you share an S3 object with an external user?
Pre-Signed URLs (valid only for a limited time)
Where can you host a static website and make it accessible on the www?
S3
What is reflected in the S3 URL of a static website hosted there?
bucket name and region
What if you get a 403 (Forbidden) error from a static website deployed on S3?
make sure the bucket policy allows public reads
How to configure your bucket to allow cross-origin requests?
create a CORS configuration, which is an XML document (JSON in the current S3 console) with rules that identify the origins that you will allow to access your bucket, the operations (HTTP methods) that you will support for each origin, and other operation-specific information.
you can also allow all origins using *
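A CORS configuration with one rule can be generated as XML like this. It is a sketch of the document shape only: `cors_configuration` is a hypothetical helper, and optional elements such as allowed headers or max age are left out.

```python
import xml.etree.ElementTree as ET

def cors_configuration(origins, methods) -> str:
    """Build a one-rule CORS configuration as an XML string."""
    root = ET.Element("CORSConfiguration")
    rule = ET.SubElement(root, "CORSRule")
    for origin in origins:
        # "*" allows every origin.
        ET.SubElement(rule, "AllowedOrigin").text = origin
    for method in methods:
        ET.SubElement(rule, "AllowedMethod").text = method
    return ET.tostring(root, encoding="unicode")

print(cors_configuration(["https://www.example.com"], ["GET", "PUT"]))
```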
How is S3 read-after-write consistency for PUTs of new objects?
As soon as a new object is written, we can retrieve it (S3 has been strongly consistent since December 2020)
ex: (PUT 200 => GET 200)
this also holds if we did a GET before to see if the object existed, ex: (GET 404 => PUT 200 => GET 200); before December 2020 this case was only eventually consistent
How is S3 consistency for reads after updating an object?
If we read an object after updating it, we get the latest version, ex: (PUT 200 => PUT 200 => GET 200 (latest version)); before December 2020 we might have got the older version
How is S3 consistency for reads after deleting an object?
If we delete an object, we can no longer retrieve it; before December 2020 it might still have been retrievable for a short time
ex: (DELETE 204 => GET 404)
How can I request S3 strong consistency?
there is nothing to request; strong read-after-write consistency applies automatically to all requests
What do you need in order to use MFA-Delete on S3?
to have versioning enabled in the bucket
When will you need MFA on S3?
- permanently delete an object version
- suspend versioning on the bucket
Who can enable/disable MFA-Delete?
Only the bucket owner (root account)
How can you enable MFA-Delete?
only by using the CLI
What is evaluated before S3 default encryption?
Bucket policies; a bucket policy that rejects unencrypted uploads was the old way to enforce encryption
What should you not use as your logging bucket?
your monitored bucket; it will create a logging loop, and your bucket will grow exponentially
What condition must be met by the 2 buckets involved in S3 replication?
Both must have versioning enabled
Can you set up S3 replication across accounts?
yes
How is data copied in S3 replication?
async, but it is very quick
What condition must be met by the source bucket in S3 replication?
S3 must be given the proper IAM permissions through an IAM Role
What happens to the objects when you activate S3 replication?
new objects are replicated, it is not retroactive
What happens when you have S3 replication and you delete an object version?
it is not replicated
Is S3 replication transitive?
No
How can you generate S3 pre-signed URLs?
using SDK or CLI
What is the default expiration time of S3 pre-signed URLs?
3600 s
What are the permissions of the person that a S3 Pre-signed URL was given to?
They inherit the permissions of the person who generated the URL for GET / PUT
What must you use to create an S3 Pre-signed URL for uploads?
SDK
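With the Python SDK (boto3), generating a pre-signed URL could look like the sketch below. It assumes an S3 client created elsewhere with `boto3.client("s3")` and credentials already configured. `presign` is a hypothetical wrapper, while `generate_presigned_url` is the boto3 client method. The bucket and key in the usage comment are placeholders.

```python
def presign(s3_client, bucket: str, key: str,
            operation: str = "get_object", expires_in: int = 3600) -> str:
    """Return a pre-signed URL; use operation="put_object" for uploads."""
    return s3_client.generate_presigned_url(
        operation,
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds; 3600 matches the default on the cards
    )

# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   url = presign(boto3.client("s3"), "example-bucket", "report.pdf")
```

Whoever receives the URL acts with the permissions of the identity that signed it, so a short `expires_in` is usually the safer choice.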
What are the S3 Storage classes?
- Amazon S3 Standard - General Purpose
- Amazon S3 Standard-Infrequent Access (IA)
- Amazon S3 One Zone-Infrequent Access
- Amazon S3 Intelligent Tiering
- Amazon Glacier
- Amazon Glacier Deep Archive
What are files called in S3 Glacier and where are they stored?
Archives, stored in Vaults
What do you need to pay for when using the S3 Intelligent Tiering Storage Class?
Small monthly monitoring and auto-tiering fee
Which are the retrieval options for S3 Amazon Glacier?
- Expedited
- Standard
- Bulk
What is the time to get the data for Amazon Glacier Expedited?
1 - 5 min
What is the time to get the data for Amazon Glacier Standard?
3 - 5 hours
What is the time to get the data for Amazon Glacier Bulk?
5 - 12 hours
What is the time to get the data for Amazon Glacier Deep Archive Standard?
12 hours
What is the time to get the data for Amazon Glacier Deep Archive Bulk?
48 hours
Which are the retrieval options for S3 Amazon Glacier Deep Archive?
- Standard
- Bulk
Which is the minimum storage duration for S3 Amazon Glacier?
90 days
Which is the minimum storage duration for S3 Amazon Glacier Deep Archive?
180 days
What is S3 Lifecycle Configuration?
a set of rules that define actions that Amazon S3 applies to a group of objects to manage your objects so that they are stored cost effectively
What are the 2 types of actions in S3 Lifecycle Configuration?
- Transition Actions
- Expiration Actions
What are S3 lifecycle configuration transition actions?
They define when objects are transitioned to another storage class.
• Move objects to Standard IA class 60 days after creation
• Move to Glacier for archiving after 6 months
What are S3 lifecycle configuration expiration actions?
configure objects to expire (delete) after some time
• Access log files can be set to delete after 365 days
• Can be used to delete old versions of files (if versioning is enabled)
• Can be used to delete incomplete multi-part uploads
What can you use to apply S3 lifecycle configuration actions?
prefixes and tags
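The example rules from the lifecycle cards can be simulated to see which storage class an object would be in at a given age. This is a toy model under stated assumptions: "6 months" is taken as 180 days, the three example rules are combined into one hypothetical configuration, and the class names follow the S3 API spelling.

```python
from typing import Optional

# Transition actions: (age in days, storage class to move to), in increasing order.
TRANSITIONS = [(60, "STANDARD_IA"), (180, "GLACIER")]
# Expiration action: delete the object at this age.
EXPIRATION_DAYS = 365

def storage_class_at(age_days: int) -> Optional[str]:
    """Return the storage class at a given age, or None once the object has expired."""
    if age_days >= EXPIRATION_DAYS:
        return None
    current = "STANDARD"
    for days, storage_class in TRANSITIONS:
        if age_days >= days:
            current = storage_class
    return current
```

In a real configuration each rule would also carry a filter (a prefix or tags) selecting the objects it applies to.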
What is the max amount of prefixes allowed in a bucket?
no limit
What encryption method might impact your S3 performance baseline in extreme performance scenarios?
SSE-KMS, because of the KMS request quota
When is it recommended to use S3 multi-part upload?
recommended for files > 100 MB because it parallelizes the uploads
What is S3 Transfer Acceleration?
Increases transfer speed (typically used for uploads) by transferring the file to an AWS edge location, which forwards the data to the S3 bucket in the target region. It is compatible with multi-part upload
What can you use to accelerate your uploads to S3?
S3 Transfer Acceleration
What can you use to accelerate your downloads from S3?
S3 byte-range fetches
What is S3 byte-range fetches?
parallelize GETs by requesting specific byte ranges
What can you use to request just the head (the first bytes) of a file in S3?
S3 byte-range fetches
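Byte-range fetches come down to sending several GETs with different `Range` headers. The sketch below only computes the header values. `byte_ranges` is a hypothetical helper, and issuing the parallel requests is left out.

```python
def byte_ranges(object_size: int, chunk_size: int) -> list[str]:
    """Return HTTP Range header values that cover the object in chunks."""
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size must be > 0")
    return [
        # Range ends are inclusive, hence the "- 1".
        f"bytes={start}-{min(start + chunk_size, object_size) - 1}"
        for start in range(0, object_size, chunk_size)
    ]

print(byte_ranges(10, 4))   # three ranges, the last one shorter
```

Requesting only the first range, for example `bytes=0-99`, is how you would read just the head of a file.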
What is S3 Select and Glacier Select?
enables applications to retrieve only a subset of data from an object by using simple SQL expressions
What are the advantages of using S3 Select?
Less network transfer and less CPU cost client side
Give an example of two S3 events
s3:ObjectCreated:*
s3:ObjectRemoved:*
In what time are S3 Event notifications delivered?
Typically in seconds but can sometimes take a minute or longer
What are S3 event notifications?
The Amazon S3 notification feature enables you to receive notifications when certain events happen in your bucket
What can you do to ensure that an event notification is sent for every successful write?
you can enable versioning on your bucket.
What are the destinations supported by S3 event notification?
SNS
SQS
Lambda Functions
What is Athena?
Serverless service to perform analytics directly against S3 files
What language is used by Athena?
SQL
What is the exam tip for Athena?
Analyze data directly on S3
How can you connect externally to Athena?
Using a JDBC / ODBC driver
What format(s) does Athena support?
A lot: CSV, JSON, ORC, Avro, and Parquet (Athena is built on Presto)
How are you charged in Athena?
per query and amount of data scanned
What is S3 Object lock?
feature that blocks object version deletion during a customer-defined retention period
What is S3 Glacier Vault lock?
allows you to lock your vault with a Vault Lock policy, which can no longer be changed once it is locked
What is the model adopted by S3 Object Lock and S3 Glacier Vault Lock?
write-once-read-many (WORM)
What is S3 Cross Region Replication great for?
Great for dynamic content that needs to be available at low latency in a few regions
Which is the minimum storage duration for S3 Standard IA?
30 days
How can you mount a file system in S3?
you can’t
Can you move data directly to Glacier Deep Archive from any other tier?
yes
What is the order of the S3 storage classes?
You can move data from top to bottom, but not the other way around:
- Standard
- Standard IA
- Intelligent Tiering
- One Zone IA
- Glacier
- Glacier Deep Archive
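The top-to-bottom rule can be written as a lookup on the list above. A small sketch using the class names exactly as they appear on the card; `can_transition` is a hypothetical helper.

```python
# Order taken from the card: data may only move downwards in this list.
STORAGE_CLASS_ORDER = [
    "Standard",
    "Standard IA",
    "Intelligent Tiering",
    "One Zone IA",
    "Glacier",
    "Glacier Deep Archive",
]

def can_transition(source: str, target: str) -> bool:
    """True if target sits below source in the storage class order."""
    return STORAGE_CLASS_ORDER.index(source) < STORAGE_CLASS_ORDER.index(target)
```

So `can_transition("Standard", "Glacier Deep Archive")` holds, while `can_transition("Glacier", "Standard")` does not.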
What is S3 baseline performance for reads?
5,500 GET/HEAD requests per second per prefix in a bucket
What is S3 baseline performance for writes?
3,500 PUT/COPY/POST/DELETE requests per second per prefix in a bucket
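Because the baseline is per prefix and there is no limit on the number of prefixes, spreading requests across prefixes multiplies the aggregate rate. A quick arithmetic sketch, with `aggregate_rate` as a hypothetical helper:

```python
READS_PER_PREFIX = 5_500    # GET/HEAD requests per second per prefix
WRITES_PER_PREFIX = 3_500   # PUT/COPY/POST/DELETE requests per second per prefix

def aggregate_rate(prefix_count: int) -> dict[str, int]:
    """Baseline requests per second when load is spread evenly over prefixes."""
    return {
        "reads": READS_PER_PREFIX * prefix_count,
        "writes": WRITES_PER_PREFIX * prefix_count,
    }

print(aggregate_rate(4))   # {'reads': 22000, 'writes': 14000}
```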