S3 101 Flashcards
What does S3 stand for?
Simple Storage Service
What is S3 used for?
- S3 provides developers + IT teams w/ secure, durable, highly-scalable object storage.
- retrieve and store any amount of data from anywhere on the web
What type of storage does S3 use?
S3 uses Object-based storage – i.e. allows you to upload files
What size limitations are there for individual S3 objects? What about aggregate limitations?
- S3 files can be from 0 Bytes to 5 TB.
- There is no aggregate limitation
What are S3 buckets? What are they used for?
S3 buckets store files. (Think of them like a file folder)
What type of namespace does S3 use?
S3 uses a universal namespace. That is, names must be globally unique.
When you successfully upload a file to S3, what will you receive back?
an HTTP 200 code
What are the components of an S3 object? What do each of these components represent?
An S3 object consists of the following:
- Key (The name of the object)
- Value (the data, made up of a sequence of bytes)
- Version ID (Important for versioning/version control)
- Metadata (data about data you are storing)
- Subresources (Access Control Lists, Torrent)
How does S3 keep data consistent?
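- Strong read-after-write consistency for PUTS of new objects, overwrite PUTS, and DELETES
- Older material describes eventual consistency for overwrite PUTS and DELETES; S3 has provided strong consistency for these operations since December 2020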
For what % availability was the S3 platform built?
99.99%
What % availability does Amazon guarantee for S3 Standard?
99.9%
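To make the two availability figures concrete, here is a minimal sketch that converts a percentage into the downtime it allows per year. The function name `max_downtime_hours` is made up for this card, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def max_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be unavailable at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.99, 99.9):
    hours = max_downtime_hours(pct)
    print(f"{pct}% availability allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99.99% works out to a little under one hour per year and 99.9% to a little under nine hours.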
What % durability does Amazon guarantee for S3 Standard information?
99.999999999% durability (11 9’s)
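A rough way to read 11 9's: treat it as an average annual loss rate of 1 in 10^11 objects. The sketch below does that arithmetic; the function name is illustrative, and the interpretation as a simple annual rate is an assumption made for the calculation.

```python
from fractions import Fraction

# 11 nines of durability, kept as an exact fraction to avoid float rounding.
# Assumption: durability is read as an average annual per-object loss rate.
ANNUAL_LOSS_RATE = Fraction(1, 10**11)


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Average number of objects expected to be lost in one year."""
    return object_count * ANNUAL_LOSS_RATE


lost = expected_objects_lost_per_year(10_000_000)
print(f"Storing 10,000,000 objects: about one object lost every {1 / lost} years")
```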
What are the key features of S3?
(V MELTS)
- Versioning
- MFA Delete
- Encryption
- Lifecycle Management
- Tiered Storage
- Secure Data using Access Control Lists and Bucket Policies
What are the key features of S3 Standard?
- 99.99% Avail
- 11 9’s Durability
- Stored redundantly across multiple devices in multiple facilities,
- designed to sustain the loss of 2 facilities concurrently
What does the “IA” stand for in S3-IA?
Infrequent Access
What type of data is best stored in S3-IA?
S3-IA is best for data that is not accessed frequently, but requires rapid access when needed
What is the pricing structure of S3-IA? Specifically, how does it differ from that of S3 Standard?
- S3-IA has a lower base storage fee than S3 Standard.
- However, S3-IA charges a retrieval fee.
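The trade-off between the lower storage fee and the retrieval fee can be sketched with a small calculation. All prices below are placeholder values chosen for illustration, not current AWS pricing, and `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(stored_gb, retrieved_gb, storage_price_per_gb, retrieval_price_per_gb=0.0):
    """Storage fee plus retrieval fee for one month, in USD."""
    return stored_gb * storage_price_per_gb + retrieved_gb * retrieval_price_per_gb


# Placeholder prices in USD per GB (assumptions, not a price list)
STANDARD_STORAGE = 0.023
IA_STORAGE = 0.0125
IA_RETRIEVAL = 0.01

stored, retrieved = 1000, 100
standard = monthly_cost(stored, retrieved, STANDARD_STORAGE)
ia = monthly_cost(stored, retrieved, IA_STORAGE, IA_RETRIEVAL)
print(f"Standard: ${standard:.2f}  S3-IA: ${ia:.2f}")
```

With these placeholder numbers S3-IA is cheaper when little data is read back, and the advantage shrinks as the retrieved volume grows.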
What are the key differences between S3-IA and S3 One Zone - IA?
Compared to S3-IA, S3 One Zone-IA has lower cost but less resilience, because data is stored in a single Availability Zone.
- S3 One Zone-IA is a lower-cost option for IA data
- S3 One Zone-IA does not give the multiple Availability Zone resilience of S3 standard and S3 IA.
What is S3 - Intelligent Tiering?
S3 Intelligent tiering uses ML and is designed to optimize costs by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead. (Basically, it’s the autopilot mode for S3 tiering)
What is S3 Glacier primarily used for?
S3 Glacier is mostly used for data archival at low-cost
How long does it take to retrieve something from S3 Glacier?
Retrieval times from S3 Glacier are configurable and range from minutes to hours
What is S3 Glacier Deep Archive?
S3 Glacier Deep Archive is S3’s lowest-cost storage class
How long does it take to retrieve something from S3 Glacier Deep Archive?
S3 Glacier Deep Archive is for cases where a retrieval time of 12 hours is acceptable.
What are the areas on which you are charged for using S3?
- Storage (amount you are storing)
- Requests
- Storage Management Pricing (Tier)
- Data Transfer
- Transfer Acceleration
- Cross - Region Replication
What is Transfer Acceleration?
- Used for fast, easy, secure transfers over long distances between end user and an S3 bucket
- Uses CloudFront’s globally distributed edge locations: as data arrives at an edge location, data is routed to S3 over an optimized network path
What is the format of the DNS name created for an S3 bucket in a specific region?
“http://s3.aws-region.amazonaws.com/bucketName”
OR
“http://bucketname.s3.aws-region.amazonaws.com”
(https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingBucket.html#access-bucket-intro)
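The two formats above are usually called path-style and virtual-hosted-style. A minimal sketch that builds both from a bucket name and region (the function names and the example bucket are made up for illustration):

```python
def path_style_url(bucket: str, region: str) -> str:
    """Bucket name appears in the path."""
    return f"http://s3.{region}.amazonaws.com/{bucket}"


def virtual_hosted_style_url(bucket: str, region: str) -> str:
    """Bucket name appears as part of the host name."""
    return f"http://{bucket}.s3.{region}.amazonaws.com"


# "my-example-bucket" is a hypothetical bucket name
print(path_style_url("my-example-bucket", "eu-west-1"))
print(virtual_hosted_style_url("my-example-bucket", "eu-west-1"))
```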
What would you use to install an operating system on S3?
S3 is NOT suitable to install an operating system on.
How can I help, at a bucket-configuration level, to protect against someone going in and deleting data from S3?
Turn on MFA Delete
How does the pricing model work for S3 Intelligent Tiering?
Very similar to S3 Standard, EXCEPT:
- objects that are not being accessed are moved to the IA tier, which is less expensive
- there is a monitoring/automation cost per thousand objects per month
What are the default access control permissions for newly created buckets?
By default, all newly created buckets are PRIVATE
How can I set up access control to buckets?
- Bucket Policies
- Access Control Lists (can be applied down to individual objects)
How can I set up my S3 bucket to log all requests made to it?
S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket.
What are AWS Organizations?
AWS Organizations is an account management service that enables you to consolidate multiple AWS accounts into an organization that you create and centrally manage.
What is Consolidated Billing? How would I set it up? What are three major advantages to it?
Consolidated Billing is a feature of AWS Organizations. The three major advantages are:
- One bill covering all AWS accounts in the organization
- Very easy to track charges & allocate costs
- Volume Pricing Discount – the more you use, the lower your rate
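The volume discount can be illustrated with tiered pricing: usage from all accounts is added together before the tiers are applied. The tier sizes and prices below are hypothetical, as is the `storage_bill` helper.

```python
# Hypothetical tiers: (GB covered by the tier, USD per GB); None means "everything above"
TIERS = [(50_000, 0.023), (None, 0.022)]


def storage_bill(total_gb: float) -> float:
    """Apply the tiers in order to a single combined usage figure."""
    remaining, bill = total_gb, 0.0
    for size, price in TIERS:
        in_tier = remaining if size is None else min(remaining, size)
        bill += in_tier * price
        remaining -= in_tier
        if remaining <= 0:
            break
    return bill


separate = storage_bill(40_000) + storage_bill(40_000)  # two accounts billed alone
combined = storage_bill(80_000)                         # same usage, consolidated
print(f"Separate: ${separate:.2f}  Consolidated: ${combined:.2f}")
```

With these made-up tiers, neither account reaches the cheaper tier alone, but their combined usage does.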
What are two forms of best practices for a root account with AWS organizations?
- Always enable multi-factor authentication on a root account
- Always use a strong and complex password on a root account
How can I enable/disable AWS services for either the organizational level or on individual accounts?
Enable/Disable AWS services using Service Control Policies (SCPs) either on Organizational Units or on individual accounts
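An SCP is a JSON policy document. As a rough sketch, the helper below builds one that denies every action of a single service; the function name and the `Sid` value are illustrative, and attaching the policy to an OU or account is not shown.

```python
import json


def deny_service_scp(service_prefix: str) -> str:
    """Return an SCP document (as JSON text) that denies all actions of one service."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyService",  # illustrative statement ID
                "Effect": "Deny",
                "Action": f"{service_prefix}:*",
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(deny_service_scp("s3"))
```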
What is the difference between Bucket Policies and Bucket ACLs as it relates to sharing S3 buckets across accounts?
- Bucket Policies apply across the entire bucket
- Bucket ACLs apply to individual objects
Which method of sharing S3 buckets across accounts is the only one that provides both Programmatic and Console access?
Cross-account IAM Roles
What are the 3 ways to share S3 buckets across accounts? What level of access does each provide?
- Bucket Policies & IAM (applies across entire bucket) – Programmatic Access Only
- Bucket ACLs & IAM (applies to individual objects) – Programmatic Access Only
- Cross-account IAM Roles – Programmatic AND Console Access
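For the bucket-policy route, the policy names the other account as a principal. A minimal sketch of such a document, assuming read-only sharing; the bucket name, the account ID, and the function name are all placeholders:

```python
import json


def cross_account_read_policy(bucket: str, account_id: str) -> dict:
    """Bucket policy letting another account list the bucket and read its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CrossAccountRead",  # illustrative statement ID
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself (for ListBucket)
                    f"arn:aws:s3:::{bucket}/*",  # every object in it (for GetObject)
                ],
            }
        ],
    }


# Placeholder bucket name and account ID
print(json.dumps(cross_account_read_policy("my-example-bucket", "111122223333"), indent=2))
```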
Does Cross-Region Replication require bucket versioning?
Yes. Cross-region replication requires bucket versioning on both the source and destination buckets.
When performing cross-region replication, what permissions – at the time of creation – are different between the source bucket and the destination bucket?
by default, there are NO differences between the source and replicated buckets
When performing cross-region replication, what files – at the time of creation – are different between the source bucket and the destination bucket?
When using cross-region replication, files in an existing bucket are NOT replicated automatically.
When performing cross-region replication, what discrepancies will there be between the source and replication buckets?
- All file (versions) made before CRR was turned on are not automatically copied at creation
- Delete markers, deleted versions, and deletes of delete markers are NOT replicated
At a high level, how does S3 Transfer Acceleration work?
Instead of uploading directly to a bucket, the user utilizes a distinct (given) URL to upload to an edge location, which then transfers through Amazon Backbone and directly uploads to an S3 bucket
What is AWS DataSync used for?
AWS DataSync is used primarily for moving/copying large amounts of data from on-premises to AWS
With what types of file systems is AWS DataSync compatible?
DataSync is used with NFS- and SMB-compatible file systems
How can you start DataSync replication?
Install the DataSync agent to start the replication
How often is data replication performed by AWS DataSync?
Replication in DataSync can be done hourly, daily, or weekly.
How can AWS DataSync be used to replicate EFS to EFS?
Install the DataSync Agent on an EC2 instance connected to EFS
What does CDN stand for?
Content Delivery Network
What is a Content Delivery Network?
A Content Delivery Network is a system of distributed servers that deliver web content to a user based on the geographic location of the user, the origin of the webpage, and a content delivery server.
What is an Edge Location?
A location where content is cached. (This is separate from an AWS Region/AZ)
In the context of CloudFront and Content Delivery Networks, what is an origin? What are some examples of origins?
The origin of all the files that the CDN will distribute.
This can be an S3 bucket, an EC2 instance, an Elastic Load Balancer, or Route53
In the context of CloudFront, what is a Distribution?
the name given to the CDN, which consists of a collection of edge locations
Are edge locations read-only?
No. You can write to an edge location too!
What is Amazon CloudFront?
CloudFront can be used to deliver your entire website, including dynamic, static, streaming, and interactive content, using a global network of edge locations.
Why is Amazon CloudFront good for performance?
Geographic Caching. Requests for your content are automatically routed to the nearest edge location, so content is delivered with the best possible performance.
What does RTMP stand for?
Real-Time Messaging Protocol
What are the 2 types of distributions used for CloudFront?
- Web Distribution - for websites
- RTMP - for media streaming
Can you clear cached objects in an edge location?
Yes, but you will be charged. (Invalidating the Cache)
What is Time To Live?
How long objects are cached in an edge location. This is a configurable amount.
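A toy model of TTL-based caching, purely for intuition: the `EdgeCache` class below is a made-up simplification, not how CloudFront is implemented. Time is passed in explicitly so the behavior is easy to follow.

```python
class EdgeCache:
    """Toy edge cache: entries expire ttl_seconds after they were fetched."""

    def __init__(self, ttl_seconds: int):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key, fetch_from_origin, now):
        entry = self._store.get(key)
        if entry is not None and now < entry[1]:
            return entry[0], "hit"
        # Missing or expired: go back to the origin and cache the result again
        value = fetch_from_origin(key)
        self._store[key] = (value, now + self.ttl)
        return value, "miss"

    def invalidate(self, key):
        """Drop a cached object before its TTL runs out."""
        self._store.pop(key, None)


cache = EdgeCache(ttl_seconds=60)
origin = lambda key: f"content of {key}"
for t in (0, 30, 61):
    print(t, cache.get("index.html", origin, now=t))
```

The first request misses, the second is served from the cache, and the third misses again because the TTL has passed.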
What S3 functionalities would you want to use for restricting content access?
CloudFront Signed URLs and Cookies and S3 Signed URLs
What is the key difference between a CloudFront Signed URL and a CloudFront Signed Cookie?
- A signed URL is for individual files (1 file = 1 URL)
- A signed cookie is for multiple files (1 cookie = multiple URLs)
What can be included in the policy attached to a signed URL or signed cookie?
- URL expiration (how long it is valid)
- IP ranges
- Trusted Signers (which AWS accounts can create signed URLs)
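The expiration and IP range end up as conditions in a JSON policy statement. Below is a minimal sketch of building such a policy document; the function name and example URL are hypothetical, and the signing step (done with the signer's private key) is not shown.

```python
import json


def custom_policy(resource_url: str, expires_epoch: int, source_ip_cidr: str = None) -> str:
    """Build the JSON policy text for a signed URL or signed cookie."""
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    if source_ip_cidr is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_ip_cidr}
    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}
    # Compact separators keep the policy free of extra whitespace
    return json.dumps(policy, separators=(",", ":"))


# Hypothetical distribution domain, expiry timestamp, and client IP range
print(custom_policy("https://d111111abcdef8.cloudfront.net/videos/*", 1767225600, "203.0.113.0/24"))
```

A wildcard in the resource is what lets one policy cover multiple files.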
What does OAI stand for?
Origin Access Identity
Describe the process by which you get a CloudFront Signed URL
- Client Authenticates and Authorizes to log in to the application
- Application Uses CloudFront SDK to generate signed URL
- Application Returns Signed URL to client
- Client accesses CloudFront using the signed URL

Can you use S3 Signed URLs if your origin is in EC2?
No. S3 signed URLs only cover objects in S3. If your origin is EC2, use CloudFront
What is Amazon Snowball used for?
BIG data Transfers into and out of AWS, including importing to and exporting from S3
What is the idea behind Amazon Storage Gateway?
Connecting an on-premises software app with cloud-based storage for smart storage (mixing cloud storage and on-premises storage).
What does NFS stand for?
Network File System
How does Amazon File Gateway work?
- Form of Amazon Storage Gateway
- Files are stored as objects in S3 buckets
- Files are accessed through an NFS mount point
- Once objects are transferred to S3, they get managed as native S3 objects, and bucket policies apply directly to them.
What are Amazon Stored Volumes? How do they work?
- Store primary data locally, while asynchronously backing it up to AWS
- On-premises applications get low latency access to their entire datasets. AND you get durable, off-site backups
- Data is backed up to S3 in the form of EBS snapshots
What are Amazon Cached Volumes? How do they work?
- Lets you use S3 as your primary data storage while retaining frequently accessed data in your storage gateway.
- Minimize need to scale on-premises storage
- Data is backed up to S3 in the form of EBS snapshots
What are the two types of Amazon Volume Gateways? In what key ways do they differ?
- Stored Volumes let you keep all of your data on premises (thus on-premises data is the primary storage)
- Cached Volumes let you keep your frequently accessed data on-premises (thus S3 is your primary data storage)
How can you bring a system that runs on tapes into Amazon S3?
Use Amazon’s Virtual Tape Library (VTL)
What are the three types of Amazon Storage Gateways?
- File Gateway
- Volume Gateway
- Gateway Virtual Tape Library
What is Amazon Athena? What is it commonly used for?
- Athena is an interactive query service that allows you to query data located in S3 using SQL
- Serverless
- Commonly used to analyse log data stored in S3
What does PII stand for?
Personally Identifiable Information
What is Amazon Macie?
- Macie is a security service which uses ML and NLP to discover, classify, and protect sensitive data used in S3
- Can be used to analyze CloudTrail logs for suspicious API activity
- Includes Dashboards, Alerts, Monitoring
- Great for PCI-DSS compliance and preventing Identity Theft
What does KMS stand for?
Key Management Service
What is the availability of S3-OneZone-IA?
99.50%
How many S3 buckets can I have per account by default?
100
What is the general use case for S3 Transfer Acceleration?
Accelerating uploads to S3
Where can S3 access logs be stored?
S3 access logs are delivered to another bucket (the target bucket), which must be in the same Region and owned by the same account as the source bucket.
What does TTL stand for?
Time To Live
When using Stored Volumes, how is data backed up to S3?
asynchronously, as EBS snapshots
How can I restore a file if I went to “Actions -> Delete” on it in S3?
Delete the delete marker
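Why this works can be shown with a toy model of a versioned bucket. The `VersionedBucket` class is a made-up simplification and assumes versioning is enabled: a plain delete only adds a delete marker on top of the version stack, so removing the marker exposes the previous version again.

```python
class VersionedBucket:
    """Toy model of a versioning-enabled bucket (not the real S3 API)."""

    def __init__(self):
        self._versions = {}  # key -> list of versions, newest last

    def put(self, key, data):
        self._versions.setdefault(key, []).append({"data": data, "delete_marker": False})

    def delete(self, key):
        # A delete without a version ID hides the object behind a delete marker
        self._versions.setdefault(key, []).append({"data": None, "delete_marker": True})

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1]["delete_marker"]:
            return None  # the object looks deleted
        return versions[-1]["data"]

    def remove_delete_marker(self, key):
        versions = self._versions.get(key, [])
        if versions and versions[-1]["delete_marker"]:
            versions.pop()


bucket = VersionedBucket()
bucket.put("report.txt", "v1")
bucket.delete("report.txt")
print(bucket.get("report.txt"))   # object appears gone
bucket.remove_delete_marker("report.txt")
print(bucket.get("report.txt"))   # previous version is visible again
```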
By default, are items automatically encrypted when they are stored in S3?
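Yes. Since January 2023, S3 encrypts all new objects by default using server-side encryption with S3 managed keys (SSE-S3). Before that change, default encryption was NOT enabled.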
By default, is Transfer Acceleration enabled for a newly created S3 bucket?
No, by default, transfer acceleration is suspended in newly-created S3 buckets
When creating a new S3 bucket, what bucket policies does it have by default?
None.
By default, no bucket policy exists for newly created S3 buckets
By default, is versioning enabled for newly created S3 buckets?
No
What are the S3 bucket properties?
- Versioning
- Server Access Logging
- Static Website Hosting
- Object-Level Logging
- Tags
- Transfer Acceleration
- Events
- Requester Pays
(https://docs.aws.amazon.com/AmazonS3/latest/user-guide/view-bucket-properties.html)
When uploading objects, what prefix must all user-defined metadata have?
**x-amz-meta-**
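As a small illustration, the helper below turns a plain dictionary into request headers carrying that prefix. The function name is hypothetical, and lowercasing the names is a choice made in this sketch.

```python
METADATA_PREFIX = "x-amz-meta-"


def to_metadata_headers(metadata: dict) -> dict:
    """Prefix each user-defined metadata name so it can be sent as an upload header."""
    return {f"{METADATA_PREFIX}{name.lower()}": str(value) for name, value in metadata.items()}


# Example metadata values are made up
print(to_metadata_headers({"Author": "jdoe", "revision": 3}))
```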
In the console, after enabling logging on a source bucket, what permission do you need to give the destination bucket to ensure that the logs can be written there?
You don’t have to do anything.
When you enable logging on a bucket, the console both enables logging on the source bucket and adds a grant in the target bucket’s access control list (ACL) granting write permission to the Log Delivery group.
(https://docs.aws.amazon.com/AmazonS3/latest/dev/enable-logging-console.html)