S3 - Simple Storage Service Flashcards
What is S3?
A place to store your files.
- Object based storage
- File size can range from 0 bytes to 5TB
- Unlimited Storage
- Buckets
Not suitable for installing an operating system
Newly created buckets are private, but you can set up access control via
- Bucket Policies
- Access Control Lists
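A minimal sketch of what a bucket policy can look like, built as a Python dictionary. The bucket name `example-bucket`, the account ID `111122223333`, and the helper `read_only_policy` are hypothetical; the JSON shape follows the standard IAM policy format.

```python
import json

def read_only_policy(bucket_name: str, account_id: str) -> dict:
    """Build a policy document letting one AWS account read the bucket's objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAccountRead",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "s3:GetObject",
                # Object-level actions need the /* suffix on the bucket ARN.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }

# Hypothetical bucket name and account ID, for illustration only.
policy = read_only_policy("example-bucket", "111122223333")
print(json.dumps(policy, indent=2))
```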
Do S3 bucket names have to be unique?
Yes. Bucket names share a universal namespace because each name becomes part of a web address. ex: https://britzer.s3.amazon……
S3 Objects Look like:
- Key
- Value
- Version ID
- Metadata
- Subresources
- Access Control Lists
- Torrent
A successful upload returns an HTTP 200 status code
How does data consistency work for S3?
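- Originally: Read after Write consistency for PUTS of new objects -> Immediate access to whatever was uploaded
- Originally: Eventual consistency for overwrite PUTS or DELETES -> changes took time to propagate
- Since December 2020: strong Read after Write consistency for all PUTS and DELETES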
S3 Guarantee against lost data (durability)
99.999999999% or 11x9s (99.99% is the availability figure, not the durability figure)
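To get a feel for what 11 9s means, here is a small arithmetic sketch. It assumes the durability figure applies per object per year; the object count of 10 million is just an example.

```python
from fractions import Fraction

# 11 nines of durability: 99.999999999% of objects survive a given year.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)

def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year at 11 nines of durability."""
    return object_count * (1 - DURABILITY)

losses = expected_annual_losses(10_000_000)   # 10 million stored objects
years_per_loss = 1 / losses                   # average years between losses
print(float(losses), int(years_per_loss))     # prints: 0.0001 10000
```

In other words, with 10 million objects you would expect to lose a single object about once every 10,000 years.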
S3 Features
- Tiered Storage
- Lifecycle Management - Lifecycle Policies
- Automates moving your objects between different storage tiers
- used with versioning
- current/previous versions
- Versioning
- Encryption
- MFA Delete
- Secure Data via Access Control Lists & Bucket Policies
S3 Storage Classes/Tiers
- S3 Standard
- 99.99% available
- 99.999999999% durable (11 9’s)
- S3 - IA (Infrequent Access)
- For data that is accessed less frequently but requires rapid access when needed
- Lower fee than Standard but with a retrieval fee
- S3 - One Zone - IA
- Lower cost
- Stored in a single Availability Zone, so availability is lower
- S3 - Intelligent Tiering
- Optimizes costs by automatically moving data to the most cost-effective tier.
- S3 - Glacier
- Low cost
- Retrieval times are configurable, from minutes to hours
- S3 - Glacier Deep Archive
- Lowest cost
- 12 hour retrieval time

How are you charged for S3?
- Storage
- Number of requests
- Storage management pricing
- Data transfer
- Transfer Acceleration
- Cross region replication
S3 Costs

S3 - Encryption
You can encrypt at the object level or bucket level.
- In Transit
- HTTPS (SSL/TLS)
- At Rest
- Encrypts data while it is stored
- Server Side
- S3 Managed Keys, managed by Amazon (SSE-S3)
- AWS Key Management Service (SSE-KMS)
- SSE with Customer Provided Key (SSE-C)
- Client Side
- Encrypt the data yourself, then upload it
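The three server-side options are selected through request headers on upload. The sketch below only builds those headers; the helper name `sse_headers` is hypothetical, and no request is actually sent.

```python
import base64
import hashlib

def sse_headers(mode: str, customer_key: bytes = b"") -> dict:
    """Return the request headers that select a server-side encryption mode."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        return {"x-amz-server-side-encryption": "aws:kms"}
    if mode == "SSE-C":
        if len(customer_key) != 32:
            raise ValueError("SSE-C expects a 256-bit (32-byte) key")
        prefix = "x-amz-server-side-encryption-customer-"
        key_md5 = hashlib.md5(customer_key).digest()
        return {
            prefix + "algorithm": "AES256",
            # The key and its MD5 digest are both sent base64-encoded.
            prefix + "key": base64.b64encode(customer_key).decode(),
            prefix + "key-MD5": base64.b64encode(key_md5).decode(),
        }
    raise ValueError(f"unknown mode: {mode}")
```

With SSE-C the key travels with every request, which is why it should only ever be used over HTTPS.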
S3 Versioning
Stores all versions of an object
Once enabled, versioning can’t be disabled, only suspended
Versioning supports MFA Delete
S3 Lifecycle Management
- Automates moving objects between different storage tiers
- Can be used in conjunction with versioning
- Can be applied to both current and previous versions
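A rough sketch of a lifecycle rule that covers both current and previous versions. The dictionary mirrors the shape of an S3 lifecycle configuration as I understand it; the rule ID, the day counts, and the helper `storage_class_at` are assumptions for illustration.

```python
def lifecycle_rule(ia_days: int, glacier_days: int, noncurrent_days: int) -> dict:
    """Build one lifecycle rule covering current and previous versions."""
    return {
        "ID": "tier-down-old-objects",      # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": ""},           # empty prefix = whole bucket
        # Transitions apply to the current version of each object.
        "Transitions": [
            {"Days": ia_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_days, "StorageClass": "GLACIER"},
        ],
        # Noncurrent transitions apply to previous versions (needs versioning).
        "NoncurrentVersionTransitions": [
            {"NoncurrentDays": noncurrent_days, "StorageClass": "GLACIER"},
        ],
    }

def storage_class_at(rule: dict, age_days: int) -> str:
    """Return the tier a current version would sit in at a given age."""
    current = "STANDARD"
    for step in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= step["Days"]:
            current = step["StorageClass"]
    return current

config = {"Rules": [lifecycle_rule(30, 90, 30)]}
print(storage_class_at(config["Rules"][0], 45))   # prints: STANDARD_IA
```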
S3 Object Lock/Glacier Lock
Governance mode - Can’t overwrite, delete, or change lock settings without special permissions - S3
Compliance mode - Can’t be overwritten or deleted by any user, including the root user, for a specific retention period on a specific version - S3
S3 Object Lock and Glacier Vault Lock = WORM (write once, read many)
S3 Performance
S3 Prefix
mybucketname/folder1/subfolder/myfile.jpg
More prefixes = better performance, because requests can be spread across them
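To make the prefix concrete, this sketch splits a path like the one above into bucket, prefix, and object name. The helper `split_s3_path` is hypothetical.

```python
def split_s3_path(path: str) -> tuple[str, str, str]:
    """Split 'bucket/prefix/.../object' into (bucket, prefix, object name)."""
    bucket, _, key = path.partition("/")      # first segment is the bucket
    prefix, _, name = key.rpartition("/")     # last segment is the object name
    return bucket, prefix, name

print(split_s3_path("mybucketname/folder1/subfolder/myfile.jpg"))
# prints: ('mybucketname', 'folder1/subfolder', 'myfile.jpg')
```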
Multipart uploads
Byte-range fetches -> splitting downloads
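A byte-range fetch splits one download into several requests, each carrying an HTTP `Range` header. The sketch below only computes the header values; the helper `byte_ranges` and the sizes are illustrative.

```python
def byte_ranges(total_size: int, chunk_size: int) -> list[str]:
    """Return HTTP Range header values that together cover total_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1   # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges

print(byte_ranges(10, 4))   # prints: ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
```

Each range can be fetched in parallel, and a failed chunk can be retried on its own.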
S3 Select / Glacier Select
Enables application to retrieve a subset of data using simple SQL expressions.
Up to a 400% performance increase, and it saves money because less data is transferred.
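The idea of pulling back only a subset with a simple SQL expression can be tried locally. This sketch uses an in-memory SQLite table as a stand-in, so it is not S3 Select itself; the table is named `S3Object` only to resemble the style of such queries, and the sample rows are made up.

```python
import sqlite3

# Made-up sample data standing in for the contents of one stored object.
rows = [
    ("alice", "engineering", 120),
    ("bob", "sales", 95),
    ("carol", "engineering", 130),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S3Object (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO S3Object VALUES (?, ?, ?)", rows)

# Only the matching names come back, not the whole data set.
query = "SELECT s.name FROM S3Object s WHERE s.dept = 'engineering'"
subset = sorted(name for (name,) in conn.execute(query))
print(subset)   # prints: ['alice', 'carol']
```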
AWS Organizations
Organizations
Allows you to have multiple AWS accounts and centrally manage them.
Enable or disable AWS services using Service Control Policies (SCPs)
Consolidated Billing
One consolidated bill covering all AWS accounts in the organization
easy to track
volume pricing discount
Sharing S3 Buckets
- Using Bucket Policies & IAM (entire bucket) Programmatic
- Bucket ACLs & IAM (individual objects) Programmatic
- Cross-account IAM Roles. Programmatic AND AWS Console.
S3 Cross region Replication (ex: US –> Japan)
- Versioning must be enabled for both the source and destination bucket
- Old files aren’t replicated automatically but…
- All new files will be replicated automatically
- Delete markers are not replicated
- Deletions of individual versions are not replicated
S3 Transfer Acceleration
Utilizes the CloudFront Edge Network to accelerate uploads to S3.
Instead of uploading directly to a bucket, you use a distinct URL to upload to an edge location, which then transfers the file to your S3 bucket.
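A small sketch of how that upload URL is formed, assuming the usual `s3-accelerate` endpoint naming. The helper `accelerate_endpoint` and the bucket name are hypothetical.

```python
def accelerate_endpoint(bucket_name: str) -> str:
    """Return the Transfer Acceleration URL for a bucket."""
    if "." in bucket_name:
        # Acceleration endpoints need bucket names without periods.
        raise ValueError("bucket name must not contain periods")
    return f"https://{bucket_name}.s3-accelerate.amazonaws.com"

print(accelerate_endpoint("example-bucket"))
# prints: https://example-bucket.s3-accelerate.amazonaws.com
```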
What is AWS DataSync?
Allows you to move large amounts of data from on-premises into AWS
Used with NFS and SMB compatible file systems
Replication can be done hourly, daily, weekly
CloudFront
A Content Delivery Network (CDN): a system of distributed servers that delivers web pages and other web content to a user based on the user's geographic location.
Can be used to deliver your entire website, including dynamic, static and streaming and interactive content using a global network of Edge Locations
Request for your content is automatically routed to the nearest Edge Location.
Content is delivered with the best possible performance.
Side Notes:
- Edge locations are not just READ only - you can write to them too (ie. put an object on them)
- Objects are cached for the life of the TTL (Time To Live)
- You can clear cached objects by invalidating them, but you’ll be charged
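TTL caching and invalidation can be pictured with a toy cache. This is a simplified model, not how CloudFront is implemented; the class `EdgeCache` is hypothetical and time is passed in explicitly so the behaviour is easy to follow.

```python
class EdgeCache:
    """Toy edge cache: objects expire after ttl_seconds or when invalidated."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._store = {}   # path -> (content, expiry time)

    def put(self, path: str, content: bytes, now: float) -> None:
        self._store[path] = (content, now + self.ttl_seconds)

    def get(self, path: str, now: float):
        entry = self._store.get(path)
        if entry is None or now >= entry[1]:
            self._store.pop(path, None)   # expired or missing: treat as a miss
            return None
        return entry[0]

    def invalidate(self, path: str) -> None:
        """Drop an object before its TTL runs out."""
        self._store.pop(path, None)

cache = EdgeCache(ttl_seconds=60)
cache.put("/logo.png", b"image-bytes", now=0)
print(cache.get("/logo.png", now=30))   # still cached
print(cache.get("/logo.png", now=61))   # TTL expired, so None
```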
Key Terms:
Edge Location: location where content is cached; separate from an AWS Region/AZ
Origin: The source of all the files that the CDN will distribute. Can be an S3 Bucket, an EC2 Instance, an Elastic Load Balancer, or Route 53
Distribution: The name given to the CDN, which consists of a collection of Edge Locations
CloudFront Distribution Types
- Web Distribution - Websites
- RTMP - Media streaming
CloudFront Signed URLS & Cookies
Always use signed URLs/Cookies when you want to restrict content to authorized users
URLs VS Cookies
- Signed URL is for individual files. 1 File = 1 URL
- Signed Cookie is for multiple files 1 Cookie = multiple files
CloudFront Signed URL
- Can have different origins. Doesn’t have to be EC2 (but recommended)
- Utilizes caching and can filter
- Key pair
S3 signed URL
- Issues the request as the IAM user who created the URL (grants the same permissions)
- Limited lifetime
- Good for a small amount of files
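The general idea behind any signed URL is a signature plus an expiry time. The sketch below uses a plain HMAC over the URL, which is a swapped-in technique for illustration: it is not the CloudFront key-pair scheme nor the S3 presigning algorithm. The secret, the helper names, and the example URL are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"   # hypothetical signing key

def sign_url(url: str, expires_at: int) -> str:
    """Append an expiry time and an HMAC signature (url must have no query string)."""
    message = f"{url}?{urlencode({'Expires': expires_at})}"
    signature = hmac.new(SECRET, message.encode(), hashlib.sha256).hexdigest()
    return f"{message}&Signature={signature}"

def is_valid(signed_url: str, now: int) -> bool:
    """Check that the signature matches and the URL has not expired."""
    unsigned, sep, signature = signed_url.rpartition("&Signature=")
    if not sep:
        return False
    expected = hmac.new(SECRET, unsigned.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False   # URL or expiry was tampered with
    expires_at = int(parse_qs(urlsplit(unsigned).query)["Expires"][0])
    return now <= expires_at

signed = sign_url("https://example.com/file.jpg", expires_at=1000)
print(is_valid(signed, now=999), is_valid(signed, now=1001))   # prints: True False
```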
Snowball | Snowball Edge
Snowball
Petabyte-scale data transport solution that physically moves data into and out of AWS for a fraction of the cost of a network transfer.
- 50/80 TB options
- Super secure
- Storage
- Import/Export to/from S3
Snowball Edge
- 100 TB
- Same as Snowball
- Also allows for compute functions
- Portable AWS Suite
Snowmobile
- 100PB
- Truck-sized movable data storage

Storage Gateway
Replicates data from a data center into AWS Cloud
- File Gateway -> Flat files stored on S3
- Volume gateway -> Cached Volumes/Stored volumes
Athena
Interactive query service which enables you to analyze and query data located in S3 using standard SQL.
- Serverless. Pay per query
- Nothing to set up
- Works directly with data stored in S3
Can be used to
- Query logs
- generate business reports
- analyze costs
- run queries on click stream data
Macie
Security service that uses Machine Learning and NLP (Natural Language Processing) to discover, classify, and protect sensitive data stored in S3
- Uses AI to recognize S3 objects that might contain PII
- Dashboard/alerts/reporting
- Can analyze CloudTrail logs
Notes
PII (Personally Identifiable Information) - personal data