S3 Flashcards
Amazon S3
Is an object storage service offering industry-leading scalability, data availability, security, and performance.
-Unlimited storage capacity
-Many websites use Amazon S3 as a backbone
S3 Use cases
• Backup and storage
• Disaster Recovery
• Archive
• Hybrid Cloud storage
• Application hosting
• Media hosting
• Data lakes & big data analytics
• Software delivery
• Static website
S3 Buckets
Allows people to store objects (files) in “buckets” (directories)
• Buckets must have a globally unique name (across all regions and all accounts)
• Buckets are defined at the region level
S3 Objects
• Objects (files) have a Key
• Max Object Size is 5TB
• If uploading more than 5GB, must use “multi-part upload”
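• The key is composed of prefix + object name. In s3://mybucket/my_folder1/another_folder/my_file.txt, the key is my_folder1/another_folder/my_file.txt (prefix my_folder1/another_folder/ + object name my_file.txt)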
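To make the prefix + object name split concrete, here is a minimal sketch in Python. The helper name split_s3_uri is hypothetical (it is not part of any AWS SDK), and it assumes the s3://bucket/key form used on this card.

```python
def split_s3_uri(uri: str) -> dict:
    """Split an s3:// URI into bucket, key, prefix, and object name."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Everything before the first "/" is the bucket; the rest is the key.
    bucket, _, key = uri[len(scheme):].partition("/")
    # The prefix is everything up to and including the last "/" in the key.
    prefix, slash, name = key.rpartition("/")
    return {
        "bucket": bucket,
        "key": key,
        "prefix": prefix + slash,
        "object_name": name,
    }


parts = split_s3_uri("s3://mybucket/my_folder1/another_folder/my_file.txt")
print(parts["key"])          # my_folder1/another_folder/my_file.txt
print(parts["prefix"])       # my_folder1/another_folder/
print(parts["object_name"])  # my_file.txt
```

An object stored at the top level of the bucket simply has an empty prefix.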
S3 Security
• User based = IAM policies for users and for services
• Resource based = Bucket Policies (can grant public access and cross-account access), Object Access Control Lists (ACLs), Bucket Access Control Lists (ACLs)
• Encryption: encrypt objects in Amazon S3 using encryption keys
S3 Bucket Policies
• JSON based policies
Use an S3 bucket policy to:
• Grant public access to the bucket
• Force objects to be encrypted at upload
• Grant access to another account (cross-account)
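As an illustration of a JSON bucket policy, the sketch below builds a policy that grants public read access to every object in a bucket. The function name public_read_policy and the bucket name example-bucket are placeholders for illustration. The policy only takes effect once it is attached to the bucket and the Block Public Access settings permit it.

```python
import json


def public_read_policy(bucket_name: str) -> str:
    """Return a JSON bucket policy allowing anyone to read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicRead",               # free-form statement label
                "Effect": "Allow",
                "Principal": "*",                  # anyone
                "Action": "s3:GetObject",          # read objects only
                "Resource": f"arn:aws:s3:::{bucket_name}/*",  # all objects
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(public_read_policy("example-bucket"))
```

A cross-account grant would follow the same shape, with the Principal narrowed to the other account instead of "*".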
Bucket settings for Block Public Access
• These settings were created to prevent company data leaks
S3 Websites
• S3 can host static websites and make them accessible on the web
S3 - Versioning
• It is enabled at the bucket level
• Protect against unintended deletes (ability to restore a version)
• Easy roll back to previous version
S3 Access Logs
• For audit purposes
• Any request made to S3, from any account, authorized or denied, will be logged into another S3 bucket
• Very helpful to get to the root cause of an issue, audit usage, view suspicious patterns, etc.
S3 Replication (CRR & SRR)
• Must enable versioning in both the source and destination buckets
• Buckets can be in different accounts
• Copying is asynchronous
• Must give proper IAM permissions to S3
• Cross-Region Replication (CRR) - Use cases: compliance, lower latency access, replication across accounts
• Same-Region Replication (SRR) – Use cases: log aggregation, live replication
S3 Storage Classes
• Amazon S3 Standard - General Purpose
• Amazon S3 Standard-Infrequent Access (IA)
• Amazon S3 One Zone-Infrequent Access
• Amazon S3 Glacier Instant Retrieval
• Amazon S3 Glacier Flexible Retrieval
• Amazon S3 Glacier Deep Archive
• Amazon S3 Intelligent Tiering
S3 Durability and Availability
• Durability = High durability (99.999999999%, 11 9’s) of objects across multiple AZs
• Availability = Measures how readily available a service is, varies depending on storage class
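To get a feel for what 11 9's means, the sketch below treats durability as an average yearly loss rate per object, which is a simplifying assumption. With 10,000,000 stored objects, that works out to losing a single object about once every 10,000 years on average. The function name is hypothetical.

```python
from fractions import Fraction

# 100% - 99.999999999% = 0.000000001% = 1 in 10**11 (assumed per object, per year)
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    expected_losses_per_year = object_count * ANNUAL_LOSS_PROBABILITY
    return 1 / expected_losses_per_year


print(expected_years_per_lost_object(10_000_000))  # 10000
```

Fraction is used instead of float so the arithmetic stays exact.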
S3 Standard – General Purpose
• 99.99% Availability
• Used for frequently accessed data
• Low latency and high throughput
• Sustain 2 concurrent facility failures (3 AZs)
• Use Cases: Big Data analytics, mobile & gaming applications, content distribution…
S3 Storage Classes – Infrequent Access
• For data that is less frequently accessed, but requires rapid access when needed
• Lower cost than S3 Standard
• S3 Standard-IA = 99.9% Availability, Use Cases: Disaster Recovery, backups
• S3 One Zone-IA = In a single AZ, 99.5% Availability, Use Cases: Storing secondary backup copies of on-premises data, or data you can recreate
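The availability percentages on these cards can be turned into a rough downtime budget per year. The sketch below is plain arithmetic on the quoted figures (99.99% for S3 Standard, 99.9% for Standard-IA, 99.5% for One Zone-IA), assuming a 365-day year. It is not an AWS API, and the names are illustrative.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

# Availability figures as quoted on the cards, kept as strings for exact math.
AVAILABILITY_PERCENT = {
    "S3 Standard": "99.99",
    "S3 Standard-IA": "99.9",
    "S3 One Zone-IA": "99.5",
}


def max_downtime_minutes(availability_percent: str) -> Decimal:
    """Minutes per year a service may be unavailable at this availability."""
    unavailable_percent = Decimal(100) - Decimal(availability_percent)
    return unavailable_percent * MINUTES_PER_YEAR / 100


for storage_class, percent in AVAILABILITY_PERCENT.items():
    minutes = max_downtime_minutes(percent)
    print(f"{storage_class}: about {float(minutes):.1f} minutes per year")
```

So 99.99% allows roughly 53 minutes of unavailability per year, 99.9% about 8.8 hours, and 99.5% about 43.8 hours.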
Amazon S3 Glacier Storage Classes
• Low-cost object storage meant for archiving / backup
• Pricing: price for storage + object retrieval cost
Amazon S3 Glacier Storage Classes 2
Amazon S3 Glacier Instant Retrieval
• Millisecond retrieval
• Minimum storage duration of 90 days
Amazon S3 Glacier Flexible Retrieval
• Retrieval options: Expedited (1 to 5 minutes), Standard (3 to 5 hours), Bulk (5 to 12 hours, free)
• Minimum storage duration of 90 days
Amazon S3 Glacier Deep Archive
• For long term storage
• Standard (12 hours), Bulk (48 hours)
• Minimum storage duration of 180 days
S3 Intelligent-Tiering
• Small monthly monitoring and auto-tiering fee
• Moves objects automatically between Access Tiers based on usage
• There are no retrieval charges in S3 Intelligent-Tiering
S3 Object Lock & Glacier Vault Lock
S3 Object Lock:
• Adopt a WORM (Write Once Read Many) model
• Block an object version deletion for a specified amount of time
Glacier Vault Lock:
• Adopt a WORM model
• Lock the policy for future edits (can no longer be changed)
• Helpful for compliance and data retention
S3 Encryption
• No Encryption
• Server-Side Encryption = Server encrypts the file after receiving it
• Client-Side Encryption = User encrypts the file before uploading it
AWS Snow Family
• Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
• Data migration: Snowcone, Snowball Edge, Snowmobile
• Edge computing: Snowcone, Snowball Edge
AWS Snowcone
• Small, portable computing, anywhere, rugged & secure, withstands harsh environments
• Device used for edge computing, storage, and data transfer
• 8 TB of usable storage; suited to migrations of up to 24 TB, online or offline
• Can be sent back to AWS offline, or connected to the internet to send data with AWS DataSync
Snowball Edge (for data transfers)
• Physical data transport solution: move TBs or PBs of data in or out of AWS
• Provides block storage and Amazon S3-compatible object storage
• 80 TB of usable storage; suited to migrations of up to petabytes, offline
• Use cases: large data cloud migrations, DC decommission, disaster recovery
Snowball Edge (Storage & Compute) 2
• Snowball Edge Storage Optimized
• 80 TB of HDD capacity for block volume and S3-compatible object storage
• Snowball Edge Compute Optimized
• 42 TB of HDD capacity for block volume and S3-compatible object storage
AWS Snowmobile
• Transfer up to exabytes of data
• Each Snowmobile has 100 PB of capacity
• High security: temperature controlled, GPS, 24/7 video surveillance
• Better than Snowball if you transfer more than 10 PB
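The sizing hints on the Snow Family cards can be read as a rough rule of thumb. The sketch below encodes only the thresholds from these notes (Snowcone for migrations up to 24 TB, Snowmobile above 10 PB, Snowball Edge in between). The function name and the decimal 1 PB = 1,000 TB conversion are assumptions. A real choice would also weigh network bandwidth, compute needs, and device availability.

```python
TB_PER_PB = 1000  # assumption: decimal units, 1 PB = 1,000 TB


def suggest_snow_device(data_tb: float) -> str:
    """Suggest a Snow Family device from the amount of data to migrate, in TB."""
    if data_tb <= 0:
        raise ValueError("data size must be positive")
    if data_tb <= 24:               # Snowcone: migrations of up to 24 TB
        return "Snowcone"
    if data_tb <= 10 * TB_PER_PB:   # Snowball Edge: TBs up to PBs
        return "Snowball Edge"
    return "Snowmobile"             # better than Snowball above 10 PB


for size_tb in (8, 500, 25_000):
    print(f"{size_tb} TB -> {suggest_snow_device(size_tb)}")
```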
What is Edge Computing?
• Process data while it’s being created on an edge location
• These locations may have limited internet access or no easy access to computing power
• We set up a Snowball Edge / Snowcone device to do edge computing
• Use cases of Edge Computing: Preprocess data, Machine learning and Transcoding media streams
Snow Family – Edge Computing
• Snowcone = 2 CPUs, 4 GB of memory, wired or wireless access
• Snowball Edge – Compute Optimized = 52 vCPUs, 208 GiB of RAM
• Snowball Edge – Storage Optimized = Up to 40 vCPUs, 80 GiB of RAM
AWS OpsHub
You can use AWS OpsHub (software you install on your computer / laptop) to manage your Snow Family device
AWS Storage Gateway
Hybrid storage service that allows on-premises environments to seamlessly use the AWS Cloud