AWS - SAA Flashcards
https://www.youtube.com/watch?v=Ia-UEYYR44s
Introduction to S3
What is Object Storage (Object-based Storage)
Data storage architecture that manages data as objects, as opposed to other storage architectures:
-file systems, which manage data as files in a file hierarchy, and
-block storage, which manages data as blocks within sectors and tracks.
S3 provides unlimited storage. No concern about underlying infrastructure. The S3 console provides an interface for you to upload and access your data.
1.S3 Object: Objects contain your data. They are like files. Objects consist of: Key (name of the object), Value (sequence of bytes), Version ID (the version of the object), Metadata (additional info attached to the object). A single object can be anywhere from 0 bytes to 5 Terabytes in size.
2.S3 Bucket: Buckets hold objects. Buckets can also have folders, which in turn hold objects. S3 uses a universal namespace, so bucket names must be globally unique.
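A minimal sketch of the object and bucket model described above, as plain Python data classes. The class names, the `list_folder` helper, and the treatment of folders as key prefixes are illustrative assumptions, not an AWS API; 5 TB is taken as 5 * 1024**4 bytes for the size check.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption: 5 TB is treated as 5 * 1024**4 bytes in this sketch.
MAX_OBJECT_BYTES = 5 * 1024**4


@dataclass
class S3Object:
    key: str                                  # name of the object
    value: bytes                              # the data, a sequence of bytes
    version_id: Optional[str] = None          # only meaningful with versioning
    metadata: Dict[str, str] = field(default_factory=dict)  # extra info

    def __post_init__(self) -> None:
        # Objects may be empty (0 bytes) but not larger than the limit.
        if len(self.value) > MAX_OBJECT_BYTES:
            raise ValueError("object exceeds the 5 TB size limit")


@dataclass
class Bucket:
    name: str                                 # globally unique in real S3
    objects: Dict[str, S3Object] = field(default_factory=dict)

    def put(self, obj: S3Object) -> None:
        self.objects[obj.key] = obj

    def list_folder(self, prefix: str) -> List[str]:
        # "Folders" are modeled here as key prefixes such as "photos/".
        return sorted(k for k in self.objects if k.startswith(prefix))


bucket = Bucket("example-bucket")             # hypothetical bucket name
bucket.put(S3Object("photos/cat.jpg", b"\xff\xd8", metadata={"type": "jpeg"}))
bucket.put(S3Object("empty.txt", b""))        # a 0-byte object is allowed
print(bucket.list_folder("photos/"))
```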
S3 - Storage Classes
Trade Retrieval Time, Accessibility and Durability for Cheaper Storage
Cheaper storage as you go down the list…
1.Standard (default): Fast! 99.99% Availability, 11 9’s Durability, Replicated across at least three AZs
2.Standard Infrequently Accessed (IA): Still Fast! Cheaper if you access files less than once a month. An additional retrieval fee is applied. About 50% cheaper than Standard (reduced availability)
3.One Zone IA: Still Fast! Objects only exist in one AZ. Availability is 99.5%, and it is about 20% cheaper than Standard IA (reduced durability). Data could be destroyed. A retrieval fee is applied.
4.Glacier: For long-term cold storage. Retrieval of data can take minutes to hours, but the trade-off is very cheap storage.
5.Glacier Deep Archive: The lowest cost storage class. Data retrieval can take up to 12 hours.
S3 Guarantees: The platform is built for 99.99% availability. Amazon guarantees 99.9% availability. Amazon guarantees 11 9’s of durability.
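To make the availability and durability percentages above more concrete, here is a small arithmetic sketch. It assumes availability is measured over a 365-day year and that 11 9’s of durability means an annual loss probability of 10^-11 per object; the function names are made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def max_downtime_hours(availability_percent: float) -> float:
    """Hours per year a service may be unavailable at a given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def years_per_lost_object(num_objects: int, nines: int = 11) -> float:
    """Average number of years between single-object losses.

    Assumes each object is lost independently with an annual
    probability of 10**-nines.
    """
    annual_loss_probability = 10.0 ** -nines
    expected_losses_per_year = num_objects * annual_loss_probability
    return 1 / expected_losses_per_year


for percent in (99.99, 99.9, 99.5):
    print(percent, round(max_downtime_hours(percent), 2), "hours/year")

print(round(years_per_lost_object(10_000_000)), "years per lost object")
```

Under these assumptions, 99.99% availability allows a little under one hour of downtime per year, and storing ten million objects would lose one object roughly every ten thousand years on average.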
S3 - Security
All new buckets are PRIVATE by default
Per-request logging can be turned on for a bucket. Log files are generated and saved in a different bucket (even a bucket in a different AWS account if desired).
Access control is configured using:
1. Bucket Policies - Use a policy to define complex access rules.
2. Access Control Lists - Legacy feature (but not deprecated) for controlling access to buckets and objects. A simple way of granting access.
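As an illustration of what a bucket policy looks like, the sketch below builds one as a Python dictionary and serializes it to JSON. The bucket name, the `reports/` prefix, and the statement ID are hypothetical; the policy grants anonymous read access to objects under that prefix only.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name

# A bucket policy is a JSON document made of one or more statements.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPublicReadOfReports",   # hypothetical statement ID
            "Effect": "Allow",
            "Principal": "*",                    # anyone
            "Action": "s3:GetObject",            # read objects only
            "Resource": f"arn:aws:s3:::{BUCKET}/reports/*",
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Because new buckets are private by default, nothing outside the `Resource` pattern becomes readable through this statement.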
S3 - Encryption
Types of Encryption
Encryption In Transit: Traffic between your local host and S3 is encrypted via SSL/TLS
Server Side Encryption (SSE) - Encryption At Rest: Amazon helps you encrypt the object data:
S3 Managed Keys: (Amazon manages all the keys)
SSE-AES: S3 handles the key, uses AES-256 algorithm
SSE-KMS: Envelope encryption, AWS KMS and you manage the keys.
SSE-C: Customer-provided key (you manage the keys)
Client-Side Encryption: You encrypt your files before uploading them to S3.
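The three server-side options differ mainly in who supplies the key, which shows up in the headers sent with an upload request. The sketch below builds those headers; the function name and mode labels are assumptions for illustration, and the header names reflect the S3 REST API as best understood here, so verify them against the official documentation before relying on them.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str,
                kms_key_id: Optional[str] = None,
                customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Return the request headers that ask S3 for server-side encryption."""
    if mode == "SSE-AES":
        # S3 manages the key and encrypts with AES-256.
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:  # optional: choose a specific KMS key
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        # The customer sends the key with every request.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 32-byte (256-bit) key")
        key_b64 = base64.b64encode(customer_key).decode()
        md5_b64 = base64.b64encode(hashlib.md5(customer_key).digest()).decode()
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key": key_b64,
            "x-amz-server-side-encryption-customer-key-MD5": md5_b64,
        }
    raise ValueError(f"unknown mode: {mode}")


print(sse_headers("SSE-AES"))
```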
S3 - Data Consistency
New Objects vs Overwrite
New Objects (PUTS): Read After Write Consistency. When you upload a new S3 object you are able to read it immediately after writing.
Overwrite (PUTS) or Delete Objects (DELETES): Originally Eventual Consistency. When you overwrite or delete an object, it takes time for S3 to replicate versions to AZs, so an immediate read could return an old copy and you generally had to wait a few seconds before reading. Since December 2020, S3 provides strong read-after-write consistency for overwrites and deletes as well.
S3 - Cross Region Replication (CRR)
Cross Region Replication: When enabled, any object that is uploaded will be automatically replicated to another region(s). Provides higher durability and potential disaster recovery for objects.
Versioning must be turned on for both the source and destination buckets for CRR to function.
CRR can replicate to another AWS account.
S3 Versioning
S3 Versioning
- Store all versions of an object in S3
- Once enabled, it cannot be disabled, only suspended on the bucket.
- Fully integrates with S3 Lifecycle rules
- MFA Delete feature provides extra protection against deletion of your data
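The versioning behavior listed above can be sketched with a small in-memory model. This is a toy, not the S3 API: the class, the `v1`, `v2` style version IDs, and the `DELETE_MARKER` sentinel are all invented for illustration. A plain delete is modeled as adding a delete marker on top of the stack, so older versions stay retrievable by ID.

```python
import itertools
from typing import Dict, List, Optional, Tuple

DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, object]]] = {}
        self._ids = itertools.count(1)

    def put(self, key: str, value: bytes) -> str:
        # Every upload gets a new version ID; old versions are kept.
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, value))
        return version_id

    def delete(self, key: str) -> str:
        # A delete without a version ID only hides the object.
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, DELETE_MARKER))
        return version_id

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        versions = self._versions.get(key, [])
        if version_id is None:
            # Latest version wins, unless it is a delete marker.
            if not versions or versions[-1][1] is DELETE_MARKER:
                return None
            return versions[-1][1]
        for vid, value in versions:
            if vid == version_id and value is not DELETE_MARKER:
                return value
        return None


bucket = VersionedBucket()
first = bucket.put("notes.txt", b"draft")
bucket.put("notes.txt", b"final")
bucket.delete("notes.txt")
print(bucket.get("notes.txt"), bucket.get("notes.txt", first))
```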
S3 Lifecycle Management
S3 Lifecycle Management: Automate the process of moving objects to different storage classes or deleting objects altogether
Can be used together with versioning
Can be applied to both current and previous versions
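A lifecycle rule can be pictured as a small configuration plus a function of object age. The dictionary below is shaped like an S3 lifecycle configuration, but the rule ID, prefix, and day counts are illustrative assumptions, and `storage_class_at` is a hypothetical helper that only mimics how such a rule would play out for a current object version.

```python
from typing import Optional

# Shaped like an S3 lifecycle configuration; all values are illustrative.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",       # hypothetical rule name
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},             # delete current versions
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},  # previous versions
        }
    ]
}


def storage_class_at(age_days: int, rule: dict) -> Optional[str]:
    """Storage class of a current version at a given age, or None once expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 200, 400):
    print(age, storage_class_at(age, rule))
```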
S3 - Transfer Acceleration
S3 - Transfer Acceleration: Fast and secure transfer of files over long distances between your end users and an S3 bucket.
- Utilizes CloudFront’s distributed Edge Locations
- Instead of uploading to your bucket, users use a distinct URL for an Edge Location
- As data arrives at the Edge Location, it is automatically routed to S3 over a specially optimized network path (Amazon’s backbone network).
S3 - Presigned URLs
Presigned URLs: Generate a URL which provides temporary access to an object to either upload or download object data. Presigned URLs are commonly used to provide access to private objects. The AWS CLI or an AWS SDK is used to generate presigned URLs.
For example:
One has a web-application that needs to allow users to download files from a password-protected part of the web-app. The web-app generates a presigned URL which expires after 5 seconds. The user downloads the file.
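The idea behind a presigned URL (a signature plus an expiry time embedded in the link) can be sketched with a plain HMAC. This is deliberately not AWS Signature Version 4, which real presigned URLs use; the secret key, host name, and query parameter names here are hypothetical and only illustrate the mechanism.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical signing key; never hard-code real keys


def _sign(path: str, expires_at: int) -> str:
    message = f"{path}\n{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign(path: str, now: int, ttl_seconds: int) -> str:
    """Build a URL that is valid until now + ttl_seconds."""
    expires_at = now + ttl_seconds
    query = urlencode({"expires": expires_at, "signature": _sign(path, expires_at)})
    return f"https://files.example.com{path}?{query}"


def is_valid(url: str, now: int) -> bool:
    """Check the signature and the expiry time of a presigned URL."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    expected = _sign(parsed.path, expires_at)
    return now <= expires_at and hmac.compare_digest(signature, expected)


url = presign("/private/report.pdf", now=1000, ttl_seconds=5)
print(is_valid(url, now=1003), is_valid(url, now=1006))
```

The holder of the link needs no credentials of their own, yet changing the path or using the link after expiry invalidates it.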
S3 - MFA Delete
MFA Delete: ensures users cannot delete objects unless they provide their MFA code.
MFA Delete can only be enabled under these conditions:
1. The AWS CLI must be used to turn on MFA Delete
2. The bucket must have versioning turned on
3. Only the bucket owner logged in as Root User can DELETE objects from bucket
S3 CheatSheet
- Simple Storage Service (S3) object-based storage: Store an unlimited amount of data without worrying about the underlying storage infrastructure
- S3 replicates data across at least 3 AZs to ensure 99.99% Availability and 11 9’s of durability
- Objects contain your data
- Objects can be size anywhere from 0 Bytes up to 5 Terabytes
- Buckets contain objects. Buckets can also contain folders, which in turn can contain objects.
- Bucket names are unique across all AWS accounts. Like a domain name.
- When you upload a file to S3 successfully you’ll receive an HTTP 200 code
- Lifecycle Management: Objects can be moved between storage classes or deleted automatically based on a schedule
- Versioning: Objects are given a Version ID. When new objects are uploaded, the old objects are kept. You can access any object version. When you delete an object, the previous object is restored. Once versioning is turned on, it cannot be turned off, only suspended.
- MFA Delete: Enforce DELETE operations to require an MFA token in order to delete an object. Must have versioning turned on to use. Can only turn on MFA Delete from the AWS CLI. Only the Root Account is allowed to delete objects
- All new buckets are private by default
- Logging can be turned on a bucket to track operations performed on objects
- Access control is configured using Bucket Policies and Access Control Lists (ACL)
- Bucket Policies are JSON documents which let you write complex control access
- ACLs are the legacy method (not deprecated) where you grant access to objects and buckets with simple actions.
S3 CheatSheet
- Security in Transit: Uploading files is done over SSL
- SSE: Stands for Server Side Encryption. S3 has 3 options for SSE.
1. SSE-AES: S3 handles the key, uses AES-256 algorithm
2. SSE-KMS: Envelope encryption via AWS KMS and you manage the keys.
3. SSE-C: Customer provided key (you manage the keys)
- Client-Side Encryption: You must encrypt your own files before uploading them to S3
S3 CheatSheet
- Cross Region Replication (CRR): Allows replication of files across regions for greater durability. Versioning must be turned on in the source and destination buckets. CRR can replicate to a bucket in another AWS Account.
- Transfer Acceleration: Provides faster and secure uploads from anywhere in the world. Data is uploaded via a distinct URL to an Edge Location. Data is then transported to the S3 bucket via the AWS backbone network.
- Presigned URLs: a URL generated via the AWS CLI or SDK. It provides temporary access to write or download object data. Presigned URLs are commonly used to access private objects.
S3 has 6 different Storage Classes:
* Standard: Fast! 99.99% Availability, 11 9’s durability, replicated across at least three AZs
* Intelligent Tiering: Uses ML to analyze object usage and determine the appropriate storage class. Data is moved to the most cost-effective access tier, without any performance impact or added overhead.
* Standard Infrequently Accessed (IA): Still Fast! Cheaper if you access files less than once a month. An additional retrieval fee is applied. About 50% cheaper than Standard (reduced availability)
* One Zone IA: Still Fast! Objects only exist in one AZ. Availability is 99.5%, and it is about 20% cheaper than Standard IA (reduced durability). Data could get destroyed. A retrieval fee is applied.
* Glacier: For long-term cold storage. Retrieval of data can take minutes to hours, but the trade-off is very cheap storage.
* Glacier Deep Archive: The lowest cost storage class. Data retrieval can take up to 12 hours.
AWS Snowball: Petabyte-scale transfer service
Snowball:
Low Cost - Transferring 100 TB over high-speed internet can cost thousands of dollars. Snowball can reduce that cost to about 1/5th.
Speed - Transferring 100 TB over high-speed internet can take over 100 days. Snowball can complete that task in less than a week.
Move data onto AWS via physical briefcase computer
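The "over 100 days" figure can be sanity-checked with simple arithmetic. The sketch below assumes a dedicated 100 Mbps link running at 80% effective utilization and decimal terabytes; both numbers are assumptions chosen for illustration, not values from AWS.

```python
def transfer_days(terabytes: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Days needed to push `terabytes` of data through a network link."""
    bits = terabytes * 1e12 * 8                   # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * utilization  # usable bits per second
    return bits / effective_bps / 86_400           # seconds to days


print(round(transfer_days(100, 100), 1), "days at 100 Mbps")
print(round(transfer_days(100, 1000), 1), "days at 1 Gbps")
```

Under these assumptions 100 TB takes roughly 116 days at 100 Mbps, which is consistent with the claim that shipping a device can be faster than the network.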
Snowball features and limitations:
* E-Ink display (shipping information)
* Tamper and weather proof
* Data is encrypted end-to-end (256-bit encryption)
* Uses Trusted Platform Module (TPM)
* For security purposes, data transfers must be completed within 90 days of the Snowball being prepared.
* Snowball can import and export from S3.
Comes in two sizes:
* 50 TB (42 TB of usable space)
* 80 TB (72 TB of usable space)
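Since usable space is smaller than the nominal size, the number of devices for a migration should be computed from the usable figure. A small sketch, with the usable capacities taken from the two sizes above and a hypothetical `devices_needed` helper:

```python
import math

# Usable capacity per device in TB, taken from the two sizes listed above.
USABLE_TB = {"50 TB": 42, "80 TB": 72}


def devices_needed(data_tb: float, model: str) -> int:
    """Number of Snowball devices required to hold `data_tb` terabytes."""
    return math.ceil(data_tb / USABLE_TB[model])


print(devices_needed(100, "50 TB"), devices_needed(100, "80 TB"))
```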
TPM: a specialized chip on an endpoint device that stores RSA encryption keys specific to the host system for hardware authentication.
Snowball Edge: Petabyte-scale data transfer service
Snowball Edge: Similar to Snowball but with more storage and local processing.
Move data onto AWS via physical briefcase computer
More storage and on-site compute capabilities.
Snowball Edge Features and limitations:
* LCD display (shipping information and other functionality)
* Can undertake local processing and edge-computing workloads
* Can use in a cluster in groups of 5 to 10 devices
* Three options for device configurations:
  * storage optimized (24 vCPUs)
  * compute optimized (54 vCPUs)
  * GPU optimized (54 vCPUs)
Snowball Edge comes in two sizes:
* 100 TB (83 TB of usable space)
* 100 TB Clustered (45 TB per node)
Snowmobile
Snowmobile: 45-foot long ruggedized shipping container, pulled by semi-trailer truck. Transfer up to 100PB per snowmobile.
AWS personnel will help connect your on-premises network to the Snowmobile. When the data transfer is complete, they’ll drive it back to AWS to import into S3 or Glacier.
Security Features:
* GPS tracking
* Alarm monitoring
* 24/7 video surveillance
* an escort security vehicle while in transit (optional)
Snowball & Snowball Edge & Snowmobile CheatSheet
- Snowball and Snowball Edge are rugged containers (briefcases) which contain a storage device
- Snowmobile is a 45-foot long ruggedized shipping container, pulled by a semi-trailer truck.
- Snowball and Snowball Edge are for petabyte-scale migration. Snowmobile is for exabyte-scale migration.
- Low cost: Thousands of dollars to transfer 100 TB over high-speed internet. Snowball costs about 1/5th as much.
- Speed: Over 100 days to transfer 100 TB over high-speed internet. Snowball takes less than a week.
- Snowball comes in two sizes: 50 TB (42 TB of usable space) and 80 TB (72 TB of usable space)
- Snowball Edge comes in two sizes: 100 TB (83 TB of usable space) and 100 TB Clustered (45 TB per node)
- Snowmobile comes in one size: 100PB
- One can both export and import data using Snowball and Snowmobile
- One can import into S3 or Glacier
- Snowball Edge can undertake local processing and edge-computing workloads
- Snowball Edge can come in a cluster of groups of 5 to 10 devices
- Snowball Edge provides three options for device configurations: storage optimized (24 vCPUs), compute optimized (54 vCPUs) and GPU optimized (54 vCPUs)
VPC
Virtual Private Cloud
Provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define
Introduction to VPC
Think of an AWS VPC as your own personal data center. It gives you control over your virtual networking environment
From the outside to inside:
- The internet
- Internet Gateway (IGW)
- Router
- Route Table
- NACL
- Public subnet
- Security group
- EC2 Instance
- NAT
- Private Subnet
- RDS DB
- VPC
- Region
Core Components
Core components: Combining these components and services makes up VPC
- Internet Gateway (IGW)
- Virtual Private Gateway (VPN Gateway)
- Routing Tables
- Network Access Control Lists (NACLs) - Stateless
- Security Groups (SG) - Stateful
- Public Subnets
- Private Subnets
- NAT Gateway
- Customer Gateway
- VPC Endpoints
- VPC Peering
VPC Key Features
- VPCs are Region Specific, they do not span regions
- Can create up to 5 VPCs per region
- Every region comes with a default VPC
- Can have 200 subnets per VPC
- Can use an IPv4 CIDR block and, in addition, an IPv6 CIDR block (the address range of the VPC)
- Cost nothing: VPCs, route tables, NACLs, Internet Gateways, Security Groups and Subnets, VPC Peering
- Some things cost money: NAT Gateway, VPC Endpoints, VPN Gateway, Customer Gateway
- DNS hostnames (should your instance have domain name addresses)
Default VPC
AWS has a default VPC in every region so one can immediately deploy instances
- Create a VPC with a size /16 IPv4 CIDR block (172.31.0.0/16)
- Create a size /20 default subnet in each availability zone
- Create an Internet Gateway and connect it to your default VPC
- Create a default security group and associate it with your default VPC
- Create a default network access control list (NACL) and associate it with your default VPC.
- Associate the default DHCP options set for your AWS account with your default VPC
- When you create a VPC, it automatically has a main route table
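The /16 and /20 sizes above can be checked with Python's standard `ipaddress` module. The sketch only does the address arithmetic for the default VPC block; it does not talk to AWS. Note that AWS reserves a few addresses in every subnet, so the usable count is slightly lower than the raw numbers printed here.

```python
import ipaddress

# The default VPC's IPv4 CIDR block.
vpc = ipaddress.ip_network("172.31.0.0/16")

# Carve the /16 into /20 blocks, the size used for default subnets.
subnets = list(vpc.subnets(new_prefix=20))

print("addresses in the VPC:", vpc.num_addresses)
print("number of /20 subnets that fit:", len(subnets))
print("addresses per /20 subnet:", subnets[0].num_addresses)
print("first two subnets:", subnets[0], subnets[1])
```

A /16 holds 65,536 addresses and splits into sixteen /20 blocks of 4,096 addresses each, which leaves room for one default subnet per availability zone.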
VPC: Default Everywhere IP
0.0.0.0/0 is known as the default route. It represents all possible IPv4 addresses
- When we specify 0.0.0.0/0 in our route table for the IGW, we allow internet access
- When we specify 0.0.0.0/0 in our security group’s inbound rules, we allow all traffic from the internet to reach our public resources
When you see 0.0.0.0/0, just think of giving access from anywhere or the internet.
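Why 0.0.0.0/0 acts as a catch-all can be shown with a toy route lookup: every address matches it, and a more specific route wins when one exists. The route table contents and the `igw-example` target are hypothetical, and choosing the longest matching prefix is the usual routing rule assumed here.

```python
import ipaddress

# A toy route table: (destination CIDR, target). Targets are hypothetical.
ROUTES = [
    ("172.31.0.0/16", "local"),
    ("0.0.0.0/0", "igw-example"),
]


def route_target(destination: str) -> str:
    """Pick the target of the most specific route that matches the address."""
    ip = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in ROUTES
        if ip in ipaddress.ip_network(cidr)
    ]
    # 0.0.0.0/0 matches everything, so `matches` is never empty here.
    _, target = max(matches, key=lambda match: match[0].prefixlen)
    return target


print(route_target("172.31.5.9"))   # inside the VPC
print(route_target("8.8.8.8"))      # anywhere else goes to the gateway
print(ipaddress.ip_network("0.0.0.0/0").num_addresses == 2**32)
```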
VPC Peering
VPC Peering allows a connection between one VPC and another over a direct network route using private IP addresses
- Instances on peered VPCs behave just like they are on the same network
- Connect VPCs across same or different AWS accounts and regions
- Peering uses a star configuration: 1 central VPC, 4 other VPCs
- No Transitive Peering (peering must take place directly between VPCs). Each pair of VPCs that needs to communicate requires its own one-to-one connection
- No overlapping CIDR Blocks
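The last two rules can be sketched as checks on a small peering model. The VPC names and CIDR blocks are made up, and `peer` and `can_reach` are hypothetical helpers: peering is refused when CIDR blocks overlap, and reachability only follows direct peering connections, never a chain through a central VPC.

```python
import ipaddress
from typing import Dict, Set

# Hypothetical VPCs and their CIDR blocks.
VPCS: Dict[str, str] = {
    "hub": "10.0.0.0/16",
    "a": "10.1.0.0/16",
    "b": "10.2.0.0/16",
    "c": "10.1.128.0/17",   # overlaps with "a"
}
peerings: Set[frozenset] = set()


def peer(x: str, y: str) -> None:
    """Create a peering connection unless the CIDR blocks overlap."""
    net_x = ipaddress.ip_network(VPCS[x])
    net_y = ipaddress.ip_network(VPCS[y])
    if net_x.overlaps(net_y):
        raise ValueError(f"{x} and {y} have overlapping CIDR blocks")
    peerings.add(frozenset((x, y)))


def can_reach(x: str, y: str) -> bool:
    """Only a direct peering counts; there is no transitive routing."""
    return frozenset((x, y)) in peerings


peer("hub", "a")
peer("hub", "b")
print(can_reach("a", "hub"))  # direct peering exists
print(can_reach("a", "b"))    # both peer with the hub, but not with each other
```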