All Flashcards
IAM Policy Structure
Version: 2012-10-17
ID (optional): identifier for the policy
Statement: one or more individual statements
Statement consists of:
- Sid: identifier for the statement (optional)
- Effect: whether the statement allows or denies access
- Principal: account/user/role to which applied
- Action: list of actions the statement allows or denies
- Resource: list of resources to which the actions apply
- Condition: condition for when the policy is in effect (optional)
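The structure above can be illustrated with a made-up policy (every concrete value here, the Sid, ARNs, bucket name, and IP range, is a hypothetical example, not from the card):

```python
import json

# Hypothetical IAM policy showing the Version/Id/Statement structure.
policy = {
    "Version": "2012-10-17",
    "Id": "ExamplePolicyId",                    # optional
    "Statement": [
        {
            "Sid": "AllowS3Read",               # optional statement id
            "Effect": "Allow",                  # Allow or Deny
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/alice"},
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::example-bucket/*"],
            "Condition": {                      # optional
                "IpAddress": {"aws:SourceIp": "203.0.113.0/24"}
            },
        }
    ],
}

print(json.dumps(policy, indent=2))
```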
What is the issue for the following errors?
- Your application is not accessible and you get a “timed out” when trying to access it → Security Group issue (traffic is being blocked)
- Your application gives a “connection refused” error → Application error, or it’s not launched (the security group worked; traffic got through)
What are these Ports? 21 22 80 443 3389
21 - FTP (File Transfer Protocol) - upload files into a file share
22 - SSH (Secure Shell) - log into Linux instance, but also SFTP (Secure File Transfer Protocol) - upload files using SSH
80 - HTTP access to unsecured website
443 - HTTPS access to a secured website
3389 - RDP (Remote Desktop Protocol) - log into a Windows Instance
How to connect to Linux EC2 using PowerShell?
ssh -i PathTo.pemFile ec2-user@PublicIpAddressOfInstance
Pros and Cons of Cluster Placement Group - All instances on the same server rack in same AZ
Pros: Great Network - 10Gbps bandwidth between instances
Cons: If the rack fails, all instances fail at the same time
Use Case: Big data job that needs to complete fast or app that needs extremely low latency and high network throughput
Pros and Cons of Spread Placement Group - All instances are located on different racks/hardware, and across AZs
Pros: Can spread across AZs for reduced risk of simultaneous failure.
Cons: Limited to 7 instances per AZ per placement group
Use Case: App that needs to maximize high availability and critical apps where each instance must be isolated from failure from each other
Pros and Cons of Partition Placement Group - Each partition is a rack and each partition can have multiples in each AZ and spread across multiple AZ.
Pros: Up to 7 partitions per AZ and can spread across multiple AZs (within the same region) for up to 100s of instances. Don’t share the same hardware so a failure isn’t catastrophic
Use Cases: Big Data apps
What is an ENI?
Elastic Network Interface - acts as virtual network card and is a component of the VPC
They are bound to a specific AZ
What is EC2 Nitro?
New underlying platform for EC2 instances.
Will have higher speed EBS, better security, better networking
Why would you want to change the default vCPU options?
Sometimes licensing is charged based on number of cores. So the default of 2 threads per core and 8 cores (which would be 16 vCPU) could cost a lot. So if you want to keep the same amount of RAM, but don’t need all those vCPU, you can disable multithreading (allow just 1 thread per core) and lower the amount of overall cores to lower the cost of the licensing charges
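The arithmetic in this card can be sketched in a line (the helper name is mine):

```python
def vcpus(cores: int, threads_per_core: int) -> int:
    """vCPU count = physical cores x threads per core."""
    return cores * threads_per_core

# Default from the card: 8 cores, 2 threads per core = 16 vCPUs
assert vcpus(8, 2) == 16
# Disable multithreading (1 thread per core) and drop to 4 cores
# to cut per-core licensing costs while keeping the same RAM:
assert vcpus(4, 1) == 4
```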
EBS MultiAttach
Usually an EBS volume can only be attached to ONE EC2 instance at a time. However, io1/io2 family EBS volumes support Multi-Attach, letting you attach one volume to multiple instances within the same AZ.
Which can be mounted in multi AZ?
EFS or EBS
EFS
Can Windows instances have an EFS mounted?
No. Only for Linux
EFS Performance Modes
General Purpose - latency sensitive use cases (web servers)
Max I/O - higher latency, throughput, highly parallel (big data, media processing)
EFS Throughput Modes
Bursting - 1TB = 50MiB/s + burst of up to 100MiB/s
Provisioned - set your throughput regardless of storage size (ex: 1GiB/s for 1TB storage)
What layer is TCP? HTTP? HTTPS? Network?
Network = Layer 3
TCP = Layer 4
HTTP and HTTPS = Layer 7
What protocol and on which port does the Gateway Load Balancer use?
GENEVE protocol on port 6081
Sticky Sessions
This means that a client accessing an EC2 instance through a load balancer will be directed to the same EC2 instance every time. This is done via a “cookie”. With a duration-based cookie, the cookie eventually expires and the client is then directed to whichever instance the load balancer sees fit. You can also create your own cookie without an expiry date
Which load balancer always has Cross Zone Load balancing?
Application Load Balancer. There are no charges for inter-AZ data transfer
It can be enabled for NLB and CLB. Only NLB will charge you for inter-AZ data transfer.
What is SNI (Server Name Indication)?
SNI solves the problem of loading multiple SSL certificates onto one web server (to serve multiple websites)
It’s a newer protocol and requires the client to indicate the hostname of the target server in the initial SSL handshake
Only works for ALB & NLB or CloudFront (not CLB)
In other words, you can have an ALB or NLB balance traffic between 2 different websites at once. When a user wants to access one of the websites, it will use SNI to tell the load balancer which site they want, so the load balancer can select the right SSL certificate, and encrypt the traffic to the correct site
Connection Draining/Deregistration Delay
If using a CLB, it is called Connection Draining. If using an ALB or NLB, it is called Deregistration Delay
This is a load balancer setting for when an instance becomes unhealthy or is deregistering: it doesn’t shut down right away. The load balancer stops routing new traffic to it, but requests already routed to it get the draining time to finish before the instance is taken out of service. Default is 300 seconds. Can go up to 3600 seconds.
Which is becoming legacy and which is new between “Launch Configuration” and “Launch Template”? (used for auto scaling groups)
Configuration is legacy, template is newer
You are using an Application Load Balancer to distribute traffic to your website hosted on EC2 instances. It turns out that your website only sees traffic coming from private IPv4 addresses which are in fact your Application Load Balancer’s IP addresses. What should you do to get the IP address of clients connected to your website?
When using an Application Load Balancer to distribute traffic to your EC2 instances, the IP address you’ll receive requests from will be the ALB’s private IP addresses. To get the client’s IP address, the ALB adds a header called “X-Forwarded-For” that contains the client’s IP address.
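A minimal sketch of how a backend behind the ALB might read that header (the parsing helper and sample addresses are mine; the header name comes from the card):

```python
def client_ip(headers: dict) -> str:
    """Return the original client IP from X-Forwarded-For.

    The header holds a comma-separated chain; the left-most entry is
    the original client, later entries are proxies/load balancers.
    """
    xff = headers.get("X-Forwarded-For", "")
    return xff.split(",")[0].strip() if xff else ""

# Request as seen by the EC2 instance behind the ALB (example values):
headers = {"X-Forwarded-For": "198.51.100.7, 10.0.1.25"}
assert client_ip(headers) == "198.51.100.7"
```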
Application Load Balancers can route traffic to different Target Groups based on the following, EXCEPT:
Client Location
Hostname
Request URL Path
Source IP Address
Client Location. ALBs can route traffic to different Target Groups based on URL Path, Hostname, HTTP Headers, and Query Strings.
For compliance purposes, you would like to expose a fixed static IP address to your end-users so that they can write firewall rules that will be stable and approved by regulators. What type of Elastic Load Balancer would you choose?
Network Load Balancer has one static IP address per AZ and you can attach an Elastic IP address to it. Application Load Balancers and Classic Load Balancers have a static DNS name.
What are the RDS database engines?
Postgres MariaDB MySQL Oracle Microsoft SQL Server Aurora
Why is using RDS better than just launching your own database on an EC2 instance?
RDS is managed by AWS which means it has automated provisioning, OS patching, continuous backups, monitoring dashboard, read replicas, can be multi-AZ for disaster recovery, can set up maintenance windows, scaling ability, and backed by EBS.
Downside is that you can’t SSH into it since it is managed by AWS, not you.
Does RDS storage scale automatically?
Yes. You can set a maximum
Can RDS read replicas span across AZs?
Read Replicas can be within an AZ, cross AZ, or even cross region
Are RDS read replicas Asynchronous? (meaning that you can read them before they have a chance to match the main database exactly)
Yes, so reads are EVENTUALLY consistent.
What is RDS Multi AZ?
A feature used mainly for disaster recovery in which there is a SYNCHRONOUS standby replica that is completely unused unless the master fails. This standby shares the DNS name with the master database, so if the master database fails, the standby automatically takes over as master
How much downtime is average for going from a single-AZ RDS database to a multi-AZ database?
There is no downtime
If you create an RDS Database, and elect to not encrypt it, how can you later encrypt the read replicas made from this database?
You can’t.
At what step do you encrypt an RDS database?
Must be defined at launch time
What are 2 ways to encrypt your RDS Database?
AWS KMS - AES-256
Transparent Data Encryption (TDE) (only available for Oracle or SQL Server)
How do you encrypt an unencrypted RDS database?
Create a snapshot of the unencrypted database, copy that snapshot, and when you create that copy, you’ll have the option to enable encryption. Then restore the database from the encrypted snapshot. This creates a new, encrypted database to which you can migrate everything over to. Then delete the unencrypted database
Which database engines support access management to the RDS database via IAM authentication?
MySQL and PostgreSQL
RDS Shared Responsibility
Your responsibility:
- Check the ports/IP/security group inbound rules
- In-database user creation and permissions or manage through IAM
- Creating a database with or without public access
- Ensure parameter groups or DB is configured to only allow SSL connections
AWS responsibility: Since RDS is a managed service, you will have:
- No SSH access
- No manual DB patching
- No manual OS patching
- No way to audit the underlying instance
Aurora FAQs
- Not open sourced
- 5x performance over MySQL and over 3x for Postgres
- Storage automatically grows from 10GB up to 128TB
- Can have 15 read replicas and the process is much faster
- Failover is basically instant
- Costs ~20% more than RDS, but much more efficient
- Makes 6 copies of your data across 3 AZs
- One master, but any of the replicas can become master for failover
- Supports cross region replication
Aurora DB Cluster
There is a master that does read and write. But instead of you connecting to the master directly, you connect to a WRITER ENDPOINT that directs you to the master. (Good in case the master fails; you don’t have to find the new master.) Same thing for reading the read replicas. You don’t connect to them directly, but instead you connect to a READER ENDPOINT, which also acts as a load balancer for all the replicas. Replicas can also auto scale.
Aurora Security
Similar to RDS.
Aurora Custom Endpoints
Aurora automatically has a Reader Endpoint to guide you to read replicas and load balance. However, if you have a variety of instance sizes making up the read replicas, you may want to use larger instance sizes for more work intensive queries. To do this, you can create custom endpoints that override the default reader endpoint.
ElastiCache FAQs
- Caches are in-memory databases with really high performance, low latency
- RDS is to managed relational databases as ElastiCache is to managed Redis or Memcached
- Helps reduce load off of databases for read intensive workloads
- Helps make your app stateless
- AWS takes care of OS maintenance/patching, optimizations, setup, config, monitoring, failure recovery and backups
- Involves heavy app code changes to use
- Works kind of like CloudFront but for databases. App needs something from the database, but will check Elasticache first to see if it’s been cached.
Redis vs Memcached
Redis:
- Multi AZ with auto failover
- Read replicas to scale reads and have high availability
- data durability using AOF persistence
- backup and restore features
Memcached
- Multi-node partitioning of data (sharding)
- No high availability (replication)
- No persistence
- No backup and restore
- Multi-threaded architecture
ElastiCache Security
- Does not support IAM Authentication
- Redis AUTH lets you create a password (token) when you create a Redis user and supports SSL in flight encryption
- Memcached supports SASL-based authentication
Patterns for ElastiCache
Lazy Loading: all the read data is cached, data can become stale in cache
Write Through: Adds or updates data in the cache when written to DB (no stale data)
Session Store: store temp session data in a cache (using TTL features)
ElastiCache - Redis Use Case
- Gaming leaderboards are computationally complex
- Redis Sorted Sets guarantee both uniqueness and element ordering
- Each time a new element is added, it’s ranked in real time, then added in correct order
For http://api.www.example.com., identify the FQDN, Protocol, Domain Name, Sub Domain, Second Level Domain, Top Level Domain, and Root:
FQDN (Fully Qualified Domain Name): api.www.example.com. Protocol: http. Domain Name: api.www.example.com. Sub Domain: www.example.com. Second Level Domain: example.com. Top Level Domain: .com. Root: .
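The suffix hierarchy in that breakdown (read right to left from the root “.”) can be generated mechanically; the helper name is mine:

```python
def breakdown(fqdn: str) -> list:
    """Split an FQDN into successively shorter suffixes,
    from the full name down to the top-level domain."""
    labels = fqdn.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]

parts = breakdown("api.www.example.com.")
assert parts == [
    "api.www.example.com",  # domain name / sub-sub-domain
    "www.example.com",      # sub domain
    "example.com",          # second level domain
    "com",                  # top level domain
]
```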
How DNS Works
When you type in your URL (example.com), you are asking your local DNS Server for example.com. If your local DNS server (assigned by your company or ISP) doesn’t know it, the local DNS server will go ask the Root DNS Server (managed by ICANN). If the Root DNS server doesn’t know it, it will at least tell you where to look and give you the info for the TLD DNS Server since you are looking for a .com website. If the TLD DNS Server doesn’t know, it can give you a bit more info and lead you to the SLD DNS Server, which is managed by your Domain Registrar (Route 53, GoDaddy, etc), and that server will know the IP address for example.com. Your local DNS Server will now cache that info
Is Route 53 authoritative?
yes
Main Route 53 DNS record types?
A
AAAA
CNAME
NS
A:
AAAA:
CNAME:
NS:
A: maps a host name to IPv4
AAAA: maps a host name to IPv6
CNAME: maps a host name to another host name
- the target is a domain name that must have an A or AAAA record
- Can’t create a CNAME record for the top node of a DNS namespace (Zone Apex) (for example, you can’t create one for example.com, but you can for www.example.com)
NS: Name Servers for the hosted zone
- Control how traffic is routed for a domain
Route 53 Hosted Zones
A container for records that define how to route traffic to a domain and its subdomains
- Public Hosted Zones: contains records that specify how to route traffic on the internet (public domain names)
- Private Hosted Zones: contain records that specify how you route traffic within one or more VPCs (private domain names)
You pay $0.50 per month per hosted zone
CNAME vs Alias
AWS Resources (load balancers, CloudFront, etc) expose an AWS host name such as lb1-1234.us-east-2.elb.amazonaws.com. But if you want to use myapp.mydomain.com you can use:
CNAME - Point the ugly hostname to the pretty hostname. But ONLY for non-root domains. So it’ll work for myapp.mydomain.com, but not mydomain.com
Alias - Specific to Route 53; allows you to point the hostname to an AWS resource. This works for root domains as well as non-root domains. They are also free
Alias record targets
- ELB
- CloudFront
- API Gateway
- Elastic Beanstalk
- S3 websites (not the bucket, but the website endpoint)
- VPC Endpoints
- Global Accelerator
- Route 53 record in the same Hosted Zone
CANNOT for EC2 DNS name
Route 53 Routing Policies: Simple, Weighted, Failover, Latency Based, Geolocation, Multi-Value Answer, Geoproximity (using Route 53 Traffic Flow feature)
Simple - Typically route traffic to a single source (ask for a.com, get back 11.2.55.213) but can be multiple sources, and one is picked at random (ask for x.com, get back 1.2.3.4 as well as 5.6.8.7) (no health checks)
Weighted - Route traffic to multiple sources, but assign values to say one source is more or less likely than others (DNS records must have the same name and type, and there can be health checks). Typically used for load balancing across regions or testing a new app by sending only a small percentage of traffic to the testing instance. If one instance is given a weight of 0, no traffic will go there, but if ALL instances are given a weight of 0, they will all return traffic equally
Failover:
- Active/Passive: If an instance fails a health check, route 53 will direct traffic to the back up instance
Latency Based - Auto Directs traffic based on how quickly they can access the instance. (health checks available)
Geolocation: Based on where the user is located. Create a “default” record first in case there is no matching location. (health checks available) Does not auto direct, you set which locations go to which instance.
Multi-Value Answer - Up to 8 records will be available for a user to access. (health checks available)
Geoproximity (using Route 53 Traffic Flow feature) - route traffic based on both user location as well as resource location. Use bias values to give weight.
(example: there is a resource set up in N. Virginia and another set up in California, and there are 4 users accessing this resource. User locations are Virginia, California, Louisiana, New Mexico. If bias on both resources is set to 0, it will act like latency based routing. But if you give N. Virginia a bias of 50, and California a bias of 0, then that means N. Virginia has a wider reach, and will take the users in Virginia and Louisiana of course, but now will also take the New Mexico user)
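A rough simulation of weighted selection, including the weight-0 behavior described in the Weighted card (this is an illustrative model, not Route 53’s actual algorithm; all names and weights are made up):

```python
import random

def pick_record(records: dict) -> str:
    """records maps endpoint -> weight. Weight 0 means 'never picked',
    unless ALL weights are 0, in which case all are returned equally."""
    total = sum(records.values())
    if total == 0:
        return random.choice(list(records))
    r = random.uniform(0, total)
    acc = 0.0
    for endpoint, weight in records.items():
        acc += weight
        if r <= acc:
            return endpoint
    return endpoint  # guard against float rounding at the upper edge

random.seed(0)
records = {"prod": 95, "canary": 5, "off": 0}
picks = [pick_record(records) for _ in range(10_000)]
assert "off" not in picks                       # weight 0 gets no traffic
assert picks.count("canary") < picks.count("prod")
```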
Route 53 Health Checks
HTTP health checks are only for PUBLIC resources and will check if the resource is working before sending traffic.
- Monitor an Endpoint: about 15 global health checkers will report on the endpoint’s health
- Calculated Health Checks: combine the results of multiple health checks into one
- Private Hosted Zone: Since health checks can’t access instances in a private subnet, you would create a CloudWatch Alarm to go off when the instance becomes unhealthy, and the health checker can watch for that alarm instead of trying to watch the instance
Route 53 Traffic Flow
This is a UI to more easily create and manage DNS routing records
Elastic Beanstalk Cost?
Free itself, only pay for the services provisioned.
S3 Keys
Files have a key. If the bucket URL is s3://mybucket, then a jpg named bob in that bucket will have the URL of s3://mybucket/bob.jpg. If you place the bob.jpg in a folder called dude, then the URL for the jpg will be s3://mybucket/dude/bob.jpg. The key for the jpg is everything after the s3://mybucket. So the key for the bob.jpg in the dude folder would be dude/bob.jpg
To break down the key, dude/ would be the prefix and bob.jpg would be the object name
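The prefix/object-name split can be sketched as follows (the helper name is mine):

```python
def split_key(key: str):
    """Split an S3 object key into (prefix, object name)."""
    if "/" in key:
        prefix, name = key.rsplit("/", 1)
        return prefix + "/", name
    return "", key

# The example from the card:
assert split_key("dude/bob.jpg") == ("dude/", "bob.jpg")
# An object at the bucket root has an empty prefix:
assert split_key("bob.jpg") == ("", "bob.jpg")
```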
S3 Encryption Options
SSE-S3 - keys handled and managed by AWS
- AES-256
- We upload the object, S3 provides and applies the key
- Header will be: "x-amz-server-side-encryption": "AES256"
SSE-KMS - AWS KMS to handle and manage keys
- Advantages are user control + audit trail
- We upload the object, S3 provides and applies the key
- Header will be: "x-amz-server-side-encryption": "aws:kms"
SSE-C - You manage your own keys
- S3 does not store the key
- HTTPS must be used
- We upload the object and the key, but S3 will still apply the key to encrypt
Client Side Encryption
- You encrypt the object before uploading to S3 and decrypt it when retrieving it
S3 Security
User based: IAM Policies - Which API calls should be allowed for a specific user from IAM console
Resource based: Bucket Policies - bucket wide rules from S3 console, allows cross account
SDK
Must use when coding against AWS Services such as DynamoDB
S3 MFA-Delete
Must enable versioning
Only the root user can enable/disable MFA-Delete
Can only be enabled using the CLI, not the console
Once enabled, you’ll need to get an MFA code to permanently delete an object or suspend versioning
Amazon Glacier Retrieval Options
Expedited - 1 to 5 minutes
Standard - 3 to 5 hours
Bulk - 5 to 12 hours
(minimum storage duration is 90 days)
Amazon Glacier Deep Archive Retrieval options
Standard - 12 Hours
Bulk - 48 hours
Minimum storage duration is 180 days
S3 KMS Limitation
KMS does have a limit of requests per second (varies by region), so if you have this as your default encryption on your S3 bucket, then it’s possible to hit that limit
S3 multi-part upload
Recommended for files over 100MB
Required for files over 5GB
S3 Transfer Acceleration
Increase transfer speed by transferring file to an AWS edge location which will forward the data to the S3 bucket in the target region. This means that we only use the public internet to get it to the edge location, then use the private AWS network to get it from the edge location to the bucket
S3 byte range fetches
Can be used to speed up downloads. Downloads in parts instead of all in a single file
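A sketch of how a client might compute the chunks for parallel "Range: bytes=start-end" requests (the helper and sizes are mine):

```python
def byte_ranges(size: int, chunk: int):
    """Yield (start, end) inclusive byte ranges covering an object of
    `size` bytes, suitable for parallel Range requests."""
    for start in range(0, size, chunk):
        yield start, min(start + chunk, size) - 1

# A 10,000-byte object fetched in 4,000-byte parts:
ranges = list(byte_ranges(size=10_000, chunk=4_000))
assert ranges == [(0, 3999), (4000, 7999), (8000, 9999)]
```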
S3 select and Glacier Select
This allows you to use SQL to make requests from specific parts of a file. So if you only need a few rows and columns from a CSV file, you can request this, and S3 will filter out what you want, and deliver it
S3 - Requester Pays
In general, bucket owner pays for all costs. This will allow for the requester to be billed for the request. (bucket owner still pays for the storage, just not the retrieval)
Athena
Serverless query service to perform analytics against S3 objects
uses SQL to query
Unicast vs Anycast IP (Global Accelerator uses Anycast IP)
Unicast IP: one server holds one IP address
Anycast IP: All servers hold the same IP address and the client is routed to the nearest one. So a user will want to connect to your ALB. Instead of connecting via the public internet, they will connect to the closest edge location, then travel the rest of the way over the AWS private network
Snowball Edge
FOR STORAGE ONLY:
40 vCPUs
80TB
Use for data transfers of less than a petabyte
FOR EDGE COMPUTING:
52 vCPUs, 208GB RAM
Optional GPU
42TB storage
Snowcone
FOR STORAGE ONLY:
8TB
Use for data transfers up to 24TB
FOR EDGE COMPUTING:
2 CPUs
4GB memory
Wired or wireless access
USB-C power with optional battery
Snowmobile
100PB
AWS OpsHub
A software you install locally to give a GUI for using the snow devices. (without this, a CLI is needed)
Amazon FSx
Allows you to launch 3rd-party high-performance file systems on AWS, such as:
FSx for Lustre
FSx for Windows File Server
FSx for Windows File Server
EFS is a shared POSIX system for Linux systems (so it can’t be used for Windows instances). This is when you’d use FSx for Windows File Server
Supports SMB protocol & Windows NTFS
Scalable
Can be multi AZ
backed up to S3
FSx for Lustre
A type of parallel distributed file system for large-scale computing
The name Lustre is a combo of Linux and cluster
Used for things like Machine Learning and High Performance Computing (HPC)
Scalable
Seamless integration with S3 (read and write)
FSx File System Deployment Options
Scratch File System: Temporary; data is not replicated. Very fast (200MBps per TB). Used for short-term processing
Persistent File System: Long term storage and data is replicated within same AZ. Replaces failed files within minutes. Use for long term processing
How do you expose S3 for on prem access?
Storage Gateway
Storage Gateway
bridge between on prem and cloud
3 TYPES
File Gateway:
- Configured S3 buckets are accessible using the NFS and SMB protocol
- Supports S3 standard, IA, One Zone IA
- Bucket access using IAM roles for each file gateway
- Most recently accessed data is cached in the file gateway
- Can be mounted on many servers
- Integrated with Active Directory (AD) for user authentication
Volume Gateway -
- Block Storage using iSCSI protocol backed by S3
- Backed by EBS snapshots which can help restore on prem volumes
- There are both Cached volumes for low latency access to most recent data, or Stored Volumes to have the entire dataset on prem with scheduled backups to S3
Tape Gateway - Some companies have backup processes using physical tapes
- Virtual Tape Library (VTL) backed by S3 and Glacier
Storage Gateway Hardware
The gateway types above run as software you install on your own servers. If you don’t have the capacity for that, you can buy a dedicated piece of hardware (the Storage Gateway Hardware Appliance)
FSx File Gateway
Native access to FSx for Windows File Server
Local cache for frequently accessed data
AWS Transfer Family
Fully managed service to transfer files into and out of S3 and EFS
Uses FTP, FTPS, or SFTP (FTP is the only one that isn’t encrypted)
Storage Comparison
S3: Object storage
Glacier: Object Archival
EFS: Network file system for LINUX
FSx for Windows: same as EFS but for windows
FSx for Lustre: HPC Linux file system
EBS: network storage for one EC2 instance at a time
Instance Store: Physical storage attached to the EC2 host. Faster than EBS, but the data goes away when the EC2 instance stops
Storage Gateway: transfers on-prem to and from cloud
Snow family: same as Storage Gateway, but with physical devices instead of over the network
Database: indexing and querying
SQS
Default Retention: 4 days, maximum of 14 days
Less than 256KB per message
SQS Dead Letter Queue
If a message is failed to be processed, it goes back to the queue, and if this keeps happening, you can have the message leave the main queue and enter a “Dead Letter Queue” so you can review it later and debug the issue
Kinesis
Data Streams: capture, process, and store data streams
Firehose: load data streams into AWS data stores
Data analytics: analyze data streams with SQL or Apache Flink
Video Streams: capture, process, and store video streams
Kinesis Data Streams
Streams are made up of shards
Billing is per shard provisioned
retention is between 1 and 365 days
Kinesis Firehose
Takes data from producers (like Kinesis Data Streams, CloudWatch, IoT) and batches it together to write to destinations like S3, Redshift, or Elasticsearch
Pay for data going through.
Kinesis Data Streams vs Kinesis Data Firehose
Data streams:
- Streaming service for ingest at scale
- Write custom code
- Real time (~200ms)
- Managed scaling (shard splitting/merging)
- Data storage for 1 to 365 days
- Supports replay capability
Firehose:
- Load streaming data into S3, Redshift, ElasticSearch
- Fully Managed
- Near real time (since it does things in batches)
- Auto scaling
- No data storage
- Doesn’t support replay capability
Kinesis Data Analytics (SQL Application)
Data Streams and Firehose will be the source of data sent to Data Analytics. You then write SQL code to process this data in real time and have the results of the SQL query either sent back to Data Streams (to be consumed by Lambda or EC2 apps) or to Firehose, which can send to S3, Redshift, or Elasticsearch
Pay for data that goes through it
Amazon MQ
Use when migrating a non-AWS queueing system from on-prem to the cloud without redoing everything
ECS Task Roles
Allow each task to have a specific role and use different roles for different ECS services you run
How do ECS tasks share data?
You can mount EFS volumes to tasks, doesn’t matter if they are launched in an EC2 instance, or if they are launched via Fargate. It is also multi-AZ
ECS Scaling
Service CPU Usage - Launch a new task if CPU usage goes above a certain percentage, or vice versa. If you are using EC2 instances instead of Fargate, you also have the option to use ECS Capacity Provider scaling, which will launch a new instance if your current instances don’t have enough room for a new task when the tasks need to scale out
SQS Queue - have a queue set up between users and tasks, once the queue gets to be too long, scale. ECS Capacity Provider available here as well
ECS Rolling Updates
When updating from v1 to v2, we can control how many tasks can be started and stopped and in which order.
So if you have 4 tasks on V1, and need to update to V2. You can set 50% min and 100% max. So of the 4 tasks running, 2 will go offline to update. Once those 2 come back online as V2, the other two V1s will go offline to update. So you never have less than 50% of your 4 tasks running
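The 50%-minimum arithmetic from the example can be sketched (the helper is mine; ECS computes this internally from the minimum healthy percent setting):

```python
import math

def tasks_to_stop(desired: int, min_healthy_pct: int) -> int:
    """How many tasks can be taken offline at once while keeping
    at least min_healthy_pct of the desired count running."""
    min_running = math.ceil(desired * min_healthy_pct / 100)
    return desired - min_running

# The card's example: 4 tasks, 50% minimum -> 2 can update at a time
assert tasks_to_stop(4, 50) == 2
# At 100% minimum, new tasks must start before old ones stop
assert tasks_to_stop(4, 100) == 0
```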
ECR
Used to store, manage and deploy containers.
EKS (K = Kubernetes)
Way to launch managed Kubernetes clusters. Alternative to ECS. Kubernetes is open source, while ECS is AWS-proprietary.
Tasks are called Pods in EKS
Serverless services in AWS
Lambda, DynamoDB, Cognito, API Gateway, Amazon S3, SNS, SQS, Kinesis Data Firehose, Aurora Serverless, Step Functions, Fargate
Lambda
Virtual functions limited by time (15 minutes maximum)
Run on demand; scaling is automated
Up to 10GB of RAM per function
Increasing RAM will also improve CPU and network
Supports the following languages:
- Node.js
- Python
- Java
- C#/PowerShell
- Golang
- Ruby
- Custom Runtime API (to run any language)
Containers on Lambda
The container image must implement the Lambda Runtime API. If it is an arbitrary Docker image without the Lambda Runtime API, then use ECS/Fargate
Main Lambda Integrations
API Gateway (create a REST API), Kinesis, DynamoDB, S3, CloudFront, CloudWatch Events/EventBridge, CloudWatch Logs, SNS, SQS, Cognito
Lambda use case examples
- You can set up a Lambda function with S3 so that when an image is uploaded to S3, the Lambda function triggers, creates a thumbnail version of that image, and places it in the same or a different bucket. You can also tie it to DynamoDB to store the thumbnail’s metadata (like creation date, dimensions, etc)
- A CRON job can be set up (CRON = a scheduled task). You set up a CloudWatch Events/EventBridge rule to fire every X minutes/hours, and this triggers the Lambda function. Better than using EC2, since EC2 would have to run constantly while Lambda only runs when triggered
Lambda pricing
First 1,000,000 requests are free, then $0.20 per million requests after that. You also pay for the duration of compute time. The first 400,000 GB-seconds are free. (This means you have 400,000 free seconds of compute time if the function is 1GB; if it’s only 128MB, you get 3.2 million seconds.) After that, it’s about $1 per 60,000 GB-seconds.
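The free-tier arithmetic can be checked (the helper name is mine):

```python
def free_compute_seconds(memory_mb: int, free_gb_seconds: int = 400_000) -> float:
    """Free seconds of execution = free GB-seconds / memory in GB."""
    return free_gb_seconds / (memory_mb / 1024)

# The card's examples:
assert free_compute_seconds(1024) == 400_000     # 1GB function
assert free_compute_seconds(128) == 3_200_000    # 128MB function
```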
Lambda Limits (per region)
Execution:
- Memory Allocation: 128MB - 10GB (1MB increments)
- Maximum execution time: 900 seconds (15 minutes)
- Environment variables (4kb)
- Disk capacity in the “function container” (in /tmp): 512MB
- Concurrent executions: 1000 (can be increased)
Deployment:
- Lambda function deployment size (compressed .zip): 50MB
- Size of uncompressed deployment (code + dependencies): 250MB
- Can use the /tmp directory to load other files at startup
- Size of environment variables: 4KB
Lambda@Edge
Deploy Lambda functions at edge locations
Can modify CloudFront requests and responses. When a user accesses CloudFront (to reach your website/app and have the origin return a response), Lambda can modify the request from the user to CloudFront, the request from CloudFront to the origin, the response from the origin to CloudFront, and the response from CloudFront back to the user.
DynamoDB
Fully managed, highly available (replication across multiple AZs)
NoSQL
millions of requests per second
low latency retrieval
Made of Tables
- each table has a primary key that is created at the time the table is created
- each table can have an infinite number of items (rows)
- each item has attributes (can be added over time - can be null)
- maximum size of an item is 400KB
Data types supported are:
- Scalar
- Document
- Set
DynamoDB - Read/Write Capacity Modes
Control how you manage your table’s capacity
Provisioned Mode:
- Specify the number of reads/writes per second at time of creating
- So you need to plan capacity beforehand
- Pay for what you provisioned
- Possibility to add auto-scaling
On-Demand Mode:
- Read/writes auto scale
- No capacity planning needed
- Pay for what you use, but this is more expensive
DynamoDB Accelerator (DAX)
In-Memory Cache
helps solve read congestion by caching
5min TTL for cache (default)
DynamoDB Streams
Ordered stream of item-level modifications (create/update/delete) in a table
Stream records can be:
- sent to Kinesis Data Streams
- read by Lambda
- read by Kinesis Client Library apps
Data retention for up to 24 hours
DynamoDB Global Tables
Having the same table replicated in multiple regions. “Active-Active” replication.
Can READ and WRITE to the table from any region
Must enable DynamoDB streams as a pre-requisite
DynamoDB TTL (time to live)
Can set an expiry time for items or rows to automatically delete after a certain amount of time
DynamoDB Indexes
GSI - Global Secondary Index
LSI - Local Secondary Index
Allow querying on attributes other than the primary key
DynamoDB Transactions
Can write to more than one table at once (or neither).
Example: if your app can process a bank transaction, you’d want to write to the table that is the ledger of transactions, but also write to the table that lists the account balance. must write to both. Can’t write to one without the other
API Gateway
If you want a client to be able to invoke a Lambda function, there are a few options. You could let the client talk directly to the Lambda, but then the client needs IAM permissions. You could place an ALB between the client and the Lambda, which exposes the Lambda as an HTTP endpoint.
OR
We could use API Gateway. This creates a REST API that is public and accessible, and the API Gateway passes client requests along to the Lambda.
API Gateway Integrations
Lambda Function
HTTP
AWS Service (expose any AWS API through API Gateway)
API Gateway - Endpoint Types
Edge-Optimized (default):
- For global clients
- requests are routed through edge locations
- the API Gateway still lives in only one region
Regional:
- For clients in the same region
- Can manually combine with cloudfront
Private:
- Can only be accessed from your VPC using an interface VPC endpoint (ENI)
API Gateway - Security
IAM Permissions: Create an IAM policy for a user/role; API Gateway verifies permissions before sending the request to the backend app (uses “Sig v4” to do so). Only for internal use, since you can’t give IAM permissions to someone outside the company.
Lambda Authorizer (formerly Custom Authorizer): Uses Lambda to validate the token passed in the header. Option to cache the authentication result.
Cognito User Pool: Cognito fully manages the user lifecycle and API Gateway verifies identity with Cognito. Only authentication, not authorization. So the user gets a token from Cognito once authenticated, then sends the token to API Gateway to authorize based on the token.
Cognito
Used when we want to give our users an identity so that they can interact with our app.
Cognito User Pools:
- Sign in functionality for app users
- Integrate with API Gateway
Cognito Identity Pools (Federated Identity):
- Provide AWS credentials to users so they can access AWS resources directly
- Integrate with cognito user pools as an identity provider
Cognito Sync:
- Sync data from device to cognito
- May be replaced by AppSync
Cognito User Pools (CUP)
Create a serverless database of users for your mobile apps
Simple Login: username/email and password combo
can enable federated identities (facebook, google etc logins)
sends back JSON web token that can be used with API Gateway for authentication
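To illustrate that token: a JWT is just three base64url-encoded segments (header.payload.signature). A minimal sketch that decodes the payload of a hand-built, unsigned, hypothetical token — real tokens must have their signature verified before being trusted:

```python
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    # base64url strips padding; restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# build a sample unsigned token for illustration (hypothetical claims)
header = base64.urlsafe_b64encode(json.dumps({"alg": "none"}).encode()).decode().rstrip("=")
claims = {"sub": "user-123", "iss": "https://cognito-idp.example.com/pool"}
payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
token = f"{header}.{payload}."

assert decode_jwt_payload(token)["sub"] == "user-123"
```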
Federated Identity Pools
Goal:
- Provide direct access to AWS resources from the client side
How:
- Log in to federated identity provider - or remain anonymous
- Get temp AWS credentials that come with IAM permissions
ex: provide temp access to write to S3 using Facebook login
SAM - Serverless Application Model
Framework for developing and deploying serverless apps
- All config is YAML code
Databases Decisions
What are the needs?
- read heavy? write heavy? balanced?
- throughput needs? consistent or changing?
- how much data store? for how long? will it grow? object size? how are they accessed?
- Data durability? source of truth for the data?
- latency requirements? concurrent users?
- data model? how will you query the data? joins? structured? semi-structured?
- strong schema? flexibility? noSQL?
- license costs?
Database Options
- RDBMS (=SQL/OLTP): RDS, Aurora - great for joins
- NoSQL database: DynamoDB (similar to JSON docs), ElastiCache (key/value pairs), Neptune (graphs) - no joins, no SQL
- Object Store: S3 (for big objects)/Glacier (for backups/archives)
- Data warehouse: (=SQL analytics/Bi): redshift (OLAP), Athena
- Search: ElasticSearch (JSON): free text, unstructured searches
- Graphs: Neptune: display relationships between data
RDS Engines
Managed PostgreSQL/MySQL/Oracle/SQL Server
Aurora Engines
PostgreSQL/MySQL
ElastiCache Engines
A cache, so Redis/Memcached
DynamoDB Engines
AWS proprietary
Redshift Engine
PostgreSQL
CloudWatch Dashboard
Global, so they can include graphs from different regions and even different AWS accounts
3 dashboards (up to 50 metrics each) for free; $3 per dashboard per month after that
CloudWatch Logs for EC2
By default, no logs are sent to CloudWatch. Must run a CloudWatch agent on EC2 to do so. Also, need to make sure IAM perms are correct
Can also install the agent on an on-prem server as well
EC2 Instance Recovery
Instance Status: check the EC2
System status: Check the underlying hardware
Recovery: If instance goes into alarm, then a cloudwatch alarm can start a recovery
AWS Config
Helps record config changes over time
Regional, but can be aggregated across regions and accounts and even stored in S3 to be later queried by Athena
CloudWatch vs CloudTrail vs Config
CloudWatch:
- Performance monitoring and dashboards
- Events and Alerts
- Log aggregation & analysis
CloudTrail:
- Record API calls within your account by everyone
- Can define trails for specific resources
- Global Service
Config:
- Record config changes
- Evaluate resources against compliance rules
- Get timeline of changes and compliance
CloudWatch/CloudTrail/Config uses with an ELB
CloudWatch:
- Monitor incoming connections metric
- Visualize error codes as a % over time
- Make a dashboard to get an idea of your load balancer performance
Config:
- Track security group rules for the ELB
- Track config changes for the ELB
- Ensure an SSL cert is always assigned to the ELB
CloudTrail:
- Track who made any changes to the ELB with API calls
STS - Security Token Service
Grant limited access to AWS resources for 1 hour (can be refreshed)
When you see “cross account access” think STS
Using STS to assume a role
- Define an IAM Role within your account or cross account
- Define which principals can access this IAM roles
- Use STS to retrieve credentials and impersonate the IAM role you have access to
- Valid from 15 minutes to 1 hour
Identity Federation
Federation lets users outside of AWS assume temp roles for accessing AWS resources.
Using federation, you don’t need to create IAM users. (user management is outside of AWS)
Identity Federation: SAML 2.0
Integrate with Active directory
provide access to console or CLI
no need to create an IAM user for each of the employees
ex: Employee in an on-prem location needs to access aws console. the employee will browse to the on-prem identity provider that will verify with an identity store. This will authenticate the employee to then get an authorization from STS and assume a role that will grant them access to the console
This is the “old way”, and the new way is “Single SignOn”
Identity Federation: Custom Identity Broker Application
Use only if the provider is not SAML-compatible. Similar to SAML, but the difference in the example is that instead of the employee requesting authorization after authentication with the identity provider, the identity broker requests the authorization and then passes the credentials to the employee
Identity Federation: AssumeRoleWithWebIdentity
Not recommended by AWS, use Cognito instead
With Cognito:
Goal - provide direct access to AWS resources from the client side (mobile, webapp)
Example - provide temp access to write to S3 bucket using Facebook login
Problem - We don’t want to create IAM users for our app users
How - Login to federated identity provider (facebook, google, etc) or remain anonymous and get temp access from the federated identity pool. Comes with a pre-defined IAM policy stating their permissions
Microsoft Active Directory
Database of objects: user accounts, computers, printers, file shares, security groups
Centralize security management
objects are organized in trees
AWS Directory Service
AWS Managed Microsoft AD (Active Directory)
- Create your own AD in AWS, manage users locally, supports MFA
- Establish a “trust” connection with your on-prem AD.
- Can authenticate on AWS or on-prem ADs
AD Connector
- Directory gateway (proxy) to redirect to on-prem AD
- Users are managed on the on-prem AD
- Only authenticate with on-prem AD. Requests through AWS AD are forwarded to on-prem AD
Simple AD
- AD-compatible managed directory on AWS
- Cannot be joined with on-prem AD
- Just the AWS AD
AWS Organizations
- Global Service
- Allows you to manage multiple AWS accounts
- The main account is the master account (can’t change)
- Other accounts are “member” accounts
- Member accounts can only be part of one organization
- Consolidated Billing across all accounts - single payment method
- Pricing benefits from aggregated usage (volume discounts)
- API is available to automate AWS account creation
Multi Account Strategies
- Keep services from talking to each other
- Compliance
- Staggered billing
Can still enable cloudwatch on all accounts and send logs to a central S3
Organizational Units (OU)
A way to separate accounts for security purposes.
Ex: Sales OU, Retail OU, Finance OU, HR OU etc.
Service Control Policies (SCP)
- Whitelist or Blacklist IAM actions
- Applied at OU or account level
- Does not apply to master account
- SCP is applied to all the users and roles of the account, including root
- SCP doesn’t allow anything by default
AWS Organization - Moving Accounts
To move an account from Org 1 to Org 2, the account must:
- Be removed from Org 1
- Get an invite to Org 2
- Accept invite to Org 2
If you want to move the master account, you must:
- Remove ALL the member accounts from Org 1
- Delete Org 1
- Repeat process above to get to Org 2
IAM Conditions
- aws:SourceIp: restrict the client IP FROM which the API calls are being made
(This means you can do things like deny access to everything unless it comes from a specific IP)
- aws:RequestedRegion: restrict the region the API calls are made TO
(This means you can allow access to certain services ONLY if the request comes from a certain AWS Region)
can also have tag based restrictions as well as MFA based restrictions
IAM Permission Boundaries
- Supported for users and roles, but NOT groups
- Can set the maximum permissions an IAM entity can get
Ex: You can create a user and give them the Admin policy, but then set a permissions boundary that only allows access to S3. The boundary overrules the Admin policy, so they can only do S3 things
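Conceptually, the effective permissions are the intersection of the identity policy and the boundary. A toy sketch using literal string matching (real IAM evaluates wildcards and conditions, which this ignores):

```python
# hypothetical action lists; effective = what BOTH the policy and boundary allow
admin_policy = {"s3:*", "ec2:*", "iam:*"}  # identity policy (Admin-like)
boundary = {"s3:*"}                        # permissions boundary

effective = admin_policy & boundary        # set intersection
assert effective == {"s3:*"}               # only S3 actions remain
```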
Resource Access Manager (RAM)
Share AWS resources that you own with other AWS accounts or within your org such as:
VPC Subnets:
- allows all resources to be launched in the same subnets
- must be from the same org
- cannot share security groups or default VPC
- participants can manage their own resources
- participants can’t view/modify/delete resources that don’t belong to them
AWS Single Sign-On (SSO)
Centrally managed Single Sign-On to access multiple accounts and 3rd party apps
Integrated with organizations
Supports SAML
Integration with On-prem AD
centralize perm management
centralized auditing with cloudtrail
KMS (Key management service)
Customer Master Key (CMK) Types
Symmetric (AES-256 keys):
- First offering
- Necessary for envelope encryption
- You never get access to the key unencrypted (must call the KMS API to use it)
Asymmetric (RSA&ECC key pairs)
- Public key to encrypt, private key to decrypt
- Used for encrypt/decrypt, but also sign/verify ops
- Public key is downloadable, but you can’t access the private key unencrypted
- Use case: encryption outside AWS where the user can’t call the KMS API
KMS FAQs
- Able to audit key usage using CloudTrail
- 3 types of CMKs
- AWS Managed Service Default CMK: free
- User Keys created in KMS $1/month
- User Keys imported (must be 256-bit symmetric) $1/mo
- Pay for API call to KMS ($0.03/10,000 calls)
- Can only encrypt 4kb of data per call, if more than 4kb, use envelope encryption
- To use, make sure both the key policy allows the user and the IAM policy allows the API calls
- Keys are bound to a specific region.
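The envelope-encryption pattern behind that 4KB limit can be sketched as follows, using a trivial XOR stand-in for the real cipher (NOT secure, illustration only): “KMS” generates a data key and returns it in both plaintext and CMK-encrypted form, and the client encrypts the large payload locally with the plaintext key, keeping only the encrypted key:

```python
import os

def xor(data: bytes, key: bytes) -> bytes:
    # toy stand-in for a real symmetric cipher (NOT secure)
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# "KMS" side: the CMK never leaves the service
cmk = os.urandom(32)

def generate_data_key():
    plaintext_key = os.urandom(32)
    return plaintext_key, xor(plaintext_key, cmk)  # (plaintext, encrypted form)

def decrypt_data_key(encrypted_key: bytes) -> bytes:
    return xor(encrypted_key, cmk)

# client side: encrypt a large payload locally, store only the encrypted key
data_key, encrypted_key = generate_data_key()
big_payload = b"x" * 100_000            # far larger than KMS's 4KB direct limit
ciphertext = xor(big_payload, data_key)
del data_key                            # discard the plaintext key after use

recovered = xor(ciphertext, decrypt_data_key(encrypted_key))
assert recovered == big_payload
```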
Copying a KMS encrypted data store (such as EBS) across regions
To do this, you must first take a snapshot of the volume. This snapshot will also be encrypted with KMS. Then you copy that snapshot to the new region. This will retain the KMS encryption, but will change the key. Then recreate the volume using the snapshot which will remain encrypted with the newly created KMS Key
KMS Key Policies
- Cannot control access without the policies
Default Key Policy:
- created if you don’t provide a specific policy
- Grants complete access to the key to the root user
- To give someone access to the keys, create the IAM policies to access the KMS Key
Custom KMS Policy:
- Define users, roles, who can admin
- Useful for cross-account access of keys
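A hypothetical custom key policy statement granting another account (made-up account ID) use of the key — the shape follows the standard IAM policy structure:

```python
import json

# hypothetical policy: let account 111122223333 use this key
key_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowCrossAccountUse",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"],
            "Resource": "*",  # in a key policy, "*" means this key itself
        }
    ],
}
print(json.dumps(key_policy, indent=2))
```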
Copying snapshots across accounts
- Create snapshot encrypted with your own CMK
- Attach key policy to authorize cross-account
- Share encrypted snapshot
- The receiver would need to create a copy of the snapshot, encrypt it with a KMS key in their account and create the volume from that snapshot
KMS Automatic Key Rotation
Only for customer managed CMK (not the default AWS managed CMKs)
- if enabled, will be every year
- previous key is kept active to decrypt old data
- new key has same ID but the backend is different
KMS Manual Key Rotation
- Can pick rotation periods (90days, 180 days)
- New key has different ID since you create it manually
- Keep previous key active
- better to use aliases* (hides the change of key for the app)
- Good solution to rotate CMK that are not eligible for auto-rotation (like asymmetric CMK)
*Aliases are basically fronts between the app and the actual key. The app calls on the alias, and the alias will direct the app to the correct key. So you can change keys all the time, but the app won’t know, because it is only in contact with the alias
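A toy sketch of that alias indirection (hypothetical key IDs): the app resolves only the alias, so repointing the alias rotates the key without any app change.

```python
# hypothetical key store and alias table
keys = {"key-2023": "old-material", "key-2024": "new-material"}
aliases = {"alias/app-key": "key-2023"}

def resolve(alias: str) -> str:
    # the app only ever sees the alias, never the key ID
    return keys[aliases[alias]]

assert resolve("alias/app-key") == "old-material"
aliases["alias/app-key"] = "key-2024"  # manual rotation: repoint the alias
assert resolve("alias/app-key") == "new-material"
```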
SSM Parameter Store
- Secure Storage for config and secrets
- Optional seamless encryption using KMS
- Serverless, scalable, durable, easy SDK
- Versioning
- Can be used with cloudwatch events
- Integrated with cloudformation
Parameter Policies (only available for advanced tier, not free tier)
- Allow to assign TTL to a parameter to force updating or deleting sensitive data such as passwords
- Can assign multiple policies at a time
AWS Secrets Manager
- Newer service
- Meant for storing secrets
- Capability to force rotation every X days
- Automates generation of secrets on rotation (uses Lambda)
- Integration with RDS
- encrypted with KMS
Anytime you see “secret store” “secret manager” “rds integration”, think Secrets Manager
CloudHSM
- KMS = AWS manages the software for encryption
- CloudHSM = AWS provisions encryption hardware
- Dedicated Hardware (HSM = Hardware Security Module)
- You manage your own encryption keys entirely (not AWS)
- HSM devices are tamper resistant
- Supports symmetric and asymmetric keys
- no free tier
- Must use the CloudHSM Client Software
- Redshift integration
- Multi AZ
- Good option to use with SSE-C encryption
Shield (DDoS protection)
Shield Standard:
- Free and auto activated
Shield Advanced:
- $3000/mo
- Protect against more sophisticated DDoS attacks
- 24/7 access to the DRT (DDoS Response Team)
- protect against higher fees during usage spikes due to DDoS
Web Application Firewall (WAF)
- Protects your web app from common web exploits (Layer 7)
- Layer 7 is HTTP (Layer 4 is TCP)
- Can ONLY be deployed on ALB, API Gateway, CloudFront
To use, you need to define a Web ACL (Access Control List):
- Rules can include IP addresses, http headers, http body, or URL strings
- protects from common attacks like SQL injection and cross-site scripting (XSS)
- Size constraints, geo-match (block countries)
- Rate-based rules (to count occurrences of events) for DDoS protection (ex: a specific IP can’t make more than 5 calls in 60 seconds)
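A rate-based rule like “no more than 5 calls per IP in 60 seconds” can be modeled as a sliding-window counter. A toy sketch, not the actual WAF implementation:

```python
from collections import defaultdict, deque

class RateRule:
    """Toy model of a rate-based rule: block an IP exceeding N calls per window."""
    def __init__(self, limit: int = 5, window: float = 60.0):
        self.limit, self.window = limit, window
        self.calls = defaultdict(deque)  # ip -> timestamps of recent calls

    def allow(self, ip: str, now: float) -> bool:
        q = self.calls[ip]
        while q and now - q[0] >= self.window:
            q.popleft()          # drop calls that fell out of the window
        if len(q) >= self.limit:
            return False         # rule matched: block this request
        q.append(now)
        return True

rule = RateRule(limit=5, window=60)
results = [rule.allow("1.2.3.4", t) for t in range(6)]
assert results == [True] * 5 + [False]  # 6th call within the window is blocked
```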
Firewall Manager
Manage rules in all accounts of AWS Organization
- WAF Rules
- Shield advanced
- security groups
GuardDuty
- Intelligent Threat Discovery to protect AWS Account
- Uses Machine Learning algorithms, anomaly detection, 3rd party data
- Can protect against crypto currency attacks (has a dedicated finding for it)
- Monitors VPC Flow Logs, CloudTrail Logs, DNS Logs
Inspector
- Automated Security Assessments for EC2 instances (ONLY for EC2)
- Analyze the running OS against known vulnerabilities
- Analyze against unintended network accessibility
- AWS Inspector agent must be installed on OS in EC2 instance
Once installed, it will assess then get you a report. Can send notification to SNS
Macie
- Fully managed.
- Uses machine learning and pattern matching to discover and protect your sensitive data in AWS
- Helps ID and alert you to personally identifiable info (PII)
CIDR - IPv4
CIDR = Classless Inter-Domain Routing. Helps identify IP ranges:
- ww.xx.yy.zz/32 = 1 IP
- 0.0.0.0/0 = all IPs
- 192.168.0.0/26 = 192.168.0.0 through 192.168.0.63 (64 IP addresses)
CIDR - IPv4
Consists of 2 components:
Base IP
- Represent an IP contained in the range
- Example: 192.168.0.0
Subnet Mask
- Defines how many bits can change in the IP
- Example: /0, /24, /32
- Can take 2 forms:
} /8 = 255.0.0.0
} /16 = 255.255.0.0
} /24 = 255.255.255.0
} /32 = 255.255.255.255
Subnet Masks
The /number determines how many IPs we can have: a /n prefix leaves 32 − n free bits, giving 2^(32−n) IPs.
Example:
192.168.0.0/32 = 1 IP, because 32 − 32 = 0 and 2^0 = 1
192.168.0.0/31 = 2 IPs, because 32 − 31 = 1 and 2^1 = 2
As the /number decreases by one, the number of IPs doubles.
So 192.168.0.0/30 = 2^2 = 4 IPs
/29 = 2^3 = 8 IPs
So the IP range all the way to /24 is:
- 192.168.0.0 through 192.168.0.255 (256 IPs)
If you drop below /24, then the next octet starts to change. So /16 would be 192.168.0.0 through 192.168.255.255 (65,536 IPs)
Below /16, the 168 starts to change, and so on all the way down to /0, which is 0.0.0.0 through 255.255.255.255, or all IPs
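Python's stdlib ipaddress module can verify this math:

```python
import ipaddress

for prefix in (32, 31, 24, 16):
    net = ipaddress.ip_network(f"192.168.0.0/{prefix}")
    # /32 -> 1 IP, /31 -> 2, /24 -> 256, /16 -> 65536
    print(f"/{prefix}: {net.num_addresses} IPs, {net[0]} - {net[-1]}")

assert ipaddress.ip_network("192.168.0.0/24").num_addresses == 256
assert ipaddress.ip_network("192.168.0.0/16").num_addresses == 65536
```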
Public vs Private IP
The Internet Assigned Numbers Authority (IANA) established certain blocks of IPv4 addresses for the use of private (LAN) and public (internet) addresses
Private IP can only allow certain values:
- 10.0.0.0 through 10.255.255.255 (10.0.0.0/8) in big networks
- 172.16.0.0 through 172.31.255.255 (172.16.0.0/12) AWS default VPC in that range
- 192.168.0.0 - 192.168.255.255 (192.168.0.0/16) Home networks
All other IPs are part of the public internet
Default VPC
All new AWS accounts have a default VPC
New EC2 instances are launched in this default VPC unless otherwise specified
Default VPC has internet connectivity and all EC2 instances inside it have public IPv4 addresses
We also get public and private IPv4 DNS names
VPC in AWS - IPv4
- VPC = Virtual Private Cloud
- Can have a max of 5 in an AWS region (can have more upon request though)
- Max CIDR per VPC is 5
- For each CIDR, Minimum size is /28 (16 IPs) and max is /16 (65,536 IPs)
- Since VPC is private, only the Private IPv4 ranges are allowed:
- 10.0.0.0 - 10.255.255.255 (10.0.0.0/8)
- 172.16.0.0 - 172.31.255.255 (172.16.0.0/12)
- 192.168.0.0 - 192.168.255.255 (192.168.0.0/16)
Your VPC CIDR should NOT overlap with your other networks (eg corporate network)
Subnet (IPv4)
AWS reserves 5 IP addresses in each subnet. The first 4 and the last one. These 5 IPs cannot be used by you
Example: If CIDR block is 10.0.0.0/24 then the reserved IPs are:
- 10.0.0.0 Network Address
- 10.0.0.1 for the VPC router
- 10.0.0.2 for mapping to amazon provided DNS
- 10.0.0.3 not used currently, but still reserved
- 10.0.0.255 Network Broadcast Address; AWS does not support broadcast in a VPC, so AWS doesn’t allow its use
So remember this if asked a question that designates a required number of usable IP addresses.
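A quick helper (stdlib only) for the usable-IP arithmetic. For example, if a question asks for 29 usable addresses, a /27 (32 − 5 = 27 usable) is too small and you need a /26:

```python
import ipaddress

RESERVED_PER_SUBNET = 5  # network addr, router, DNS, future use, broadcast

def usable_ips(cidr: str) -> int:
    """Usable addresses in an AWS subnet = total minus the 5 reserved IPs."""
    return ipaddress.ip_network(cidr).num_addresses - RESERVED_PER_SUBNET

assert usable_ips("10.0.0.0/24") == 251  # 256 - 5
assert usable_ips("10.0.0.0/27") == 27   # too small for 29 hosts
assert usable_ips("10.0.0.0/26") == 59   # smallest subnet that fits 29
```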
Internet Gateway
Allows resources in a VPC to connect to the internet. They don’t GIVE internet access, just allow it. Route tables must be edited to complete the access to the internet.
Only 1 VPC per internet gateway, and vice versa
Bastion Hosts
This is an EC2 instance in a public subnet that acts as a passthrough to an instance in a private subnet. So a user would connect to the bastion host EC2 instance, then from there, could SSH into the private EC2 instance
Need to make sure security groups are set up properly to allow this connection to be made
NAT Instances (outdated, but might still be on the exam)
- NAT = Network Address Translation
- Allows EC2 instances in private subnets to connect to the internet
- Must be launched in a public subnet
- Must disable EC2 setting: “Source/destination check”
- Must have elastic IP attached to it
- Route tables must be configured to route traffic from private subnets to the NAT instance
NAT Gateway
- AWS Managed NAT Instance
- Pay per hour of usage and bandwidth
- Created in a specific AZ
- Uses an Elastic IP
- Can’t be used by an EC2 within the same subnet
- No security group setup required
- Redundant in a single AZ, but for fault tolerance, must create multiple NAT Gateways across multiple AZs
- Can’t be used as a bastion host
DNS Resolution in VPC
DNS Resolution (enableDnsSupport)
- Decides if DNS resolution from Route 53 Resolver server is supported for the VPC
- True (default) means it queries the Amazon-provided DNS server at 169.254.169.253, or the reserved IP address at the base of the VPC IPv4 network range plus two (.2)
DNS Hostnames (enableDnsHostnames)
- By default it is true for the default VPC, and false for newly created VPCs
- Won’t do anything unless enableDnsSupport is true
- If True, assigns public hostname to EC2 instance if it has a public IPv4
The reason you want to do this is so that an EC2 in the public subnet can connect with the EC2 instance in the private subnet via a DNS name such as web.mycompany.private. This is called a private hosted zone
NACLs & Security Groups
- One NACL per subnet, but multiple subnets can be covered by a single NACL, new subnets are assigned the default NACL
- You define NACL rules (rule numbers range from 1 to 32,766; lower numbers have higher precedence, so if rule 100 says allow and rule 200 says deny, allow overrides the deny). A final * rule denies a request when no other rule matches
AWS recommends adding rules in increments of 100 so you have room to fit rules between other rules if needed
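NACL evaluation is “lowest rule number that matches wins, otherwise the implicit * denies”; a toy sketch:

```python
def evaluate(rules, port):
    """rules: list of (rule_number, port, action); lowest matching number wins."""
    for _, rule_port, action in sorted(rules):
        if rule_port == port:
            return action
    return "DENY"  # the implicit * rule denies when nothing matches

rules = [(100, 80, "ALLOW"), (200, 80, "DENY")]
assert evaluate(rules, 80) == "ALLOW"   # rule 100 matched first
assert evaluate(rules, 443) == "DENY"   # no match, falls through to *
```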
Default NACL
This one accepts everything on both inbound and outbound. Don’t modify this one, just create new ones
Ephemeral Ports
- For any two endpoints to establish a connection, they must use ports
- Clients connect to a defined port, (ie Port 80 for HTTP) and expect a response on an ephemeral port
- But for the clients to get a response, the client needs a port for the responding server to connect to. This port is created by the client, and is ephemeral
- Different OS has different port ranges
NACLs with Ephemeral Ports
If you have an EC2 instance in a public subnet trying to talk to an RDS database in a private subnet, and both subnets have their own NACL, the request from the EC2 to the RDS is easy: set up normal outbound rules on the public subnet and normal inbound rules on the private subnet. But for the RDS to send a response to the EC2, it needs to use the ephemeral port of the EC2. So the outbound rule on the private subnet NACL needs to include a range of ports, and the public subnet NACL needs inbound rules covering the EC2’s ephemeral port range
VPC Reachability Analyzer
Diagnostic tool that troubleshoots network issues between 2 endpoints.
doesn’t send packets, just builds a model and analyzes the configuration
VPC Peering
- Privately connect 2 VPCs using AWS network
- Must not have overlapping CIDRs
- Both VPCs must have this enabled to talk to each other
- Must update route tables in each VPC subnet to ensure the EC2 instances can talk to each other
- Can be cross account and cross region
VPC Endpoints (AWS Private Link)
- Private endpoints that allow you to connect to services within the cloud without having to go through the public internet
Two Types:
- Interface Endpoints: provision an ENI (private IP address) as an entry point (must attach a security group). This supports most AWS services
- Gateway Endpoints: Provision a gateway and must be used as a target in a route table. This supports S3 and DynamoDB
VPC Flowlogs
Capture info about IP traffic going into your interfaces:
- VPC level
- Subnet level
- ENI level
Help monitor & troubleshoot connectivity issues
Can send logs to S3 and cloudwatch logs
Site-to-site VPN
Want to connect an AWS VPC and an on-prem data center
Needs:
- Virtual Private gateway (VGW) which is a VPN concentrator on the AWS side of the connection
- Customer Gateway (CGW) which is a software app or physical device on the customer side of the connection
The customer gateway can either be public, and use its own public IP address, or private and go through a NAT device and use the public IP address of the NAT device
Once set up, you need to enable route propagation for the VPC in the route table in your subnets
Lastly, in order for you to ping EC2 instances from on-prem, make sure the security groups around your EC2 instances allow inbound traffic on the ICMP protocol
VPN CloudHub
If you have multiple locations, each with an on-prem data center, and you want them all to talk to each other as well as the AWS VPC, you can set up CloudHub that will allow this communication through the VGW and VPN. It does go over the public network of course.
Direct Connect (DX)
Provides a dedicated PRIVATE connection from a remote network to your VPC
Direct Connect Gateway
If you want to set up a direct connect to one or more VPCs in many different regions (same account), you use this
Direct Connect Connection Types
Dedicated
- 1Gbps and 10Gbps capacity
- Physical ethernet port dedicated to a customer
- Request made to AWS first then completed by AWS Direct Connect Partners
Hosted
- 50Mbps, 500Mbps, to 10Gbps
- Connection requests are made via AWS Direct Connect Partners
- Capacity can be added or removed on demand (so more flexible)
Often takes longer than a month to set up
Direct Connect encryption
The network is private, but the data is not encrypted
Use VPN alongside the direct connect for in transit encryption
Direct Connect Resiliency
High Resiliency for Critical Workloads - One connection at each of multiple locations
Maximum Resiliency for Critical workloads - Four connections across 2 locations. Achieved by separate connections terminating on separate devices in more than one location
Exposing your VPC to other VPC
Option 1: make it public
- goes through www public internet
- tough to manage access
Option 2: VPC peering
- must create many peering relations
- opens the WHOLE network
Option 3: AWS Private Link (VPC Endpoint Services)
- Most Secure and Scalable way to expose a service to many VPCs (your own, or other accounts)
- Does not require VPC peering, internet gateway, NAT, route tables etc
- Requires a network load balancer on the service VPC and ENI on the customer VPC
Transit Gateway
For having transitive peering between thousands of VPCs and on-prem data centers. Basically a centralized service that all VPCs and on-prem data centers connect to, and whoever is connected to the transit gateway can talk to each other.
- Can work cross region
- Share cross account using Resource Access Manager
- Route tables are used for limiting the connections
- Only service that supports IP Multicast
VPC Traffic Mirroring
Allows you to capture and inspect network traffic in your VPC
Example: You have a public EC2 instance being accessed, so it has inbound and outbound traffic. You can set up mirroring to also send a copy of the inbound traffic to a NLB pointing to other EC2 instances that have some sort of security software installed. Can set up filters if you don’t want ALL traffic mirrored
IPv6
IPv4 was for 4.3 billion addresses (they’ll be exhausted soon)
IPv6 can have 3.4x10^38 addresses
All are public, no private range
Format is x:x:x:x:x:x:x:x (each x is a hexadecimal group that ranges from 0000 to ffff)
IPv6 in VPC
- If enabled, your EC2 instances will get at LEAST a private internal IPv4 and a public IPv6 address.
- CANNOT disable IPv4, but can IPv6
Egress only Internet Gateways
Only used for IPv6 (similar to a NAT gateway, but for IPv6 instead of IPv4)
RPO and RTO
Recovery Point Objective - A point in time you recover to. So if you do a daily backup at 12:00pm, and disaster happens at 2:00pm, then you will lose those 2 hours worth of work
Recovery Time Objective - So the disaster happened at 2:00pm, and it took 1 hour to get back up and running, that one hour is the RTO
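The arithmetic from those two examples:

```python
from datetime import datetime

backup = datetime(2024, 1, 1, 12, 0)    # last backup at 12:00 pm
disaster = datetime(2024, 1, 1, 14, 0)  # disaster strikes at 2:00 pm
restored = datetime(2024, 1, 1, 15, 0)  # back up and running at 3:00 pm

data_loss = disaster - backup   # work lost since the last backup: 2 hours (RPO)
downtime = restored - disaster  # time to recover: 1 hour (RTO)
assert data_loss.total_seconds() == 2 * 3600
assert downtime.total_seconds() == 1 * 3600
```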
Disaster Recovery Strategies
- Backup and Restore
- Pilot Light
- Warm Standby
- Hot Site/Multi Site Approach
(in order from slowest to fastest RTO)
Backup and Restore (High RPO)
Actual backups and just use them to recreate things. Pretty cheap
Pilot Light
- A small version of the app is always running in the cloud.
- Usually just the critical components of the app (like a pilot light is to a water heater)
- Very similar to backup and restore for everything else, but since it has the critical systems already running, it’s faster
Warm Standby
- Full system is up and running, but at minimum size
- When main systems fail, the warm standby will scale to production load and Route 53 will failover to this warm standby
Multi Site/Hot Site Approach
- Lowest RTO, but very expensive.
- The backup is a fully functional duplicate of everything
Database Migration Service (DMS)
- Quickly and securely migrate databases to AWS
- The source database remains available during the migration
- Supports both homogeneous and heterogeneous migrations
- Continuous data replication
- You must create an EC2 instance to perform the replication tasks
AWS Schema Conversion Tool (SCT)
- Used when the source database and the destination database are using different engines
- You don’t need SCT if migrating the same engine
DMS Continuous Replication
Example: You have an Oracle DB on-prem and want to migrate to a MySQL DB on RDS. You set up a server on-prem with SCT installed, and that converts the oracle schema to mySQL and puts that into the RDS DB. You then have an EC2 instance with the DMS software installed do the actual data migration
DataSync
- Used to move large amount of data from on-prem to AWS.
- Can sync to S3 (including glacier), EFS, FSx for Windows
- Moves via the NFS or SMB protocols
- Can be scheduled to replicate hourly, daily, weekly
- Leverage the DataSync agent to connect the systems
- Can setup bandwidth limit
AWS Backup
- Fully managed backup service
- Supports FSx, EFS, DynamoDB, EC2, EBS, RDS (all engines), Aurora, Storage Gateway (Volume Gateway)
- Supports cross region and cross account backups
- On-demand or scheduled
- Can create a backup based on tags
S3 Events
Can only trigger SNS, SQS, or Lambda
High Performance Computing (HPC)
Data Management & Transfer
- AWS Direct Connect: Move GB/s of data to the cloud over a private secure network
- Snowball & Snowmobile: Move PB of data to the cloud
- AWS DataSync: move large amount of data between on-prem and S3, EFS, FSx for Windows
High Performance Computing (HPC)
Compute & Networking
- EC2 instances: CPU or GPU optimized. Spot instances/spot fleets for cost savings and auto scaling
- EC2 placement group type “Cluster” for good network performance
- EC2 enhanced networking (SR-IOV): higher bandwidth, higher packet per second (PPS), lower latency
} Option 1: Elastic Network Adapter (ENA), up to 100 Gbps
} Option 2: Intel 82599 VF, up to 10 Gbps (legacy)
- Elastic Fabric Adapter (EFA): improved ENA for HPC, ONLY for Linux. Great for inter-node communication and tightly coupled workloads; leverages the Message Passing Interface (MPI) standard, which bypasses the Linux OS for lower latency and better reliability
High Performance Computing (HPC)
Storage
Instance attached storage:
- EBS
- Instance Store
Network Storage:
- S3
- EFS
- FSx for Lustre (HPC optimized, backed by S3)
High Performance Computing (HPC)
Automation and Orchestration
AWS Batch
- supports multi-node parallel jobs, which enables you to run single jobs that span multiple EC2 instances
- Easy to schedule
AWS Parallel Cluster
- cluster management tool
- automates creation of VPC, Subnet, cluster type, instance type
CICD
Continuous Integration/Continuous Delivery. A way for developers to continuously write code and push it to a test server; each time a version passes on the test server, it is pushed to the live server
CloudFormation
- Infrastructure as code
- Create a code that automates the provisioning of everything you want, exactly as you want
- You upload a template to S3, then reference it in cloudformation.
- Templates are YAML files
- Stack Sets allow you to create across multiple regions or accounts
Step Functions
- Build serverless visual workflows to orchestrate your Lambda functions (this Lambda triggers that Lambda, which triggers another Lambda)
- STATE MACHINE
SWF
- Simple Workflow Service
- Almost the same as Step Functions, but for EC2 instead of Lambda, and not serverless
- Old
EMR (Elastic Map Reduce)
- Helps create Hadoop clusters (big data) to analyze and process vast amount of data
- can be made of hundreds of EC2 instances
- takes care of all provisioning for you
OpsWorks
- Chef and Puppet help you perform server configuration automatically and other repetitive actions
- They work great with EC2 and on-prem VM
- OpsWorks is just managed Chef and Puppet
AWS Workspaces
- Managed, secure, Cloud Desktop
- On demand (pay per usage)
- Secure, encrypted, network isolation
- Integrated with Microsoft Active Directory
- Replaces VDI (Virtual Desktop Infrastructure)
AppSync
- Syncs data across mobile and web apps in real time
- GraphQL
Cost Explorer
Visualize, understand and manage your AWS costs and usage over time
Choose a savings plan
Forecast up to 12 months in the future