First Batch Flashcards
ECR
Elastic Container Registry - a managed registry to store and manage your container images, pay for what you use
Interface Endpoint
There are two types of VPC endpoints: Interface Endpoints and Gateway Endpoints. An Interface Endpoint is an Elastic Network Interface with a private IP address from the IP address range of your subnet that serves as an entry point for traffic destined to a supported service. A Gateway Endpoint is a gateway that you specify as a target for a route in your route table for traffic destined to a supported AWS service. The following AWS services are supported: Amazon S3 and DynamoDB. You must remember that only these two services use a VPC gateway endpoint. The rest of the AWS services use VPC interface endpoints.
Gateway Endpoint
There are two types of VPC endpoints: Interface Endpoints and Gateway Endpoints. An Interface Endpoint is an Elastic Network Interface with a private IP address from the IP address range of your subnet that serves as an entry point for traffic destined to a supported service. A Gateway Endpoint is a gateway that you specify as a target for a route in your route table for traffic destined to a supported AWS service. The following AWS services are supported: Amazon S3 and DynamoDB. You must remember that only these two services use a VPC gateway endpoint. The rest of the AWS services use VPC interface endpoints.
Simple Workflow Service
This is similar to Step Functions in that it helps you coordinate and organize Lambda functions, but Simple Workflow Service is not exactly serverless: it is the older option, requires more involvement, and is not really used any more. Ask: DO YOU NEED CHILD PROCESSES?
Recovery Point Objective
This deals with Disaster Recovery: it is the point in time, shortly before the incident, at which the data was last backed up. The time between this point and the incident correlates to the ‘Data Loss’.
Recovery Time Objective
This deals with Disaster Recovery.
It is the amount of time after an incident occurs until the service is functional again. The time between the disaster and this point is known as the ‘Down Time’.
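The two windows can be sketched with plain datetime arithmetic (the function and variable names here are made up for illustration):

```python
from datetime import datetime, timedelta

def data_loss(last_backup: datetime, disaster: datetime) -> timedelta:
    # RPO window: time between the last backup and the disaster
    return disaster - last_backup

def downtime(disaster: datetime, service_restored: datetime) -> timedelta:
    # RTO window: time between the disaster and the service being restored
    return service_restored - disaster

incident = datetime(2024, 1, 1, 12, 0)
backup = datetime(2024, 1, 1, 11, 0)     # last backup, 1h before the incident
restored = datetime(2024, 1, 1, 14, 0)   # service back, 2h after the incident

print(data_loss(backup, incident))   # 1:00:00 of data lost
print(downtime(incident, restored))  # 2:00:00 of down time
```

Shrinking RPO means backing up more often; shrinking RTO means restoring service faster.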
EMR
Elastic MapReduce: helps create Hadoop clusters to analyze and process vast amounts of data; it uses EC2 instances to do this and knows how to provision and organize those instances
ECS
Elastic Container Service: This is an AWS service to manage, create and organize your containers. The basic version of this involves organizing and setting up EC2 instances. You have to delegate and create your EC2 instances, apply a ECS agent and then you can add tasks to these instances based on how much workload they can handle.
From AWS website: ‘fully managed container orchestration service that helps you easily deploy, manage, and scale containerized applications.’
ECS Agent
An ECS Agent is a piece of software that you run inside an EC2 instance so that the instance can connect to ECS, run tasks, and coordinate as needed to complete the desired outcome
From AWS website: ‘The Amazon ECS container agent allows container instances to connect to your cluster. The Amazon ECS container agent is included in the Amazon ECS-optimized AMIs, but you can also install it on any Amazon EC2 instance that supports the Amazon ECS specification. The Amazon ECS container agent is only supported on Amazon EC2 instances.’
Cache
Cache is ‘in-memory’ storage that is quick and helps maintain easy access to frequently accessed data.
ElastiCache
in-memory cache; using ElastiCache requires heavy application code changes; ElastiCache does not use IAM authentication (it uses Redis Auth); SSL is supported.
Has two engines: Redis and Memcached
RedShift
RedShift is a service used to do analysis on large datasets. RedShift is node based, with 1 to 128 nodes, and each node can hold up to 160GB of data. RedShift is incredibly quick, even faster than Athena for queries and analysis, thanks to its use of indexes.
Extra: it is not Multi-AZ; data can be imported from S3
RedShift Spectrum
From AWS website: feature within Amazon Web Services’ Redshift data warehousing service that lets a data analyst conduct fast, complex analysis on objects stored on the AWS cloud. With Redshift Spectrum, an analyst can perform SQL queries on data stored in Amazon S3 buckets.
More power without loading the data into Redshift: Spectrum queries the data where it sits in S3
Athena
Athena is a serverless analytics/query service that is very fast and powerful. It is specifically used for analysis and queries on data stored in S3.
Glue
It is an ETL (extract, transform, load) service used to prepare data for analysis; fully serverless, and often used to send data to Redshift
ECS Agent and AMIs
You can more quickly/automate ECS Agent set up by having AMIs that can incorporate them.
ELB
Elastic Load Balancer - the original version of ELB is the same thing as the Classic Load Balancer; the ELB family now also covers the ALB and NLB
CICD
Continuous Integration and Continuous Deployment
CloudFormation
So far we have done a lot of manual setup. WHAT IF OUR INFRASTRUCTURE WAS SET UP IN OUR CODE? CloudFormation is a declarative way of outlining your AWS infrastructure, ex: I want a security group, 2 EC2 instances, an EFS, etc. CloudFormation creates everything in the right order and in the correct way; it can tell us the estimated cost of each component; we can delete and recreate infrastructure quickly; stacks let us create and tear down sets of resources quickly; templates are uploaded to S3
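A minimal sketch of the declarative idea, here built as a plain Python dict in CloudFormation's JSON template shape (the resource names and properties are illustrative, not a deployable stack):

```python
import json

# Declarative template: you describe WHAT you want (a security group,
# an EC2 instance); CloudFormation figures out order and wiring.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "WebSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {"GroupDescription": "Allow HTTP"},
        },
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"InstanceType": "t2.micro"},
        },
    },
}

# This JSON is what you would upload (CloudFormation stores it in S3)
print(json.dumps(template, indent=2))
```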
Data Sync
Data Sync is a service used to quickly move data on to AWS.
AWS DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between on-premises storage systems and AWS Storage services, as well as between AWS Storage services. You can use DataSync to migrate active datasets to AWS, archive data to free up on-premises storage capacity, replicate data to AWS for business continuity, or transfer data to the cloud for analysis and processing. Can go to S3, EFS, FSx for Windows File Server
Database Migration Service
AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The AWS Database Migration Service can migrate your data to and from the most widely used commercial and open-source databases.
All Disaster Recovery strategies
1) Backup and Restore
2) Pilot Light
3) Warm Standby
4) Hot Site / Multi Site (these are two names for the same thing)
ElasticBeanstalk
Elastic Beanstalk is the fastest and simplest way to deploy your application on AWS. You simply use the AWS Management Console, a Git repository, or an integrated development environment (IDE) such as Eclipse or Visual Studio to upload your application, and Elastic Beanstalk automatically handles the deployment details of capacity provisioning, load balancing, auto-scaling, and application health monitoring.
CloudFront
This is a cache on the global edge network that serves cached content based on the caching style the developer selects. Additionally, CloudFront makes use of things such as Signed URLs, OAI for S3 (so the bucket can only be reached through CloudFront), and HTTP/HTTPS; if integrated with EC2, either the EC2 instances have to be public or the ALB that connects to them has to be public; and it has GEO RESTRICTION
EC2 Metrics
Basic monitoring is 5-minute resolution; detailed monitoring is 1-minute resolution
CloudWatch Features
Logs, Dashboards, Alarms, Agents, Custom Metrics
CloudWatch Logs
VPC Flow Logs, Elastic Beanstalk, Route 53, ECS
CloudWatch Custom Metrics Resolution
1/5/10/30s
CloudWatch Alarm Resolution
10/30/60s
CloudWatch connection to EC2
CloudWatch can be used to measure metrics on an EC2 instance, and when we see that it is unhealthy we can have CloudWatch recover the EC2
CloudWatch Agent
A CloudWatch agent is a program that runs on an EC2 instance and lets the EC2 connect and send data to CloudWatch so that we can process the data there
S3 Event Notification Possible Outsourcing Locations
SNS, SQS, Lambda
Elastic Load Balancers do what…
They distribute load across machines
What are some of the key offerings/services relating to EC2
- Rent EC2
- Store on EBS
- Scale the number using ASG
- Distribute load across machines using ELB
OS for EC2s
Linux, Windows, Mac OS
Network Attached EC2 Storage
EBS and EFS
EC2 user Data
Bootstrap script (configure at first launch) for EC2
From AWS Website: you have the option of passing user data to the instance that can be used to perform common automated configuration tasks and even run scripts after the instance starts. You can pass two types of user data to Amazon EC2: shell scripts and cloud-init directives.
Bootstrapping means…
launching commands when a machine starts, the script is only run once at the instance first start, the more you add the more it has to do at boot, ex: install updates, download thing from internet, install software, etc.
Security groups
- control how traffic is allowed into or out of EC2
- only contain allow rules
- can reference by IP or security group
- they are a firewall on our EC2 instances
- they regulate access to ports, authorized IP ranges, control of inbound network and control of outbound network
What do security groups regulate
they regulate access to ports, authorized IP ranges, control of inbound network and control of outbound network
Security groups can be attached to multiple instances true or false
T
Security groups are locked to a region, T or F
T
Security groups do not live outside EC2
F
Security groups cannot regulate/block based on other security groups: T or F
F
Port 22
log into an instance over SSH (Linux, EC2)
Port 21
FTP: upload files into a file share
port 80
access unsecured websites (HTTP)
port 443
access secured websites (HTTPS)
What OSs can you SSH from (into EC2 instances)
Mac, Linux, Windows >= 10
Putty is used for
Windows (all versions)
EC2 instance connect
- uses the web browser and can connect from any OS
EC2 Purchasing Options
- On Demand: short workloads, predictable pricing
- Reserved (minimum 1 year)
- Spot Instances: short workloads, cheap, but you can lose the instances
- Dedicated Hosts: book an entire physical server, control instance placement
Types of Reserved instances
Reserved: long workloads
Convertible Reserved Instances: Long workloads with flexible instances
Scheduled Reserved: reserved for recurring time windows, ex: every Thursday between 3 and 6
On Demand
Short-term and uninterrupted workloads
Reserved instances
up to 70% discount
1 year you get discount
3 year you get even more discount
purchasing options: partial up front gives savings, all up front gives even more savings
Reserve a specific instance type
good for steady state usage
convertible allows you to change the instance types
Spot Instances
up to 90% discount
can lose them if the current spot price is more than your max price
the most cost-efficient instances
batch jobs
data analysis
image processing
You declare max spot price
the hourly spot price varies based on offer and capacity
if the instance is reclaimed, you get a 2-minute grace period during which you can choose to stop or terminate it
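The reclaim rule can be sketched as a tiny predicate (a toy model of the pricing rule above, not an AWS API):

```python
def spot_interrupted(current_spot_price: float, max_price: float) -> bool:
    # A Spot Instance can be reclaimed when the current spot price
    # rises above the max price you declared; a 2-minute grace
    # period follows before the instance is stopped/terminated.
    return current_spot_price > max_price

print(spot_interrupted(0.12, 0.10))  # True: price rose above our max
print(spot_interrupted(0.08, 0.10))  # False: the instance keeps running
```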
Dedicated Host
3-year period, expensive, useful for software that has a complicated licensing model (BYOL)
companies that have strong regulatory or compliance needs
Dedicated Instances
instances running on hardware that's dedicated to you
you don’t get access to underlying hardware
per instance billing as opposed to per host billing for dedicated hosts
no control over instance placement
Spot Block
Block spot instance during a specific time frame, think without interruptions
The instance may be reclaimed only in very rare situations
The process and details of how to terminate spot Instances
Spot request: has max price, desired number of instances, launch specifications, request type, valid from, valid til
If your instance gets stopped and the request is persistent, the spot request goes back to getting you an instance
If you want to cancel a spot request, it has to be in open, active, or disabled
cancelling a spot request does not terminate the instance
you must first cancel a spot request then you terminate the associated spot instance
Spot Fleets
set of spot instances + (optional) On Demand Instances
the spot fleet will try to meet the target capacity
Strategies to allocate Spot instances: lowestPrice, diversified, capacityOptimized
Spot Fleets allows us to automatically request Spot Instances with the lowest price
GIVES US SAVINGS BECAUSE IT CHOOSES THE RIGHT SPOT INSTANCE POOL TO CHOOSE FROM
IPv4 vs IPv6
IPv4 is more common; IPv6 is newer and mainly used for IoT
EC2 Placement Groups
Cluster: instances in a low-latency group in a single AZ; if the rack fails then all instances fail; good for big data jobs that need to be done fast
Spread: instances spread across different hardware; max is 7 instances per group per AZ; because all instances are on different hardware, a single hardware failure within an AZ doesn't take the other instances down; good for critical workloads
Partition: spread instances across many different partitions within an AZ; scales to 100s of instances per group; each partition is a different rack, so each partition is safe from another partition's rack failure
Elastic Network Interface (ENI)
virtual network cards; they allow instances to connect to the network/internet
Each ENI: a primary private IPv4, one or more secondary IPv4s, one Elastic IP (IPv4)
Bound to specific AZ
can create eni independently and attach them on the fly for failover
EC2 ‘Stop’: what is the most important aspect to consider when you end a EC2 with the ‘Stop’ option
The data on disk is kept intact in the next start
EC2 ‘Terminate’: what is the most important aspect to consider when you end a EC2 instance with the ‘Terminate’ option
any EBS volume set up to be destroyed on termination (the root volume, by default) is lost; secondary volumes not marked for destruction are not destroyed
What is an EC2 Start look like, what are the steps that are taken in starting up an EC2 instance
First Start: OS boots and the EC2 User Data script is run
following starts: the OS boots up
then your application starts, caches get warmed up and that can take time
What does an EC2 Hibernate look like, what are some of the most important aspects to remember for an EC2 Hibernate
RAM is preserved
when you restart the instance boot is much faster (The OS is not stopped / restarted )
the whole RAM is written to the EBS root volume
the root EBS volume must be encrypted
the instance is then stopped, and the root volume is not deleted
Good to know: max RAM is 150GB, the root volume must be an encrypted EBS volume, it only works for On Demand and Reserved Instances, and an instance cannot be hibernated for more than 60 days
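The hibernate prerequisites above can be captured as a toy eligibility check (my own function for study purposes, not an AWS API):

```python
def can_hibernate(ram_gb: int, root_volume_encrypted: bool,
                  purchasing_option: str) -> bool:
    # Prerequisites from the card: RAM at most 150 GB, an encrypted
    # EBS root volume, and On-Demand or Reserved purchasing only.
    return (ram_gb <= 150
            and root_volume_encrypted
            and purchasing_option in ("on-demand", "reserved"))

print(can_hibernate(64, True, "on-demand"))   # True
print(can_hibernate(64, False, "on-demand"))  # False: root not encrypted
print(can_hibernate(200, True, "reserved"))   # False: too much RAM
```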
EBS
- network drive that you can attach to your EC2 instance; can only be mounted to one EC2 at a time; bound to one specific AZ; think 'network USB stick'
- since it is a network drive there might be a bit of latency; it can be detached from one EC2 instance and attached to another
- you provision capacity in advance but can increase it later
- can attach 2 EBS volumes to 1 EC2 instance
- can be left completely unattached
EBS Snapshots
make a backup of your EBS volume at a point in time,
not necessary to detach volume to do snapshot, but recommended,
can copy snapshots across AZ or region
AMI
Amazon Machine Image
customization of EC2 instance
you add your own software, config, OS
faster boot / configuration time because all your software is pre packaged
built for specific region but can be copied across regions
3 kinds: public, your own, AWS Marketplace AMI
Steps to make an AMI: launch EC2 with your configurations, stop, build AMI (create it officially) , make more EC2 from that AMI
EC2 instance store
EBS volumes are network drives with good but 'limited' performance
if you need a high-performance hardware disk, use EC2 Instance Store
better I/O; they lose their storage when stopped (ephemeral); good for buffer/cache/scratch data; risk of data loss if the hardware fails
EBS Multi Attach Family
io1/io2 family
EBS encryption
you should enable it since it has minimal impact; it is handled transparently (you have nothing to do); leverages keys from KMS
EFS
Elastic File System: managed NFS, can be mounted to many EC2, more expensive, highly available, scalable
You need to use security groups to access EFS
encryption at rest KMS
Scale: 1000s of concurrent NFS clients, high throughput
performance Mode: General purpose, Max I/O
Throughput Mode: Bursting and Provisioned
EFS Modes
Performance:
1) General Purpose
2) Max I/O
Throughput:
1) Bursting
2) Provisioned
Load Balancing
Load balancers are servers that forward traffic to multiple servers downstream
spread load
one point of access
seamlessly handle failures
enforce stickiness
high availability across AZs
Elastic Load Balancer is a….
managed load balancer
AWS guarantees that it will be working, takes care of upgrades, and provides only a few configuration knobs
easier to handle
IT DOES HEALTH CHECKS
Elastic Load Balancers Health Checks
crucial for load balancers
they enable the load balancer to know if the instances it forwards traffic to are available to ‘reply’
the health check is done on a port
checks for 200
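A toy model of the health-check rule (a target is healthy iff it answers 200 on the configured port; instance IDs below are made up):

```python
def is_healthy(status_code: int) -> bool:
    # The health check passes when the target replies with HTTP 200
    return status_code == 200

# Last health-check response per target (illustrative data)
targets = {"i-aaa": 200, "i-bbb": 503, "i-ccc": 200}

# The load balancer only forwards traffic to the healthy targets
healthy = [t for t, code in targets.items() if is_healthy(code)]
print(healthy)  # ['i-aaa', 'i-ccc']
```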
Load Balancer Security Groups
Before Load Balancers, the EC2 security group dealt with a range of IPs; now it references another security group: the security group of the Load Balancer.
Users can reach Load Balancer from anywhere, allow users to connect
EC2 should only allow data from Load Balancer
ALB
Layer 7; load balancing to multiple HTTP apps across machines (machines grouped into something called target groups)
Same thing ^^^ but on the same machine (containers)
redirects
routing tables to different target groups
ALB tables and different target groups options: what are the different ways that targeting can be set on an ALB
Path based routing: example.com/users and example.com/owners
hostname based routing:
one.example.com and other.example.com
Query based routing
example.com/users?id=123&orders=false
quick review: path, hostname, query string and headers
port mapping feature to redirect to a dynamic port in ECS
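The routing conditions above can be sketched as a toy rule evaluator (the target-group names and rules are made up for illustration, not an ALB API):

```python
def route(host: str, path: str, query: dict) -> str:
    # Hostname-based rule: one.example.com goes to its own target group
    if host == "one.example.com":
        return "tg-one"
    # Path-based rule: example.com/users/... goes to the users service
    if path.startswith("/users"):
        return "tg-users"
    # Query-string rule: ?orders=false goes to a dedicated group
    if query.get("orders") == "false":
        return "tg-no-orders"
    return "tg-default"

print(route("one.example.com", "/", {}))               # tg-one
print(route("example.com", "/users/42", {}))           # tg-users
print(route("example.com", "/", {"orders": "false"}))  # tg-no-orders
```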
ALB Target Groups
EC2, ECS tasks, Lambda functions, IP addresses
ELB Features
Sticky Sessions (same client goes to the same instance), Cross Zone Load Balancing (balance across all AZs), SSL Certificates (encrypt traffic in flight), Connection Draining (finish in-flight requests before de-registering an instance)
SSL/TLS Certificate
SSL Certificate allows traffic between your clients and your load balancer to be encrypted in transit
Secure Socket Layer
TLS is the newer version and TLS are mainly used
the data is encrypted over the public internet up to the load balancer, then once inside the private VPC it becomes unencrypted: HTTPS -> HTTP
Load balancer uses certificate, you can manage certificate using AWS certificate Manager, can create upload you own certificates
Load Balancer SSL certificates
Load balancer uses certificate, you can manage certificate using AWS certificate Manager, can create upload you own certificates
SNI
Server Name Indication: allows multiple SSL certificates on one ALB listener
ELB Connection Draining
CLB: called Connection Draining
ALB and NLB: called Deregistration Delay
can be disabled if value is set to 0
stops sending new requests to the EC2 instance
set value low if requests are short
ASG
Auto Scaling Groups: the ASG will automatically add extra EC2 instances when scaling out; it has scaling policies and scaling alarms (CloudWatch alarms), and when an alarm goes off it will scale out or in (the alarm decides which)
New rules: better auto scaling rules (THIS IS MANAGED BY EC2):
Brain dump: scaling policies can be based on CPU, network, custom metrics or a schedule
ASGs use Launch Configurations or Launch Templates; IAM roles attached to an ASG get assigned to its EC2 instances; ASGs are free; having instances under an ASG means that if something goes wrong and one is terminated it will automatically be replaced by the ASG; an ASG can terminate instances marked as unhealthy by an LB
Launch Configuration has…
AMI, instance type, EC2 User Data, EBS Volumes, Security Groups, SSH Key Pair
Scaling Policies
Customer metrics are important
Scaling cooldowns are a scaling activity happens you are in the cooldown period , during this period the ASG will not launch new instances
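A toy sketch of the cooldown rule (the 300-second default here is illustrative; the function is mine, not an AWS API):

```python
def should_scale(alarm_fired: bool, seconds_since_last_activity: int,
                 cooldown_seconds: int = 300) -> bool:
    # During the cooldown that follows a scaling activity, the ASG
    # will not launch or terminate instances even if an alarm fires.
    return alarm_fired and seconds_since_last_activity >= cooldown_seconds

print(should_scale(True, 60))    # False: still in the cooldown period
print(should_scale(True, 400))   # True: cooldown elapsed, alarm fired
print(should_scale(False, 400))  # False: no alarm, nothing to do
```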
RDS
Relational Database Service: DBs that use SQL as the query language
RDS is managed: provisioning, OS patching, continuous backups and restore, monitoring dashboards, Read Replicas for improved reads, Multi-AZ setup for DR, maintenance windows for upgrades
CANNOT SSH into the instance
Backups in RDS
Automatically enabled in RDS
Transaction logs are backed up by RDS every 5 min,
Daily full backup of the DB
7 day retention
RDS snapshots vs Backups
Snapshots are manually done by user
RDS Auto Scaling
Enable the feature
helps you increase storage on your RDS DB instance dynamically
when RDS detects you are running out of free DB storage it scales automatically
must set max storage threshold
good for unpredictable workloads
recap: enable, will auto scale, will do it for you, must set data max limit
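The idea can be sketched as a toy function (the 10% free-space trigger and 10 GB step are illustrative, not the exact AWS algorithm; the cap at the max storage threshold is the documented behavior):

```python
def autoscale_storage(allocated_gb: int, free_gb: int,
                      max_threshold_gb: int,
                      low_free_ratio: float = 0.10,
                      increment_gb: int = 10) -> int:
    # When free storage runs low, grow the volume, but never past
    # the max storage threshold that the user must set.
    if free_gb < allocated_gb * low_free_ratio:
        return min(allocated_gb + increment_gb, max_threshold_gb)
    return allocated_gb

print(autoscale_storage(100, 5, 200))   # 110: low on space, grew by one step
print(autoscale_storage(100, 50, 200))  # 100: plenty of free space, no change
print(autoscale_storage(195, 2, 200))   # 200: capped at the max threshold
```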
RDS Read Replicas
We can create up to 5 read replicas
the read replicas can be within AZ, Cross AZ, or Cross Region,
Async
Scale reads
a read replica can be promoted to its own database and will then have its own lifecycle
use case: you need to run performance-heavy analytics on RDS; create a read replica and run the analytics against it
read replicas are only for SELECT
free if the read replica is in the same region
RDS Multi AZ
It is for disaster recovery
Sync replication
one DNS name
INCREASE AVAILABILITY, IT IS A STAND BY (SO IT IS NOT READ FROM UNLESS FAILURE OF MASTER), NOT INCREASING SCALING, FOR FAILOVER ONLY
RDS from single AZ to Multi AZ
Zero downtime operation
click modify
following happens internally
- a snapshot is taken, a new standby DB is restored from the snapshot in a new AZ, synchronization is established
RDS encryption
At Rest, In flight, KMS is used, SSL, SSL certificate,
Encryption process of unencrypted to encrypted
SS = Snapshot
take a SS, copy the SS with encryption enabled, restore a new DB from the encrypted SS, migrate applications to the new DB, delete the old one
summary: SS, copy SS as encrypted, restore new DB from encrypted SS, point apps to new, delete old
Where are RDS DB usually deployed
Private subnet
RDS - IAM Auth
NEED TO STUDY. In short: IAM database authentication (MySQL and PostgreSQL) lets you connect using a short-lived authentication token obtained through IAM instead of a password.
Aurora
cloud-optimized Postgres- and MySQL-compatible DB
storage automatically grows in increments of 10GB up to 64TB
can have up to 15 read replicas
failover is instant
native HA
data is automatically replicated across 3 AZs
1 master that handles the writes
Failover in Aurora
Instant, native HA
Writer Endpoint
Think Aurora: this is where the client connects to the master to write
Reader Endpoint
Think Aurora: this endpoint load-balances client connections across the read replicas for reads
All Aurora advanced features
Auto Scaling (of read replicas),
Global (one primary region, up to 5 secondary regions; replication lag is less than 1 second; promoting another region for disaster recovery has an RTO of < 1 min),
custom endpoints (usually used to target a subset of instances, e.g. different instance types),
serverless (good for intermittent, infrequent and unpredictable workloads),
Multi-Master (this is HA for the writer node: all nodes become writers as well as readers),
Machine Learning integration
Global Aurora
- 1 primary region
- up to 5 secondary regions
- up to 16 RR per secondary regions
- helps with decreasing latency
- replication lag (data updates to secondary regions, takes up to 1 second)
- promoting another region has an RTO of < 1 minute
S3
- infinitely scaling storage
- max object size is 5TB
- versioning
- encryption: SSE-S3, SSE-KMS, SSE-C, Client Side Encryption
S3 Objects and Buckets
- objects are files, buckets are directories
- buckets must have a globally unique name
- each object (file) has a key
- the key is the FULL path
- a key can be split into a prefix and the object name (the last part)
- there are no real directories in S3,
it's just keys with long names
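Prefix/object-name splitting can be sketched in a couple of lines (the helper name and sample keys are made up):

```python
def split_key(key: str) -> tuple[str, str]:
    # S3 has no real directories: a key is just a long name.
    # Split it into the prefix and the object name (after the last '/').
    prefix, _, name = key.rpartition("/")
    return prefix, name

print(split_key("my_folder/sub/file.txt"))  # ('my_folder/sub', 'file.txt')
print(split_key("file.txt"))                # ('', 'file.txt'): no prefix
```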
S3 versioning
- versioning (enabled at bucket level, new versions, we version files)
S3 Encryption
encryption: SSE-S3, SSE-KMS, SSE-C, Client Side Encryption
SSE-S3: keys handled and managed by S3, objects encrypted server side, AES-256 encryption type, must set the header "x-amz-server-side-encryption": "AES256"
(encrypted in S3)
SSE-KMS: encryption keys handled and managed by KMS
KMS advantage is user control and audit trail object is encrypted server side (encrypted in S3)
SSE-C: server side encryption using data keys fully managed by customer, outside of AWS, HTTPS MUST BE USED, key must be provided in HTTP headers for every HTTP request made (Encrypted in S3)
Client Side encryption: clients must encrypt data themselves before sending to S3, and must decrypt the data themselves when they retrieve it
S3 exposes both endpoints
HTTP endpoint: non-encrypted
or
HTTPS endpoint: encryption in flight
most users use HTTPS
HTTPS is mandatory for SSE-C