Services Flashcards
IAM
Explain:
- IAM roles
- IAM User
- IAM Credentials Report
- IAM Access Advisor
Identity and access management.
- IAM Role: some services need to perform actions on your behalf. An IAM Role is like a user, but intended to be used by an AWS service.
- IAM User: an identity with long-term credentials (console password and/or access keys), intended for a person or application.
- IAM Credentials Report: lists all users and the state of their credentials
- IAM Access Advisor: shows the service-level permissions granted to a user and when those services were last accessed
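A minimal boto3 sketch of pulling the credentials report (the IAM API calls are real; the polling loop and report handling are just illustrative):

```python
import time
import boto3

iam = boto3.client("iam")

# Kick off generation of the account-wide credentials report,
# then poll until it is ready and download it as CSV.
iam.generate_credential_report()
while True:
    try:
        report = iam.get_credential_report()
        break
    except iam.exceptions.CredentialReportNotReadyException:
        time.sleep(2)

print(report["Content"].decode("utf-8"))  # CSV: one row per IAM user
```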
ENI
Explain:
- What is an ENI bound to?
- What attributes can an ENI have?
Elastic network interface.
- Bound to an AZ
- Can have the following attributes:
○ Each ENI can have one private IPv4, one or more IPv6
○ One Elastic IP (IPv4) per private IPv4
○ One public IPv4
○ One or more security groups
○ A MAC address
AMI
Explain:
- What is an AMI bound to?
Amazon machine image.
Per region - need to copy it to new region if you want to transfer.
ID is region-locked; copying to another region creates a new AMI ID
EBS
- What is an EBS bound to? How can it be moved?
- How many attachments?
- How is it provisioned?
- How is it deleted?
Elastic Block Store
- Typically one EBS volume can be attached to one EC2 instance, but there's a multi-attach feature for some volumes (the provisioned-IOPS SSD ones)
- Bound to a specific AZ
- To move a volume across AZs (or regions) you need to snapshot it. It's not necessary to detach to snapshot, but it's recommended.
- GB and IOPS must be provisioned in advance
- "Delete on termination" attribute deletes a volume when its attached instance is terminated. This is the default for the root volume.
- Lowest latency compared to other options
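A boto3 sketch of the snapshot-based move described above (volume ID, AZs and regions are made up):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# 1. Snapshot the source volume (no need to detach, but recommended for consistency).
snap = ec2.create_snapshot(VolumeId="vol-0123456789abcdef0",
                           Description="move to another AZ")
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

# 2. Create a new volume from the snapshot in the target AZ.
ec2.create_volume(SnapshotId=snap["SnapshotId"], AvailabilityZone="us-east-1b")

# For a cross-region move, copy the snapshot into the destination region first.
ec2_eu = boto3.client("ec2", region_name="eu-west-1")
ec2_eu.copy_snapshot(SourceRegion="us-east-1", SourceSnapshotId=snap["SnapshotId"])
```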
EFS
- What protocol?
- How many attachments?
- Advantages?
- Classes
- HA options?
Elastic File System
Can be mounted on many EC2. Multi-AZ.
Highly available, scalable, expensive (3x cost of gp2 drive), pay per use
Use case: content management, web serving, data sharing, wordpress
Uses NFSv4.1 protocol
Scale:
- 1000s of concurrent NFS clients, 10+ GB/s throughput
- Grow to petabyte-scale network file system, automatically
Advantages: can mount to many instances, scales capacity automatically.
Has storage classes (Standard, Infrequent Access) with lifecycle management.
HA: Either “classic” or “one zone”. Classic is automatically replicated across multiple AZ so there’s more uptime, but more expensive.
____________
Performance mode (set at EFS creation time)
- General purpose (default): latency-sensitive use cases
- Max I/O: higher latency, higher throughput, highly parallel (big data, media processing)
Throughput mode
- Bursting (based on current size)
- Provisioned: set your throughput regardless of storage size
Storage tier (lifecycle management - move after n days)
- Standard: for frequently accessed files
- Infrequent Access: cost to retrieve files, lower price to store
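A minimal boto3 sketch of creating a file system with these settings plus a lifecycle policy (names and values are illustrative):

```python
import boto3

efs = boto3.client("efs")

# Performance mode is fixed at creation time; throughput mode can be changed later.
fs = efs.create_file_system(
    CreationToken="my-efs",              # idempotency token (made-up name)
    PerformanceMode="generalPurpose",    # or "maxIO"
    ThroughputMode="provisioned",        # or "bursting"
    ProvisionedThroughputInMibps=128,
    Encrypted=True,
)

# Lifecycle management: transition files to Infrequent Access after 30 days.
efs.put_lifecycle_configuration(
    FileSystemId=fs["FileSystemId"],
    LifecyclePolicies=[{"TransitionToIA": "AFTER_30_DAYS"}],
)
```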
CLB
Classic Load Balancer
- Fixed hostname
- TCP or HTTP health checks
- No websockets, no http/2, no path-based routing, no multiple ports on a single instance, bunch of other small features
Really the only reasons to use one are TCP/SSL listeners, support for EC2-Classic, and support for sticky sessions using application-generated cookies (in ALB, cookies are generated by the load balancer)
ALB
- Fixed IP?
- How does routing work?
- What happens to instances when scaling in?
- What target groups?
Application Load Balancer
- Layer 7 - load balancing to multiple HTTP applications across machines ("target groups")
- Can also load balance to multiple apps on the same machine (like with containers)
- Support for HTTP/2, websockets, redirects
- Routing tables to different target groups:
○ Based on path in URL
○ Based on hostname in URL
○ Based on query strings or headers
- Great fit for microservices and container-based applications
- Port mapping feature to redirect to a dynamic port in ECS (Elastic Container Service) - the Classic Load Balancer sucks for this; you'd need multiple ones per application
- Target groups:
○ EC2 instances (can be managed by ASG)
○ ECS tasks (managed by ECS itself)
○ Lambda functions - the HTTP request is translated into a JSON event
○ IP addresses (must be private IPs)
- One ALB can route to multiple target groups
- Health checks are at the target group level
- You get a fixed hostname (like classic)
- Applications don't see the IP of the client directly; it's sent in the X-Forwarded-For header
○ Also X-Forwarded-Port and X-Forwarded-Proto (protocol)
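A small boto3 sketch of the path-/host-based routing described above (the ARNs are placeholders):

```python
import boto3

elbv2 = boto3.client("elbv2")

# Hypothetical ARNs: send api.example.com/api/* traffic to a dedicated target group.
listener_arn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-alb/..."
api_tg_arn = "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/api-tg/..."

elbv2.create_rule(
    ListenerArn=listener_arn,
    Priority=10,
    Conditions=[
        {"Field": "host-header", "HostHeaderConfig": {"Values": ["api.example.com"]}},
        {"Field": "path-pattern", "PathPatternConfig": {"Values": ["/api/*"]}},
    ],
    Actions=[{"Type": "forward", "TargetGroupArn": api_tg_arn}],
)
```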
NLB
- Fixed IP?
- Advantages?
Network Load Balancer
- Layer 4 (TCP / UDP)
- Extremely high performance
- Less latency than ALB (like 100ms vs 400ms)
- Unlike ALB, has one static IP per AZ and supports assigning elastic IP (helpful for whitelisting specific IP)
You want to use a NLB if you’re dealing with TCP/UDP traffic or you want extreme performance
ACM
AWS Certificate Manager
Can create certs through it or upload your own
RDS
Explain:
- Data retention
- Do you need capacity / instance type on creation?
- Read Replicas
- HA
Relational Database Service
Automated Backups are automatically enabled in RDS. Daily full backup of the database (during the maintenance window); transaction logs are backed up every 5 minutes. Can restore to any point in time (oldest backup to 5 minutes ago). 7-day retention by default, configurable up to 35 days.
Manual Snapshots are triggered by the user, and can be kept indefinitely.
Storage Autoscaling: don’t need to manually scale database storage; can be done automatically. Just set a maximum storage threshold, and it will upgrade after crossing that point.
Automatically modify storage if:
- Free storage less than 10% of allocated storage
- Low-storage lasts at least 5 minutes
- 6 hours have passed since last modification
Read Replicas
- Up to 5 read replicas
- Within AZ, Cross AZ or Cross Region
- Replication is ASYNC, so reads are eventually consistent
- Replicas can be promoted to their own DB
- Applications must update the connection string to leverage read replicas
- Use cases:
- Classic use case is to create a read replica of a production database for analytics (this way you don’t add additional load to your production application)
- Obviously, read replicas are SELECT statements only
- Again, there is a network cost when data goes from one AZ to another.
Multi AZ
- Synchronous replication. Your RDS is referenced by a DNS name; if there’s a failure, the DNS will point to the replica instead.
- Doesn’t help with scaling, it’s just for HA.
- Read replicas can be set up as multi AZ for disaster recovery.
- Single-AZ to Multi-AZ:
- Zero downtime operation
- Just click on “modify” for the database to set it to Multi-AZ
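A boto3 sketch of both operations above (identifiers are made up): creating a cross-AZ read replica, then converting the primary to Multi-AZ.

```python
import boto3

rds = boto3.client("rds")

# Cross-AZ read replica (async replication), e.g. for analytics traffic.
rds.create_db_instance_read_replica(
    DBInstanceIdentifier="prod-db-analytics",
    SourceDBInstanceIdentifier="prod-db",
    DBInstanceClass="db.r5.large",
    AvailabilityZone="us-east-1b",
)

# Single-AZ to Multi-AZ (synchronous standby) - a zero-downtime modification.
rds.modify_db_instance(
    DBInstanceIdentifier="prod-db",
    MultiAZ=True,
    ApplyImmediately=True,
)
```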
Aurora
Explain:
- Data retention
- Do you need capacity / instance type on creation?
- Read Replicas
- Failover logic
- Scaling / HA options
- Custom Endpoints
- Serverless
- Machine learning
Proprietary, “AWS cloud optimized” SQL database compatible with Postgres and MySQL drivers.
- 3-5x more performant than Postgres or MySQL
- Storage automatically grows in increments of 10GB, up to 64TB
- Can have 15 replicas (MySQL only has 5) and the replication process is faster (sub-10ms replica lag)
- Instantaneous failover. Always HA.
○ If it's a single instance, it will try to recreate in the same AZ as the original instance.
○ If it has replicas in a different AZ, Aurora changes the CNAME to point to the healthy instance
○ If serverless (or its AZ) becomes unavailable, it will attempt to recreate in a different AZ.
- Costs 20% more than RDS, but is more efficient.
- 6 copies of your data across 3 AZ:
○ 4 copies out of 6 needed for writes
○ 3 copies out of 6 for reads
○ Self-healing with peer-to-peer replication
○ Storage is striped across 100s of volumes
- One Aurora instance takes writes (master)
- Automated failover in less than 30s
- Master + up to 15 read replicas serve reads
- Support for cross-region replication
- Writer endpoint points to the master
- Reader endpoint does load balancing at the connection level. When you access the reader endpoint, you're connected to one of the replicas.
- Backtrack: restore data at any point in time without using backups
- Encryption methods and rules are exactly the same as with RDS
Aurora - Advanced Topics
- Auto Scaling: you can enable this to automatically scale up replicas if existing ones have high CPU usage
- Custom Endpoints: define a subset of Aurora instances as a custom endpoint. Useful if you have some replicas that are highly performant instances, and you want to run analytical queries against them.
○ The reader endpoint is generally not used after defining custom endpoints.
- Serverless:
○ Automated DB instantiation and auto-scaling based on actual usage
○ Good for infrequent, intermittent or unpredictable workloads
○ No capacity planning needed; pay per second, can be more cost-effective
- Multi-master:
○ In case you want immediate failover for the write node
○ Every node does R/W instead of promoting a RR as the new master
- Global Database:
○ Simple approach is to set up cross-region read replicas.
○ Recommended approach is to use Aurora Global Database:
§ 1 primary region (read/write)
§ Up to 5 secondary (read-only) regions, replication lag is less than 1 second
§ Up to 16 read replicas per secondary region
§ Helps for decreasing latency
§ Promoting another region (disaster recovery) has an RTO of less than a minute.
- Machine Learning:
○ Enables you to add ML-based predictions to your applications via SQL
○ Simple, optimized, and secure integration between Aurora and AWS ML services
○ Supported services:
§ Amazon SageMaker (use with any ML model)
§ Amazon Comprehend (for sentiment analysis)
○ Don't need to have ML experience
○ Use cases: fraud detection, ads targeting, sentiment analysis, product recommendations
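For the custom-endpoint idea above, a boto3 sketch (cluster and instance names are made up):

```python
import boto3

rds = boto3.client("rds")

# Group only the beefy replicas behind a custom READER endpoint for analytics queries.
rds.create_db_cluster_endpoint(
    DBClusterIdentifier="my-aurora-cluster",
    DBClusterEndpointIdentifier="analytics",
    EndpointType="READER",
    StaticMembers=["my-aurora-replica-3", "my-aurora-replica-4"],
)
# Analytical clients then connect to the returned custom endpoint
# instead of the cluster's general reader endpoint.
```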
ElastiCache
- How is it provisioned?
- Compare services and their HA and backup options
- How does authentication work?
ElastiCache
- specify EC2 instance type on launch
- Managed Redis or Memcached
- Can’t be toggled with a button; requires heavy application code changes
- Should be obvious, but it works like a typical cache: the application hits ElastiCache first; on a cache miss, it retrieves the data from RDS and stores it in ElastiCache
- Need to implement your own cache invalidation strategy (e.g. a TTL)
- Common use case is storing session data
- Redis vs Memcached
- Redis: Multi-AZ with auto-failover. Read replicas to scale reads and have high availability. Data durability using AOF (append only file) persistence. Backup and restore.
- Memcached: multi-node for partitioning (sharding), no HA, no persistence, no backup and restore, multi-threaded architecture
- The caches themselves do not support IAM authentication; IAM policies on ElastiCache are only for AWS API-level security
- Redis AUTH
- Can set a password/token when you create a Redis cluster
- Supports SSL in-flight encryption
- Memcached
- Supports SASL-based authentication
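A boto3 sketch of launching a Redis replication group with AUTH + encryption (the token, names, and sizes are placeholders):

```python
import boto3

ec = boto3.client("elasticache")

# Redis with an AUTH token requires in-flight (TLS) encryption to be enabled.
ec.create_replication_group(
    ReplicationGroupId="session-cache",
    ReplicationGroupDescription="session data cache",
    Engine="redis",
    CacheNodeType="cache.t3.micro",     # EC2-style instance type chosen at launch
    NumCacheClusters=2,                  # primary + 1 replica
    AutomaticFailoverEnabled=True,
    MultiAZEnabled=True,
    TransitEncryptionEnabled=True,
    AtRestEncryptionEnabled=True,
    AuthToken="replace-with-a-long-random-secret",
)
```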
Route53
- CNAME vs Alias
- 7 routing policies
- Simple:
○ Use when you need to redirect to a single resource
○ You can’t attach health checks to simple routing policy
○ If multiple values are returned, a random one is chosen by the client
- Weighted:
○ Control the % of requests that go to a specific endpoint
○ Use case: test 1% of traffic on new app version
○ Helpful to split traffic between two regions
○ Can be associated with health checks
- Latency:
○ One of the most useful routing policies
○ Redirect to the server that has the least latency close to us
○ Super helpful when latency of users is a priority
○ Latency is evaluated in terms of user to designated AWS Region
§ Germany may be directed to the US (if that's the lowest latency)
- Failover:
○ Can only have one primary, and one secondary
○ Primary record must be associated with a health check
- Geolocation:
○ Different from latency based!
○ This is routing based on user location
○ Here we specify: traffic from a given location should go to this specific IP
○ Should create a “default” policy for when there’s no match on location.
- Geoproximity:
○ Route traffic to your resource based on the geographic location of users and resources
○ Ability to shift more traffic to resources based on the defined bias
○ To change the size of the geographic region, specify bias values:
§ To expand (1 to 99) - more traffic to the resource
§ To shrink (-1 to -99) - less traffic to the resource
○ Resources can be:
§ AWS resources (specify AWS region)
§ Non-AWS resources (specify lat/long)
○ You must use Route 53 Traffic Flow (advanced) to use this feature
○ For the exam, just know that it’s useful for shifting traffic from one region to another by changing the bias
- Multi-value:
○ Use when routing traffic to multiple resources
○ Want to associate health checks with records
○ Up to 8 healthy records are returned for each multi value query
○ Multivalue is not a substitute for ELB. Honestly, there seem to be very few reasons to use multi-value instead of a load balancer.
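To make the weighted policy above concrete, a boto3 sketch that splits traffic 99/1 between two record sets (zone ID, name, and IPs are made up):

```python
import boto3

r53 = boto3.client("route53")

# Two weighted A records with the same name; weights control the traffic split.
for set_id, weight, ip in [("current", 99, "203.0.113.10"), ("canary", 1, "203.0.113.20")]:
    r53.change_resource_record_sets(
        HostedZoneId="Z123EXAMPLE",
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "app.example.com",
                "Type": "A",
                "SetIdentifier": set_id,   # distinguishes records sharing a name
                "Weight": weight,
                "TTL": 60,
                "ResourceRecords": [{"Value": ip}],
            },
        }]},
    )
```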
CloudFront
- What is it?
- What origins does it support?
- CF vs S3 CRR?
- Signed URL vs Signed Cookies vs s3 pre-signed url
- What is OAI?
CloudFront
- CDN
- Improves read performance; content is cached at the edge
- 216+ edge locations
- DDOS protection, integration with shield, AWS Web Application Firewall
- Can expose external HTTPS and can talk to internal HTTPS backends
- Origins:
○ S3 bucket
§ Distributing files and caching them at the edge
§ Enhanced security with CloudFront Origin Access Identity (OAI)
□ Can use this to restrict s3 access to CloudFront (e.g. you only want files to be accessed through CloudFront, not s3 directly)
§ Can be used as an ingress to upload files to s3
○ Custom Origin (HTTP)
§ ALB
§ EC2 instance
§ S3 website (must first enable bucket as static website)
§ Any HTTP backend you want
- Geo Restriction: can whitelist or blacklist countries
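A sketch of the OAI setup mentioned above, assuming boto3 and a hypothetical bucket name; the bucket policy uses the documented legacy OAI principal form:

```python
import json
import boto3

cf = boto3.client("cloudfront")
s3 = boto3.client("s3")

# Create the Origin Access Identity the distribution will use to reach the bucket.
oai = cf.create_cloud_front_origin_access_identity(
    CloudFrontOriginAccessIdentityConfig={
        "CallerReference": "my-site-oai",   # idempotency token
        "Comment": "OAI for my-site-bucket",
    }
)
oai_id = oai["CloudFrontOriginAccessIdentity"]["Id"]

# Bucket policy that only allows reads via the OAI, so direct S3 access can be blocked.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {oai_id}"},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-site-bucket/*",
    }],
}
s3.put_bucket_policy(Bucket="my-site-bucket", Policy=json.dumps(policy))
```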
CloudFront vs S3 Cross Region Replication
CloudFront is a global edge network, where files are cached for a TTL (maybe a day). Ideal for static content that must be available everywhere.
S3 CRR must be set up for every region you want replicated to. Real-time file updating, read only. Great for dynamic content that needs to be available at low-latency in few regions.
Signed URL / Signed Cookies
- Attach a policy with:
○ URL expiration
○ IP ranges to access the data from
○ Trusted signers (which AWS accounts can create signed URLs)
- How long should the URL be valid for?
○ Shared content (movie, music): make it short (a few minutes)
○ Private content (private to the user): you can make it last for years
- Signed URL = access to individual files (one signed URL per file)
- Signed Cookies = access to multiple files (one signed cookie for many files)
It works like so: user does authn/authz with application, application requests a signed url or cookie from aws, application returns url/cookie to user, then user can use that to make requests to aws directly.
CloudFront signed URL vs s3 pre-signed URL
CloudFront URL:
- Allow access to a path, no matter the origin (so it’s useful for any http/https connection, not just s3)
- Account wide key-pair, only root can manage
- Filter by path, IP, date, expiration
- Leverage caching features of CloudFront
S3 pre-signed URL:
- Issue a request as the person who pre-signed the URL
- Because of this, it uses the IAM key of the signing IAM principal
- Limited lifetime
In CloudFront, a signed URL allows access to a path. Therefore, if the user has a valid signature, they can access it, no matter the origin.
In S3, a pre-signed URL issues a request as the signing user. When you sign a request, you provide IAM credentials, so accessing a pre-signed URL has the same effect as if that user had made the request themselves.
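A minimal sketch of the S3 side (bucket and key are made up); CloudFront signed URLs instead go through botocore's CloudFrontSigner helper plus the account key pair, which needs more setup than fits here:

```python
import boto3

s3 = boto3.client("s3")

# The URL executes with the permissions of the IAM principal whose credentials
# signed it, and it expires after ExpiresIn seconds.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-private-bucket", "Key": "reports/2021.pdf"},
    ExpiresIn=3600,  # 1 hour
)
print(url)
```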
CloudFront (part 2)
- pricing
- multiple origin?
- origin groups
- field level encryption
CloudFront Pricing
- Edge locations all over the world
- The cost of data out per edge location varies
- You can reduce the number of edge locations for cost reduction
- Three price classes:
○ All: all regions, best performance
○ 200: most regions, but excludes the most expensive regions
○ 100: only the least expensive regions
Multiple Origin
- To route to different kinds of origins based on the content type
- Based on path pattern:
○ /images/*
○ /api/*
○ /*
Origin Groups
- To increase HA and do failover
- Origin group: one primary and one secondary origin
- If the primary origin fails, the second one is used
- Example use case: s3 buckets with cross-region replication. If one is down, use the secondary one.
Field Level Encryption
- Protect user sensitive information through application stack
- Adds an additional layer of security along with HTTPS
- Sensitive information encrypted at the edge close to the user
- Uses asymmetric encryption
- Usage:
○ Specify set of fields in POST requests that you want to be encrypted (up to 10 fields)
○ Specify the public key to encrypt them
○ The edge location will encrypt the fields before that data is sent to any other AWS service
Global Accelerator
- what is it?
- how does it work?
- What services does it work with?
- Benefits
- Security
- GA vs CloudFront
- Unicast IP: one server holds one IP address
- Anycast IP: all servers hold the same IP address and the client is routed to the nearest one
- Leverage the AWS internal network to route to your application
- 2 anycast IP are created for your application
- The anycast IPs send traffic directly to edge locations
- The edge locations send the traffic to your application
- Works with elastic IP, EC2 instances, ALB, NLB, public or private
- Consistent Performance
○ Intelligent routing to lowest latency and fast regional failover
○ No issue with client cache (because the IP doesn’t change)
○ Internal AWS network
- Health checks:
○ Global Accelerator performs a health check of your applications
○ Helps make your application global (failover less than 1 minute for unhealthy)
○ Great for disaster recovery (thanks to the health checks)
- Security:
○ Only 2 external IP need to be whitelisted
○ Automatically get DDOS protection thanks to AWS Shield
GA vs CloudFront
- Both use the AWS global network and its edge locations around the world
- Both services integrate with AWS shield
- CloudFront
○ Improves performance for both cacheable content (such as images and videos)
○ Dynamic content (such as API acceleration and dynamic site delivery)
○ Content is served at the edge
- Global Accelerator:
○ Improves performance for a wide range of applications over TCP or UDP
○ Proxying packets at the edge to applications running in one or more AWS regions
○ Good fit for non-HTTP use cases such as gaming (UDP), IoT (MQTT), or VoIP
○ Also good for HTTP use cases that require static IP addresses
AWS Snow
- three types (and their subtypes), use cases, storage limits
- what is edge computing?
- OpsHub?
AWS Snow
- Highly-secure, portable devices to collect and process data at the edge, and migrate data into and out of AWS
- Data Migration: Snowcone, Snowball edge, snowmobile
- Edge computing: snowcone, snowball edge
- As a rule of thumb, if it takes more than a week to transfer over a network, use Snowball devices!
- Three types:
Snowball Edge
○ Physical data transport solution: move TBs or PBs of data in or out of AWS
○ Alternative to moving data over the network (and paying network fees)
○ Pay per data transfer job
○ Provide block storage and S3-compatible object storage
○ Snowball Edge Storage Optimized
§ 80 GiB of RAM
§ 80 TB of HDD capacity for block volume and s3 compatible object storage
§ Object storage clustering available
○ Snowball Edge Compute Optimized
§ 208 GiB of RAM
§ 42TB of HDD capacity
§ Optional GPU (useful for video processing or ML)
○ Use cases: large data cloud migrations, DC decommission, disaster recovery
Snowcone
○ Small, portable computing, anywhere, rugged and secure, withstands harsh environments (desert, underwater)
○ Light (4.5 lbs)
○ Used for edge computing, storage, and data transfer
○ 8TBs of usable storage
○ Use snowcone where snowball does not fit (space-constrained environment); can even be carried by drone
○ USB-C powered. Must provide own battery/cables.
○ Can be sent back to AWS offline, or connect it to internet and use AWS DataSync to send data
Snowmobile
○ It’s an actual goddamn truck
○ Transfer exabytes of data (1M TB)
○ Each snowmobile has 100 PB of capacity (can use multiple in parallel)
○ High security: temperature controlled, GPS, 24/7 video surveillance
○ Better than snowball if you transfer more than 10 PB
- Edge computing:
- Process data while it’s being created on an edge location (truck on the road, ship on the sea, mining station underground)
- These locations may have limited internet access, limited computing power
- We set up a Snowball edge / snowcone device to do edge computing
- Use cases: preprocess data, machine learning at the edge, transcoding media streams
- Eventually (if need be) we can ship back the device to AWS (for transferring data)
- Both Snowball and Snowcone can run EC2 instances or AWS Lambda functions (using AWS IoT Greengrass)
- Long-term deployment options: 1 and 3 years discounted pricing
- AWS OpsHub
- Historically, to use Snow Family devices, you needed a CLI
- Today, you can use AWS OpsHub (software you install on your computer / laptop) to manage your snow family device
Storage Gateway
- three use cases
- describe the three types
- hardware appliance?
- “Hybrid Cloud”
- Can be due to:
- Long cloud migrations
- Security requirements
- Compliance requirements
- IT strategy
- S3 is a proprietary storage technology (unlike EFS / NFS), so how do you expose the S3 data on-prem? Storage Gateway!
- Use cases: DR, backup and restore, tiered storage
- Three types:
- File Gateway
- Volume Gateway
- Tape Gateway
- File Gateway
- Configured S3 buckets are accessible using the NFS and SMB protocol
- Supports S3 standard, S3 IA, S3 One Zone IA
- Bucket access using IAM roles for each File Gateway
- Most recently used data is cached in the file gateway
- Can be mounted on many servers
- Integrated with Active Directory (AD) for user authentication
- Volume Gateway
- Block storage using iSCSI protocol backed by S3
- Backed by EBS snapshots which can help restore on-premises volumes
- Cached volumes: low latency access to most recent data
- Stored volumes: entire dataset is on premise, scheduled backups to S3
- Useful if you need low-latency access to entire dataset
- Tape Gateway
- Virtual Tape Library (VTL) backed by Amazon S3 and Glacier
- Back up data using existing tape-based processes (and iSCSI interface)
- Works with leading backup software vendors
- Hardware Appliance
- Using storage gateway means you need on-prem virtualization
- Otherwise, you can use a Storage Gateway Hardware Appliance
- Works with FG, VG, TG
- Has the required CPU, memory, network, SSD cache resources
- Helpful for daily NFS backups in small data centers
- For exam:
- On-prem data to the cloud -> think storage gateway
- File access / NFS - user auth with AD -> file gateway (backed by s3)
- Volumes / Block Storage / iSCSI -> volume gateway (backed by s3 with ebs snapshots)
- VTL tape solution / backup with iSCSI -> tape gateway (backed by s3 and glacier)
No on-premises virtualization -> hardware appliance
Amazon FSx for Windows
- authn/authz?
- use case?
- HA?
- Backups?
- EFS cannot be used with Windows
- FSx for Windows is a fully managed Windows file system share drive
- Supports SMB protocol and Windows NTFS
- Active Directory integration, ACLs, user quotas
- Built on SSD, scale up to 10s of GB/s, millions of IOPS, 100s PB of data
- Can be accessed from on-prem infra
- Can be configured as Multi-AZ (high availability)
Data backed-up daily to S3
Amazon FSx for Lustre
- use cases
- Completely unrelated, lol
- Lustre = linux + cluster. Used for large-scale computing.
- Machine learning, High Performance Computing
- Seamless integration with S3
- Can “read s3” as a file system (through FSx)
- Can write the output of the computations back to S3 (through FSx)
Can be used from on-prem servers
AWS Transfer
- use cases
- auth?
- Fully managed service for file transfers into and out of S3 or EFS using the FTP protocol
- Supported Protocols:
- AWS Transfer for FTP
- AWS Transfer for FTPS (FTP over SSL)
- AWS Transfer for SFTP (Secure FTP)
- Managed infra, scalable, reliable, HA
- Pay per provisioned endpoint per hour + data transfers in GB
- Store and manage users’ credentials within the service
Integrate with existing authentication systems (Microsoft Active Directory, LDAP, Okta, Amazon Cognito, custom)
SQS
- What is?
- Retention Period
- Security, access controls
- Message Visibility
- Dead Letter Queue
- Request-Response pattern
- Delay Queue
- FIFO Queue
- Attributes:
○ Unlimited throughput
○ Default retention of messages: 4 days, maximum of 14 days
○ Low latency (<10ms on publish and receive)
○ Limitation of 256KB per message sent
- “At least once delivery”: can occasionally have duplicate messages
- “Best effort ordering”: messages can be out of order sometimes
- Producing Messages:
○ Produced to SQS using the SDK (SendMessage API)
○ The message is persisted in SQS until a consumer deletes it
○ Example: send an order to be processed
- Consuming Messages:
○ Consumers (running on EC2 instances, servers, AWS lambda)
○ Consumer polls SQS for messages (receive up to 10 messages at a time)
○ Process the messages (example: insert the message into an RDS database)
○ Delete the messages using the DeleteMessage API
○ Great way to scale up message processing as needed is by using an ASG that manages EC2 instances which poll for messages
§ Need some sort of metric to determine when to scale out/in - CloudWatch has a metric called ApproximateNumberOfMessages (queue length). Set up a CloudWatch alarm to scale out/in.
- Security:
○ In-flight encryption using HTTPS API
○ At-rest encryption using KMS keys
○ Client-side encryption if the client wants to perform encryption/decryption itself
- Access controls:
○ IAM policies to regulate access to the SQS API
○ SQS Access Policies (similar to S3 bucket policies). Useful for cross-account access, or allowing other services to write to an SQS queue
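A bare-bones produce/consume loop with boto3 (the queue name and the process() helper are hypothetical):

```python
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="orders")["QueueUrl"]

# Producer: the message persists in SQS until a consumer deletes it.
sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": 42}')

# Consumer: long-poll up to 10 messages, process them, then delete explicitly.
resp = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20,       # long polling
    VisibilityTimeout=30,     # how long each message stays invisible to others
)
for msg in resp.get("Messages", []):
    process(msg["Body"])      # hypothetical business logic
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```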
SQS - Message Visibility Timeout
- After a message is polled by a consumer, it becomes invisible to other consumers
- By default, the "message visibility timeout" is 30 seconds
- After that timeout is over, the message is "returned" and can be picked up by other consumers (so it could be processed twice)
- If the consumer is still working on it but needs more time, there is a ChangeMessageVisibility API it can hit for more time
- The tradeoff of a high visibility timeout is that if a consumer crashes, it can take a long time for the message to be picked up again. If it's too low, you can get duplicate processing.
SQS - Dead Letter Queue
- We can set a threshold of how many times a message can go back into the queue (like if something about it is causing consumers to repeatedly fail)
- After the MaximumReceives threshold is exceeded, the message goes into a dead letter queue (DLQ). Useful for debugging a problem.
- Make sure to process the messages in the DLQ before they expire. Good to set a retention of 14 days in the DLQ.
- A DLQ is literally just an SQS queue. Personally, it seems like a good idea to set up CloudWatch metrics for that.
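A sketch of wiring up a DLQ via a redrive policy (queue names are made up):

```python
import json
import boto3

sqs = boto3.client("sqs")

# DLQ with 14-day retention so failed messages stick around long enough to debug.
dlq_url = sqs.create_queue(
    QueueName="orders-dlq",
    Attributes={"MessageRetentionPeriod": str(14 * 24 * 3600)},
)["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# After 5 failed receives, a message moves to the DLQ instead of cycling forever.
sqs.create_queue(
    QueueName="orders",
    Attributes={"RedrivePolicy": json.dumps({
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": "5",
    })},
)
```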
SQS - Request-Response Systems
- Create bidirectional flow between producers and responders. Include a “reply to” field in the request a producer sends. When the responder finishes processing that request, it sends a message to that “reply to” field (another SQS queue). The idea is that we can now scale out requesters as needed, not just scale out responders.
○ This is literally the same thing as backpressure in Elixir GenStage
- Need to know that you should use the SQS Temporary Queue Client to implement this pattern.
○ It leverages virtual queues instead of creating / deleting SQS queues (more cost-effective).
SQS - Delay Queue
- Delay a message (consumers don't see it immediately) up to 15 minutes
- Default is 0 seconds
- Can set a default at the queue level
- Can override the default on send using the DelaySeconds parameter
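A tiny sketch of both the queue-level default and the per-message override (values are arbitrary):

```python
import boto3

sqs = boto3.client("sqs")

# Queue-level default delay of 60 seconds...
queue_url = sqs.create_queue(
    QueueName="delayed-jobs",
    Attributes={"DelaySeconds": "60"},
)["QueueUrl"]

# ...which a single message can override (up to 900 seconds / 15 minutes).
sqs.send_message(QueueUrl=queue_url, MessageBody="run later", DelaySeconds=300)
```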
SQS - FIFO Queue
- Limited throughput: 300 msg/s without batching, 3,000 msg/s with
- Exactly-once send capability (by removing duplicates)
- Messages are processed in order by the consumer
SNS
- What can subscribe to SNS?
- What can publish to SNS?
- Limits?
- Security / access controls?
- FIFO
- Message filtering
- It’s just pub/sub
- The “event producer” only sends message to one SNS topic
- Can have as many “event receivers” (subscriptions) as we want to listen to the SNS topic notifications
- Each subscriber to the topic will get all the messages (note: new feature to filter messages)
- Up to 10M subscriptions per topic
- 100k topics limit
- Subscribers can be:
○ SQS
○ HTTP / HTTPS (with delivery retries)
○ Lambda
○ Emails
○ SMS Messages
○ Mobile Notifications
- SNS integrates with a lot of AWS services because many AWS services can send data directly to SNS for notifications:
○ CloudWatch (for alarms)
○ ASG notifications
○ S3 (on bucket events)
○ CloudFormation (upon state changes => failed to build, etc)
- Topic Publish (using the SDK):
○ Create a topic
○ Create a subscription (or many)
○ Publish to the topic
- Direct Publish (for mobile apps SDK):
○ Create a platform application
○ Create a platform endpoint
○ Publish to the platform endpoint
○ Works with Google GCM, Apple APNS, Amazon ADM
- Security:
○ Same as SQS
- Access Controls:
○ Same as SQS; IAM policies to regulate access to the API, and SNS Access Policies
Fan Out Pattern:
- Application: S3 events to multiple queues
○ For the same combination of event type (e.g. object create) and prefix (e.g. images/) you can only have one S3 event rule, so you need a fan-out pattern to send an event to multiple queues
- SNS can also have FIFO topics:
○ This is useful if you can’t have duplication, or if ordering is important
○ Can only have SQS FIFO queues as subscribers
- Message filtering:
- JSON policy used to filter messages sent to SNS topic’s subscriptions
- If a sub doesn’t have a filter policy, it receives every message
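A sketch of the fan-out + filtering pattern above (ARNs are placeholders; the subscribed SQS queue also needs an access policy that lets SNS deliver to it):

```python
import json
import boto3

sns = boto3.client("sns")
topic_arn = sns.create_topic(Name="orders")["TopicArn"]

# Fan-out: every subscribed queue gets a copy of each message,
# unless a filter policy narrows what it receives.
eu_queue_arn = "arn:aws:sqs:us-east-1:123456789012:eu-orders"   # hypothetical
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="sqs",
    Endpoint=eu_queue_arn,
    Attributes={"FilterPolicy": json.dumps({"region": ["eu"]})},
)

# Publish with a message attribute that the filter policy matches on.
sns.publish(
    TopicArn=topic_arn,
    Message=json.dumps({"order_id": 42}),
    MessageAttributes={"region": {"DataType": "String", "StringValue": "eu"}},
)
```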
Kinesis Data Streams
- retention period
- shards? hot / cold shards?
- producers and consumers?
- talk about data ordering for kinesis vs SQS FIFO
capture, process and store data streams
- A stream is made of shards. The more shards you have, the more throughput
- Like before, you have producers and consumers
- A record consists of a partition key and a data blob
- Billing is per shard provisioned; can have as many shards as you want
- Retention is 1 (default) to 365 days
- Ability to reprocess (replay) data
- Typically get 2MB/s per shard, but can pay extra ("enhanced fan-out") for 2MB/s per shard per consumer
- Once data is inserted into Kinesis, it can't be deleted (immutability)
- Data that shares the same partition key goes to the same shard (ordering)
- Producers: SDK, Kinesis Producer Library (KPL), Kinesis Agent
- Consumers:
○ Write your own: Kinesis Client Library (KCL), AWS SDK
○ Managed: Lambda, Firehose, Kinesis Data Analytics
________________________________
- For SQS standard, there is no ordering
- For SQS FIFO, if you don't use a Group ID, messages are consumed in the order they are sent, with only one consumer
- Say you want to scale the number of consumers, but you want messages to be "grouped" when they are related to each other. You can use a Group ID (similar to a partition key in Kinesis). The more Group IDs we have, the more consumers we can have.
- Let's assume 100 trucks, 5 Kinesis shards, 1 SQS FIFO queue:
○ Kinesis Data Streams: on average you'll have 20 trucks per shard (because of hashing). Trucks will have their data ordered within each shard. The maximum number of consumers in parallel is 5 (because we only have 5 shards). Because it's 1MB/s per shard, you get 5MB/s.
○ SQS FIFO: you will only have one FIFO queue. You will have 100 Group IDs, so you can have up to 100 consumers. Maximum throughput is 300 messages per second (or 3,000 if using batching).
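To show the partition-key/ordering point above, a small put_record sketch (stream name and truck IDs are invented):

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# Records sharing a partition key hash to the same shard, so per-truck ordering holds.
for truck_id, position in [("truck-17", (48.10, 11.60)), ("truck-17", (48.12, 11.61))]:
    kinesis.put_record(
        StreamName="truck-positions",
        PartitionKey=truck_id,     # plays the same role as an SQS FIFO Group ID
        Data=json.dumps({"truck": truck_id, "pos": position}).encode(),
    )
```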
Kinesis Data Firehose
- use cases?
- where is the data stored?
- is it real-time?
- failure / backup stuff?
- retention period
- compare streams vs firehose
load data streams into AWS data stores
- Fully managed service, no administration, automatic scaling, serverless
- Destinations:
○ AWS: S3, Redshift, ElasticSearch
○ 3rd party partners (Datadog, New Relic, Splunk, MongoDB)
○ Custom HTTP destination
- Pay for data going through Firehose
- Near real-time; 60 seconds latency minimum for non-full batches, or minimum 32MB of data at a time
- Supports many data formats, conversions, transformations, compression
- Supports custom data transformations using AWS Lambda
- Can send failed or all data to a backup S3 bucket
Streams vs Firehose:
- Streams is a streaming service for ingest at scale. Write custom code, real-time, manage scaling yourself. Data storage for 1-365 days, supports replay.
- Firehose is for loading streaming data into destinations. Fully managed, near real-time, auto-scaling, no data storage, no replay.
Kinesis Data Analytics
- what is
- use cases
analyze data streams with SQL or Apache Flink
- Real-time analytics on Kinesis streams using SQL
- Fully managed, no servers to provision, automatic scaling
- Can create streams out of the real-time queries
- Use cases:
○ Time-series analytics
○ Real-time dashboards
○ Real-time metrics
Kinesis Video Streams
capture, process and store video streams
Amazon MQ
- Managed Apache ActiveMQ
- Runs on dedicated machine, doesn’t scale as well as SQS/SNS, can run in HA with failover
- For exam, if you need to migrate existing infra that uses MQTT/AMQP (or others), then use Amazon MQ
Containers on AWS
- list the three options
- ECS (amazon’s container platform)
- Fargate (amazon’s serverless container platform)
- EKS (managed kubernetes)
ECS
- What is it? Describe some of its features.
- How does IAM work for ECS tasks?
- Data Volumes?
- How does load balancing work?
- How does scaling work?
- How do updates work?
- Launch Docker containers on AWS
- You must provision and maintain the infrastructure (EC2 instances)
- AWS takes care of starting/stopping containers for you
Has integrations with the Application Load Balancer
IAM Roles for ECS tasks:
- ECS Instance profile:
○ Used by the ECS agent
○ Makes API calls to ECS service
○ Send container logs to CloudWatch logs
○ Pull docker image from ECR
○ Reference sensitive data in Secrets Manager or SSM Parameter Store
- ECS Task Role:
○ Allow each task to have a specific role
○ Use different roles for the different ECS Services you run
Task role is defined in the task definition (task roles are a common exam question)
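A boto3 sketch of where those two roles land in a task definition (ARNs, names, and image are placeholders):

```python
import boto3

ecs = boto3.client("ecs")

# executionRoleArn: used by the ECS agent (pull from ECR, ship logs to CloudWatch).
# taskRoleArn: assumed by the application code running inside the container.
ecs.register_task_definition(
    family="orders-api",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # placeholder
    taskRoleArn="arn:aws:iam::123456789012:role/ordersApiTaskRole",          # placeholder
    containerDefinitions=[{
        "name": "orders-api",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/orders-api:latest",
        "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
    }],
)
```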
ECS Data Volumes - EFS File Systems
- Works for both EC2 tasks and Fargate tasks
- Ability to mount EFS volumes onto tasks
- Tasks launched in any AZ will be able to share the same data in the EFS volume
- Fargate + EFS = serverless + data storage without managing servers
ECS Services + Tasks
- Load balancing for EC2 launch type:
○ The ALB supports finding the right port on your EC2 instances (so you don’t need to specify it)
○ You must allow on the EC2 instance’s security group any port from the ALB security group
- Load balancing for Fargate:
○ Each task has a unique IP
○ You must allow on the ENI’s security group the task port from the ALB security group
- ECS tasks can be invoked by Event Bridge
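The security-group rule described above, sketched with boto3 (both group IDs are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# Allow the ALB's security group to reach the task port on the task/instance SG.
ec2.authorize_security_group_ingress(
    GroupId="sg-0task1234567890abc",       # SG on the task ENI (Fargate) or EC2 instance
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 8080,                   # the container/task port
        "ToPort": 8080,
        "UserIdGroupPairs": [{"GroupId": "sg-0alb1234567890abc"}],  # source = ALB SG
    }],
)
```

For the EC2 launch type with dynamic port mapping, you would open the ephemeral port range from the ALB security group instead of a single port.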
ECS Scaling
- Just use CloudWatch alarms like usual
- CloudWatch metric (ECS Service CPU Usage)
- Optionally, scale ECS Capacity Providers (adds more EC2 instances if not using Fargate)
- Could also scale on something like SQS queue length
ECS Rolling Updates
- When updating from v1 to v2, we can control how many tasks can be started and stopped, and in which order
- This is based on min and max %; the min is how many must remain running, and the max is how many over the current number can run while moving over.
○ So if that’s 50/100, it will stop half your v1 tasks and replace them with v2, then swap the remainder to v2.
○ If that’s 100/150, it will keep the existing v1 but add 50% more as v2, before swapping the remainder over and returning to 100% total.
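A boto3 sketch of setting those percentages on a service update (cluster and service names are made up):

```python
import boto3

ecs = boto3.client("ecs")

# minimumHealthyPercent / maximumPercent control the rolling update:
# 100/150 keeps all v1 tasks running while up to 50% extra v2 tasks start.
ecs.update_service(
    cluster="prod",
    service="orders-api",
    taskDefinition="orders-api:2",          # the new revision
    deploymentConfiguration={
        "minimumHealthyPercent": 100,
        "maximumPercent": 150,
    },
)
```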