AWS Services Flashcards
Amazon Athena
works with what service?
use case?
Serverless Query service for S3 using SQL
Point it at your S3 buckets, define a schema, and search using SQL
Pay per query.
Don’t need to ETL.
Used for analytics. Cannot write.
Amazon CloudSearch
Search Solution for Websites and Apps
Think adding a search bar to your website so users can search whatever you have there.
Features:
Free text, Boolean, and faceted search
Autocomplete suggestions
Customizable relevance ranking and query-time rank expressions
Field weighting
Geospatial search
Highlighting
Support for 34 languages
Amazon Elasticsearch Service
Deploy and run Elasticsearch as a fully managed service, rather than buying all the infrastructure and licensing yourself.
Includes storing data, running searches, and hosting dashboards.
Does allow multi AZ configs
Amazon EMR
EMR= Elastic MapReduce.
Big Data Platform and Analysis.
Basically, this is managed Hadoop.
Good for unstructured data!
Amazon Kinesis
Shard, Sequence Number, Partition Key, Firehose
Real time streaming data capture and analysis
Mnemonic: Kinesis = movement, like a stream: streaming data
Has Shards, Has Firehose. Cannot use EBS, as it does not use EC2 instances.
There is a stream (say, a stream from a physical security system). Each record in that stream carries a partition key (for example, one partition key for the heat sensor and one for the camera). Those partition keys determine which shard each record goes to. Each record gets a sequence number that is unique within its shard (i.e. each shard has its own sequence numbers: 1, 2, 3, 4, 5, etc.).
Firehose sends streaming data to AWS storage services.
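The partition key / shard / sequence number relationship can be sketched as a toy model. This is not the Kinesis API: `ToyStream`, `shard_for`, and `NUM_SHARDS` are hypothetical names, and the 1, 2, 3 sequence numbers follow the card's simplification (real sequence numbers are large increasing values). The one real detail is that Kinesis hashes partition keys with MD5 into a 128-bit key space that is split across shards.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 2          # hypothetical stream size
HASH_SPACE = 2 ** 128   # MD5 produces a 128-bit integer


def shard_for(partition_key: str) -> int:
    """Map a partition key to a shard by hashing it into an evenly split key range."""
    hashed = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return hashed * NUM_SHARDS // HASH_SPACE


class ToyStream:
    """Toy stand-in for a stream: one list of records per shard."""

    def __init__(self):
        self.shards = defaultdict(list)

    def put_record(self, partition_key, data):
        shard_id = shard_for(partition_key)
        # Simplification: count up from 1 within each shard.
        sequence_number = len(self.shards[shard_id]) + 1
        self.shards[shard_id].append((sequence_number, partition_key, data))
        return shard_id, sequence_number
```

Records with the same partition key always land in the same shard, which is why ordering is only guaranteed per shard.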
Amazon Redshift
use case?
How to query?
Used for analytics, not transactions.
Managed EC2 Data Warehouse Service.
Uses SQL for queries
Good for structured data
AQUA = Advanced Query Accelerator
Spectrum = run huge queries quickly against data that lives in S3
Amazon QuickSight
Serverless ML BI Dashboards
Is fully managed, serverless, and flexible.
Offers a pay-per-use model
Can ask BI questions in natural language.
Redshift caches repeat queries
Amazon Data Exchange
Subscribe to 3rd Party Data Sets like:
Square (the company) location of transaction data
Weather data
Stock data
COVID data
IMDB movie data
AWS Data Pipeline
Transfer data on the AWS cloud by defining, scheduling, and automating each of the tasks.
Is a managed ETL (Extract-Transform-Load) service
Is NOT serverless; it manages the creation of EC2 and EMR instances to do work, and you pay for those.
AWS Glue
Data discovery, enrichment and transfer
Is an ETL (Extract-transform-load) tool like Data Pipeline.
Is Serverless
Supports S3, RDS, Redshift, SQL, and DynamoDB
AWS Lake Formation
Set up Data Lakes quickly
Organizes data in S3, and sizes chunks for efficiency
Does some deduplication and normalization automatically
AWS Step Functions
Serverless Function Orchestration
Low-Code, visual workflow service for orchestrating AWS services
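A workflow is defined as a state machine in Amazon States Language (JSON). A minimal sketch, built here as a Python dict: the state names, the Lambda ARNs, and the `execution_order` helper are all hypothetical and only illustrate how `StartAt`, `Next`, and `End` chain states together.

```python
import json

# Hypothetical two-step workflow; the ARNs are placeholders, not real functions.
state_machine = {
    "Comment": "Hypothetical two-step workflow",
    "StartAt": "ResizeImage",
    "States": {
        "ResizeImage": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:resize-image",
            "Next": "StoreMetadata",
        },
        "StoreMetadata": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:store-metadata",
            "End": True,
        },
    },
}


def execution_order(definition):
    """Follow StartAt/Next links until a state marked End (linear workflows only)."""
    order = []
    name = definition["StartAt"]
    while True:
        order.append(name)
        state = definition["States"][name]
        if state.get("End"):
            return order
        name = state["Next"]


# The JSON text is what would be supplied as the state machine definition.
definition_json = json.dumps(state_machine, indent=2)
```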
Amazon AppFlow
Integrate 3rd party app data
Fully managed integration service to securely transfer data between Software-as-a-Service (SaaS) applications like Salesforce, Zendesk, Slack, and ServiceNow, and AWS services like Amazon S3 and Amazon Redshift.
Amazon EventBridge
Serverless Event Bus
Takes in events from 3rd party sources, and sends them to AWS services like Lambda
Amazon MQ
Diff from SQS?
Managed Message Broker Service for Apache ActiveMQ and RabbitMQ
Should be used for legacy MQ apps migrating to AWS. All new apps should use SQS.
Amazon SNS
Push or Pull-based?
Simple Notification Messaging System
Push-based: it pushes out messages to subscribers. This means it can do stuff like invoke a Lambda function.
Amazon SQS
Is it pull-based or push-based?
max retention?
Simple Queue Service Inter Component Messaging
Enables Loose coupling of applications (fewer interdependencies)
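It is pull-based. It is purely reactive: things get put into it and get pulled out of it, passively. This means that SQS itself can’t push messages out or invoke anything; consumers (including Lambda, via an event source mapping that polls the queue) have to pull from it.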
Max retention = 14 days
Max Message Size = 256KB
Has many queue types (e.g. FIFO, delay, dead-letter, temporary)
Can have duplicates unless FIFO
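The pull-based behaviour and the two limits above can be sketched as a toy in-memory queue. This is not the SQS API: `ToyQueue` and its methods are hypothetical, and the `now` parameter exists only so retention can be shown without waiting 14 days.

```python
import time
from collections import deque

MAX_MESSAGE_BYTES = 256 * 1024          # 256 KB, per the card
MAX_RETENTION_SECONDS = 14 * 24 * 3600  # 14 days, per the card


class ToyQueue:
    """Producers send, consumers pull; the queue never pushes anything itself."""

    def __init__(self):
        self._messages = deque()

    def send(self, body, now=None):
        if len(body.encode("utf-8")) > MAX_MESSAGE_BYTES:
            raise ValueError("message exceeds 256 KB")
        sent_at = time.time() if now is None else now
        self._messages.append((sent_at, body))

    def receive(self, now=None):
        """Return the oldest unexpired message, or None if the queue is empty."""
        now = time.time() if now is None else now
        while self._messages:
            sent_at, body = self._messages.popleft()
            if now - sent_at <= MAX_RETENTION_SECONDS:
                return body
            # Older than the retention period: silently dropped.
        return None
```

Nothing happens until a consumer calls `receive`, which is the loose coupling the card describes: the producer does not need the consumer to be running.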
Amazon AppSync
What do?
How managed?
GraphQL API Service
GraphQL is an API Query language that lets applications have a single endpoint to serve API requests for lots of kinds of stuff. EX: One API endpoint to get data from your RDS database, Lambda functions, and S3 storage.
https://graphql.org/
Is a fully managed service.
AWS Cost Explorer
Visualize and manage AWS costs
Allows costs to be grouped by all kinds of attributes, including custom “tags”
AWS Budgets
Service to set and monitor both cost and usage budgets
AWS Cost and Usage Report
reporting to analyse AWS usage
Amazon Managed Blockchain
Hyperledger & Ethereum Service
Quantum Ledger DB (QLDB)
Fully managed financial ledger db
Amazon EC2
Instance types: Spot, Scheduled Reserved instances, Reserved instances, On-Demand Instances
Spread/Partition/Cluster placement groups?
Secure, resizable Compute Instances (400+ options)
Spot: Only available when there is extra capacity. User must be flexible when they can be used.
Scheduled Reserved Instances: pick times to have reserved capacity
Reserved Instances: Reserve 24/7
On-Demand Instances: Pay-as-you-go. Flexible.
Placement Groups:
Cluster - as close together as possible in the AZ for performance
Partition - Separate groups of nodes across hardware within the AZ
Spread - Like partition, but every single instance is placed on its own distinct hardware
EC2 Autoscaling
Types?
What’s a target tracking policy?
Automated compute capacity scaling
There is a cool-down timer
Can be triggered by lots of stuff
types= Simple, Step, Scheduled
Target tracking policy autoscales to keep a chosen metric (e.g. average CPU utilization) at a target value
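The idea behind target tracking can be sketched with a simple proportional rule: scale capacity by how far the metric is from its target. This is an illustration only, not the algorithm AWS actually runs; `desired_capacity` and its parameters are hypothetical names.

```python
import math


def desired_capacity(current_capacity, current_metric, target_metric, min_size, max_size):
    """Scale capacity in proportion to metric/target, clamped to the group's bounds."""
    raw = math.ceil(current_capacity * current_metric / target_metric)
    return max(min_size, min(max_size, raw))
```

For example, 4 instances averaging 80% CPU against a 50% target would grow to 7 under this rule, while 4 instances at 25% would shrink to 2.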
Amazon Lightsail
Easy virtual private server instances
Elastic Beanstalk
Deploy & scale web apps (Java/Ruby/etc)
AWS Lambda
Max execution time
Serverless Compute Functions
max runtime= 900 seconds (15 min)
Works well with API Gateway
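A minimal sketch of a Python Lambda function behind API Gateway, assuming the proxy integration, where the function returns a dict with `statusCode`, `headers`, and a string `body`. The `name` query parameter and the greeting are made up for illustration.

```python
import json


def lambda_handler(event, context):
    """Return a JSON greeting; API Gateway turns this dict into an HTTP response."""
    # queryStringParameters may be absent or None when no query string is sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```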
ECR
Elastic Container Registry
ECS
How do Roles and Definitions work
Elastic Container Service to deploy/manage clusters & tasks
A task definition specifies what docker image to use, and a bunch of other parameters, including Task Role
A Task role is an IAM role
Can only apply one role per task definition
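A task definition is a JSON document; a minimal sketch as a Python dict shows where the single task role sits relative to the containers. The account ID, role name, image URI, and the `summarize` helper are hypothetical.

```python
# Hypothetical task definition: one task role for the task, one or more containers.
task_definition = {
    "family": "web-app",
    "taskRoleArn": "arn:aws:iam::123456789012:role/web-app-task-role",
    "containerDefinitions": [
        {
            "name": "web",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
            "memory": 512,
            "essential": True,
            "portMappings": [{"containerPort": 80}],
        }
    ],
}


def summarize(definition):
    """Return (task role ARN, list of images) to show the one-role-many-containers shape."""
    images = [c["image"] for c in definition["containerDefinitions"]]
    return definition["taskRoleArn"], images
```

The role is set once at the task level, so every container in the task shares it.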
EKS
Elastic Kubernetes Service
AWS Copilot
CLI to launch and manage containers
AWS Fargate
Serverless Compute Engine for ECS/EKS Containers
Runs containerized applications for you, without your needing to pay for or plan for infrastructure.
Only supports container images from ECR and Docker Hub
Amazon Aurora
Replication?
Multi-region?
Fully Managed?
Define a size at startup?
MySQL and PostgreSQL compatible database service
Replicates up to 5 “Read replicas” in diff regions
Replicates up to 15 performance replicas in same region “Cluster”
CAN be multi-region, but not multi-master
IS fully managed
No upfront size configuration
Auto-scaling of storage
Aurora replicas are by default targets for reads, and standby for the primary.
DynamoDB
What is Streams?
Cost models?
DAX?
KeyValue / Document Database / NoSQL
Global Tables = enable multi-region, multi-master database without configuring your own replication
Streams = list of item level changes. 24hr lifespan. integrates with lambda
Supports gateway endpoints but not interface endpoints?
Provisioned capacity: specified read/writes
On-demand capacity: charged based on how many reads/writes happen
Push-button Scaling
DAX = fully managed, highly available, in-memory cache
Amazon ElastiCache
redis
memcached
Scalable in-memory database
In-memory caching provides low latency and high throughput
redis is persistent, feature rich, can do replication and high availability
memcached is not persistent, simple, not HA.
Neptune
Graph database for highly connected data sets
Amazon RDS
Relational Database (MySQL/Postgres/Maria etc)
Timestream
Serverless time series db for IoT
AWS VPC
logically isolated virtual private clouds
API Gateway
Create and manage APIs
AWS CloudFront
What is it?
Why use it?
Services it integrates with?
Fast content delivery network (CDN) service
Brings content closer to users for speed.
Can be used with S3, EC2, ELBs
Amazon Route 53
fast DNS service
AWS PrivateLink
AKA?
first step?
types?
Allows other AWS accounts to access your VPC.
aka VPC endpoint services
First step is to set up an NLB
Interface endpoints: simple way to allow traffic into your VPC
Gateway Load Balancer endpoint: Allow traffic in AND load balance that traffic
Gateway Endpoints: Allow access to S3 and DynamoDB
NOT A TRANSIT GATEWAY
NOT AN INTERNET GATEWAY
AWS App Mesh
Service Mesh for inter compute instance comms.
You can use App Mesh with AWS Fargate, Amazon EC2, Amazon ECS, Amazon EKS, and Kubernetes running on AWS, to better run your application at scale.
AWS Cloud Map
Resource discovery for app usage
AWS Direct Connect
Fast connection from your equipment (on-prem) to AWS.
Takes a long time to set up.
AWS Global Accelerator
Compatible targets?
User traffic routing over the AWS network, rather than the internet.
uses anycast to enable failover without changing the public IP Address
Endpoints can only be ELBs (ALB/NLB), EC2 instances, or Elastic IPs
With Global Accelerator, you are provided two global static public IPs that act as a fixed entry point to your application, improving availability. On the back end, add or remove your AWS application endpoints, such as Application Load Balancers, Network Load Balancers, EC2 Instances, and Elastic IPs without making user-facing changes. Global Accelerator automatically re-routes your traffic to your nearest healthy available endpoint to mitigate endpoint failure.
AWS Transit Gateway
Diff: Direct Connect Gateway, Transit Gateway Connect, and peered Transit Gateway
Centralized VPC and on prem connectivity
Elastic Load Balancing
Service to evenly distribute network traffic.
Can only target one “target type”, such as IP address, Instance ID, or Lambda.
Can work with on-premise using IP Address.
Amazon S3
How accessed by other services?
Tiers: standard, S3-IT, S3-IA, S3-one zone IA, Glacier, Glacier deep archive, Outposts
Minimum data charges?
Minimum Storage Durations?
Default AZ replication?
What is Transfer Acceleration?
Static content over HTTPS?
Object storage service
Enables lifecycle management for data
Generally accessed via API
Tiers:
Standard: relatively high cost and fast. no cost to retrieve. Min 3 AZs in one region.
Intelligent tiering: moves data to storage classes based on usage patterns
Infrequent Access: Begin paying retrieval fees, has minimum data size charge. Fast to access if needed, but cheaper if not accessed. 128 KB min charge. 30 days min storage.
One-zone Infrequent Access: limited to one AZ.
Glacier: Minutes to hours to retrieve. 40 KB min charge. 90 days min storage charge.
Glacier deep archive: 12 hours to retrieve. Min storage 180 days.
Outposts: S3 but on your premises. Use same APIs to access.
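S3 static website endpoints only support HTTP; to serve static content over HTTPS, put CloudFront in front of the bucket
Transfer Acceleration improves UPLOADS to S3 by routing them through CloudFront edge locations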
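The minimum size and minimum duration charges listed for the tiers can be sketched as a lookup. The numbers come from the cards above; the `billable` helper is hypothetical, and a 0 means the card states no minimum for that class.

```python
KB = 1024

# storage class -> (minimum billable object size in bytes, minimum billable days)
# Values are the ones listed on the card; 0 means no minimum is listed there.
STORAGE_CLASS_MINIMUMS = {
    "STANDARD": (0, 0),
    "STANDARD_IA": (128 * KB, 30),
    "GLACIER": (40 * KB, 90),
    "DEEP_ARCHIVE": (0, 180),
}


def billable(storage_class, size_bytes, days_stored):
    """Return the (size, days) you are charged for after applying the minimums."""
    min_size, min_days = STORAGE_CLASS_MINIMUMS[storage_class]
    return max(size_bytes, min_size), max(days_stored, min_days)
```

A 10 KB object kept in Infrequent Access for 5 days is billed as 128 KB for 30 days, which is why small or short-lived objects usually belong in Standard.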
AWS EBS
Multi-AZ? Region?
modes?
Elastic Block Store, Persistent block
Is like a disk partition mounted to the OS.
PIOPS = higher IOPS
Gen Purpose
General Purpose SSD (gp2) = up to 16,000 IOPS, 250 MB/s. gp3 has 1000 MB/s throughput.
io1/io2 if you need more.
AWS EFS
Is a version of what general tech?
Modes?
Scale?
Multi-AZ? Region?
Supported OSs?
Serverless Elastic File System
Is a version of an NFS (network file system)
EFS allows EC2 or ECS to use standard file system IO like with EBS, but is scalable to petabytes
Can be attached to several EC2 instances. Allows simultaneous access.
Max I/O mode = higher latency, but enables higher throughput. Is opposed to General purpose mode.
Supports multi-AZ, but not multi-region, except by using DataSync
Does not support Windows EC2 instances
FSx for Lustre
High performance file storage using Lustre
Used for HPC and is highly parallel and distributed.
FSx for Windows File Server
AWS Windows file system
AWS Backup
Policy driven data protection
AWS Snow
Edge infrastructure for storage and compute
AWS Storage Gateway
File Gateway
Hybrid on prem AWS storage
AWS gives you a storage appliance, basically.
Also available as a VM.
Store your data on-premise, but managed by AWS
Is low-latency, compared to web storage
File Gateways can be mounted to instances, but back up to S3
CloudEndure
Disaster recovery service
AWS DMS (Database Migration Service)
What migration targets are supported?
RDS: Aurora, MySQL, PostgreSQL, MariaDB, MS SQL Server, Oracle
Also supports non RDS targets: NoSQL
Works for on-premises and cloud databases, as long as you’re ending up on the cloud.
AWS DataSync
A tool to simplify moving data.
Over the internet.
Can move data, or maintain copies in the cloud.
AWS Batch
Batch computing service.
Dynamically provisions EC2 instances to do the operations.
AWS Resource Access Manager
Part of AWS Organizations
Enables sharing resources with other AWS accounts or organizations
Read Replicas in RDS
Creates read-only replica of a database, often in another region for availability.
Read-Replica limits in RDS are: 5
Cannot create an encrypted replica from an unencrypted master.
Read replicas in different regions from the master use a different encryption key.
Elastic Fabric Adapter
Version of an ENA, but with more support for faster networking.
Has OS bypass support for lower latency
Elastic Network Adapter
Network adaptor for high-bandwidth applications, when EC2 standard networking is not enough
Elastic IP Address
A static public IP Address, with which you can do fancy things
AWS CloudFormation
Automated tool to deploy AWS services, including EC2, VPCs, etc etc etc
AWS Inspector
An automated security and compliance assessment (audit) tool.
Basically an audit tool
AWS CloudWatch
Agent?
Monitoring and alarm tool
“Evaluation period” is the number of data points to look at when deciding alarm state.
Monitors by default: CPU, Disk, Network, Status
Has an Agent
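How the evaluation period feeds the alarm state can be sketched as follows. This is a simplification with a hypothetical `alarm_state` helper: it requires every data point in the window to breach the threshold, whereas real alarms can also be configured as "M out of N" data points.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Look at the most recent N data points and decide the alarm state."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Simplification: ALARM only if every recent point is above the threshold.
    return "ALARM" if all(value > threshold for value in recent) else "OK"
```

With CPU readings of 40, 85, 90, 95, a threshold of 80, and 3 evaluation periods, only the last three points count, so the alarm fires.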
AWS ELB
types
Elastic Load Balancers distribute INCOMING traffic across multiple targets. Not outbound.
types = ALB, NLB, and GLB
Application Load Balancer, Network load Balancer, and Gateway Load Balancer
Billed by time.
NAT Gateway
Translates private IP addresses into a public address; can be used to let multiple servers connect to the internet through one IP address.
Is used for OUTBOUND internet communication, cannot receive unsolicited requests, only return requests.
IAM Access Key
Used for signing programmatic requests made to AWS
AWS STS
Security Token Service
Enables requesting temporary, limited-privilege credentials for AWS IAM users
AWS Key Management Service
Create and manage cryptographic keys and control their use across a wide range of AWS services
FIPS 140-2 validated
AWS Certificate Manager
Lets you provision, manage, and deploy SSL/TLS certificates for use with AWS services
Signed URLs
What do they include?
“People with URL can access”, has expiration, can include IP range, technically includes some form of provider’s credentials.
There are also Signed Cookies.
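The general idea of a signed URL (an expiry plus a signature derived from the provider's secret) can be sketched with an HMAC. This is a generic illustration, not the CloudFront or S3 signing scheme: `SECRET_KEY`, the `Expires`/`Signature` parameter names, and both helpers are assumptions, and the optional IP range restriction is left out.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"hypothetical-signing-key"  # stands in for the provider's credentials


def _signature(path, expires):
    """Sign the path and expiry together so neither can be changed afterwards."""
    message = f"{path}\n{expires}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url, expires_at):
    """Append an expiry timestamp and a signature to a URL with no query string."""
    path = urlparse(base_url).path
    query = urlencode({"Expires": expires_at, "Signature": _signature(path, expires_at)})
    return f"{base_url}?{query}"


def is_valid(signed_url, now):
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(signed_url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["Expires"][0])
        signature = params["Signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    return hmac.compare_digest(signature, _signature(parsed.path, expires))
```

Anyone holding the URL can use it until it expires, but changing the path or the expiry invalidates the signature.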
OAI
Origin Access Identity (OAI)
Special CloudFront User, can limit S3/CloudFront content distribution to OAI users.
Cannot be used to restrict access to an ELB or EC2
AWS CloudHub
A way to connect multiple VPNs to your VPC, to connect various distributed networks.
Amazon Cognito
Used for adding sign-up, sign-in, and access control for mobile apps
Supports SAML and OpenID Connect
CloudTrail
Multi-region or nah?
Audit tool for user activity and API usage
Set up “trails” to collect certain info
Can be multi-region
AWS Organizations
SCP
Connects accounts to centrally manage them.
Can use service control policies to allow/deny access to services or actions.
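An SCP uses the same JSON shape as an IAM policy. A minimal sketch with a toy evaluator, where an explicit Deny always wins: the policy contents and the `is_permitted` helper are hypothetical, and the matching is simplified (case-sensitive wildcards, no conditions or resources). Note that an SCP only sets the maximum permissions for an account; it does not grant anything by itself.

```python
from fnmatch import fnmatchcase

# Hypothetical SCP: allow everything except DynamoDB and terminating instances.
scp = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {
            "Effect": "Deny",
            "Action": ["dynamodb:*", "ec2:TerminateInstances"],
            "Resource": "*",
        },
    ],
}


def _matches(statement, action):
    """True if the action matches any Action pattern in the statement."""
    patterns = statement["Action"]
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(action, pattern) for pattern in patterns)


def is_permitted(policy, action):
    """An action passes only if some Allow matches and no Deny matches."""
    statements = policy["Statement"]
    denied = any(s["Effect"] == "Deny" and _matches(s, action) for s in statements)
    allowed = any(s["Effect"] == "Allow" and _matches(s, action) for s in statements)
    return allowed and not denied
```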
Security Groups
For servers not users.