Section 21: Databases in AWS Flashcards
1
Q
Data Types
P228
A
- RDBMS (= SQL): RDS, Aurora - great for joins
- NoSQL database: DynamoDB (JSON), ElastiCache (key/value pairs), Neptune (graphs) - no joins, no SQL
- Object Store: S3 (for big objects) / Glacier (for backups / archives)
- Data Warehouse (= SQL Analytics): Redshift, Athena
- Search: ElasticSearch (JSON) - free text, unstructured search
- Graphs: Neptune - displays relationships between data
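The categories above can be kept as a small lookup table. This is only a study aid; the category keys are hypothetical names chosen for this sketch, not AWS terminology.

```python
# Hypothetical lookup table summarising the card above.
SERVICES_BY_CATEGORY = {
    "relational": ["RDS", "Aurora"],
    "nosql": ["DynamoDB", "ElastiCache", "Neptune"],
    "object_store": ["S3", "Glacier"],
    "data_warehouse": ["Redshift", "Athena"],
    "search": ["ElasticSearch"],
    "graph": ["Neptune"],
}


def categories_for(service):
    """Return, in sorted order, every category that lists the given service."""
    return sorted(
        category
        for category, services in SERVICES_BY_CATEGORY.items()
        if service in services
    )
```

For example, `categories_for("Neptune")` returns both the NoSQL and the graph category, matching the two places Neptune appears on the card.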
2
Q
RDS Overview
A
- Managed PostgreSQL/ MySQL/ Oracle/ SQL Server
- Must provision an EC2 instance type and an EBS volume type and size (the instance itself runs in the background)
- Supports Read Replicas and Multi AZ
- Security through IAM, Security Groups, SSL
- Backup / Snapshot / Point in time restore feature
- Managed and scheduled maintenance
- Monitoring through CloudWatch
- Use case: Store relational datasets, perform SQL queries, transactional inserts / updates / deletes
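A minimal sketch of the use case above (SQL queries, joins, transactional insert / update / delete). It uses Python's built-in `sqlite3` with an in-memory database as a local stand-in; against RDS you would connect to the instance endpoint with a MySQL or PostgreSQL driver instead. The table and column names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for an RDS database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

# Transactional insert/update/delete: the `with` block commits on success
# and rolls back if any statement raises.
with conn:
    conn.execute("INSERT INTO customers VALUES (1, 'Ada'), (2, 'Linus')")
    conn.execute(
        "INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0)"
    )
    conn.execute("UPDATE orders SET total = 45.0 WHERE id = 11")
    conn.execute("DELETE FROM orders WHERE id = 12")

# A join across both tables: order count and order total per customer.
rows = conn.execute(
    """
    SELECT c.name, COUNT(o.id), COALESCE(SUM(o.total), 0)
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
    """
).fetchall()
```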
3
Q
Aurora Overview
A
- Compatible API for PostgreSQL / MySQL
- Data is held in 6 replicas, across 3 AZ
- Auto healing Capabilities
- Read replicas can be Global
- Aurora Serverless - for unpredictable / intermittent workloads
- Aurora Multi-Master - for continuous writes failover
- Use Case: same as RDS but with less maintenance, more flexibility, more performance
4
Q
ElastiCache Overview
A
- Managed Redis / Memcached
- In-memory data store, sub-millisecond latency
- Must provision an EC2 instance type
- Support for Clustering and Multi AZ, Read Replicas
- Security through IAM, Security Groups
- Backup / Snapshot / Point in time restore feature
- Managed and scheduled maintenance
- Monitoring through CloudWatch
- Use case: Key/Value store, frequent reads, fewer writes
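The "frequent reads, fewer writes" use case is commonly served with a cache-aside pattern. The sketch below uses a small in-process class as a stand-in for a Redis / Memcached cluster; `TTLCache`, `slow_db_lookup`, and `get_user` are hypothetical names for illustration only.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a Redis/Memcached key/value store."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)


db_reads = {"count": 0}  # tracks how often the "database" is hit


def slow_db_lookup(user_id):
    """Pretend database read (hypothetical)."""
    db_reads["count"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)


def get_user(user_id):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = slow_db_lookup(user_id)
    cache.set(key, value)
    return value
```

Repeated calls to `get_user` with the same id only reach the database once while the entry is still within its TTL.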
5
Q
DynamoDB Overview
P232
A
- AWS proprietary technology, managed NoSQL database
- Serverless, provisioned capacity, auto scaling
- Can replace ElastiCache as a key/value store
- Highly available, Multi AZ by default, reads and writes are decoupled, DAX for read cache
- Security and authentication are handled through IAM
- DynamoDB Streams to integrate with AWS Lambda
- Backup/ Restore features
- Monitoring through CloudWatch
- Use Case: Serverless application development, has transaction capabilities since Nov 2018
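As a rough illustration of DynamoDB's JSON-like items, the low-level API tags each attribute value with a type code such as `S` (string), `N` (number, sent as a string), or `BOOL`. The helper below is a hypothetical, minimal converter for flat records only; SDKs usually provide their own serializers.

```python
def to_dynamodb_item(record):
    """Convert a flat dict of str/number/bool values to attribute-value form."""
    item = {}
    for name, value in record.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            item[name] = {"BOOL": value}
        elif isinstance(value, (int, float)):
            item[name] = {"N": str(value)}  # numbers are transmitted as strings
        elif isinstance(value, str):
            item[name] = {"S": value}
        else:
            raise TypeError(f"unsupported type for attribute {name!r}")
    return item


# Hypothetical record for illustration.
item = to_dynamodb_item({"user_id": "u1", "age": 30, "active": True})
```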
6
Q
Athena Overview
A
- Fully serverless database with SQL capabilities
- Used to query data in S3
- Pay per query
- Output result back to S3
- Secured through IAM
- Use Case: One-time SQL queries, serverless queries on S3, log analytics
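A sketch of how the pieces above fit together in one request: a SQL string, the database to query, and the S3 location for the results. The helper name, database, table, and bucket are hypothetical; the keyword names follow the Athena `StartQueryExecution` request shape.

```python
def build_athena_request(sql, database, output_s3_uri):
    """Assemble keyword arguments for an Athena query (no network call is made)."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("Athena writes results to S3; expected an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


# Hypothetical database, table, and bucket names.
request = build_athena_request(
    sql="SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    database="weblogs",
    output_s3_uri="s3://example-athena-results/queries/",
)
```

With boto3 installed and credentials configured, this dictionary could be passed as `boto3.client("athena").start_query_execution(**request)`.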
7
Q
Redshift Overview
A
- Redshift is based on PostgreSQL, but it is not used for OLTP (Online Transaction Processing)
- It is OLAP: Online Analytical Processing (analytics and data warehousing)
- Up to 10x better performance than other data warehouses (AWS's claim)
- Columnar storage of data
- Massively Parallel Query Execution
- Has a SQL interface for performing the queries
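To illustrate why columnar storage suits analytics, the toy sketch below pivots row-oriented records into one list per column, so an aggregate over a single column only touches that column's values. It is a conceptual illustration with hypothetical data, not how Redshift is implemented internally.

```python
# Row-oriented layout: one dict per record (hypothetical sample data).
rows = [
    {"order_id": 1, "region": "eu", "amount": 20.0},
    {"order_id": 2, "region": "us", "amount": 35.5},
    {"order_id": 3, "region": "eu", "amount": 10.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An analytical aggregate reads only the "amount" column.
total_amount = sum(columns["amount"])
```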
8
Q
What is AWS Glue
A
- Managed extract, transform and load (ETL) service
- Useful to prepare and transform data for analytics
- Fully serverless service
- Example: Extract data from an S3 bucket, transform it, and load it into a Redshift data warehouse
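The extract / transform / load steps can be sketched locally in plain Python. Here a CSV string stands in for a file in S3 and a list stands in for a warehouse table; all names and data are hypothetical, and a real Glue job would use Glue's own job scripts rather than this code.

```python
import csv
import io

# Stand-in for a raw CSV object stored in S3 (hypothetical data).
RAW_CSV = "id,name,amount\n1, alice ,10.50\n2,BOB,not-a-number\n3,carol,7\n"


def extract(text):
    """Extract: parse the raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transform: normalise names, cast types, drop rows with a bad amount."""
    cleaned = []
    for record in records:
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # skip rows whose amount is not numeric
        cleaned.append(
            {
                "id": int(record["id"]),
                "name": record["name"].strip().title(),
                "amount": amount,
            }
        )
    return cleaned


def load(records, table):
    """Load: append the cleaned rows to the target table; return rows loaded."""
    table.extend(records)
    return len(records)


warehouse_table = []  # stand-in for a Redshift table
loaded = load(transform(extract(RAW_CSV)), warehouse_table)
```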
9
Q
What is AWS Neptune
P237
A
- Fully managed graph database
- When do we use graphs?
1) High-relationship data
2) Social networks
3) Knowledge graphs
- Highly available across 3 AZ
- Point in time recovery
- Support for KMS encryption at rest
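To make "high-relationship data" concrete, the sketch below runs a breadth-first search over a tiny social graph to find everyone within a given number of hops. The graph and function name are hypothetical; Neptune itself is queried with graph query languages such as Gremlin or SPARQL, not with this code.

```python
from collections import deque

# Hypothetical "follows" relationships in a small social network.
FOLLOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": ["alice"],
}


def within_hops(graph, start, max_hops):
    """Breadth-first search: map each node reachable within max_hops to its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    del distances[start]  # the start node is not part of the result
    return distances
```

For example, `within_hops(FOLLOWS, "alice", 2)` answers a "friends of friends" style question.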
10
Q
What is AWS OpenSearch?
A
- AWS OpenSearch is the successor to Amazon ElasticSearch
- Used to search and index data
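Indexing and searching can be illustrated with a toy inverted index: each word maps to the set of documents containing it, and a query returns the documents that contain every query word. This is a conceptual sketch with hypothetical documents, not the OpenSearch API.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical documents.
DOCS = {
    1: "Managed graph database",
    2: "Managed NoSQL database, serverless",
    3: "Serverless SQL queries on S3",
}
INDEX = build_index(DOCS)
```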
11
Q
CloudWatch Logs Agent & Unified Agent
A
- For virtual servers (EC2 instances, on-premises servers)
1) CloudWatch Logs Agent - old version of the agent
- Can only send to CloudWatch Logs
2) CloudWatch Unified Agent
- Collects additional system-level metrics such as RAM, processes
- Collects logs to send to CloudWatch Logs
- Centralized configuration using SSM Parameter Store
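A sketch of what a Unified Agent configuration might look like, built as a Python dict and serialised to JSON so it could be stored centrally (for example in SSM Parameter Store). The file path and log group name are hypothetical, and only a memory metric and one log file are shown; consult the agent's configuration reference for the full schema.

```python
import json

# Hypothetical log file path and log group name; adjust to your own setup.
agent_config = {
    "metrics": {
        "metrics_collected": {
            # System-level metric: RAM usage.
            "mem": {"measurement": ["mem_used_percent"]},
        }
    },
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/myapp/app.log",
                        "log_group_name": "myapp-logs",
                        "log_stream_name": "{instance_id}",
                    }
                ]
            }
        }
    },
}

# JSON text that could be saved to a file or a parameter store entry.
config_json = json.dumps(agent_config, indent=2)
```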
12
Q
What is AWS CloudTrail?
A
- Provides governance, compliance and audit for your AWS account
- Keeps a history of events / API calls made within your AWS account.
- Can put logs from CloudTrail into CloudWatch Logs or S3.
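Once CloudTrail logs land in S3 or CloudWatch Logs, a simple audit is to count which API calls were made. The sketch below works on a hand-written list of records that use only the `eventSource` and `eventName` fields; the sample records are hypothetical, and real log files contain many more fields.

```python
from collections import Counter

# Hypothetical, heavily trimmed CloudTrail-style records.
SAMPLE_RECORDS = [
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances"},
    {"eventSource": "s3.amazonaws.com", "eventName": "CreateBucket"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances"},
    {"eventSource": "iam.amazonaws.com", "eventName": "CreateUser"},
]


def count_api_calls(records):
    """Count occurrences of each (service, API call) pair."""
    return Counter((r["eventSource"], r["eventName"]) for r in records)


calls = count_api_calls(SAMPLE_RECORDS)
```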
13
Q
What 3 AWS CloudTrail events do you get?
P250
A
1) Management events
2) Data events