Databases Flashcards
What are disks vs. databases?
Disks: EFS, EBS, EC2 Instance Store, S3
Databases: RDS, Aurora, ElastiCache, DynamoDB, Redshift, EMR, Athena, QuickSight, DocumentDB, Neptune, QLDB, Managed Blockchain, DMS, Glue
What does a database let you do that a disk can’t?
Structure data, build indices to efficiently query/search data, and define relationships between your datasets
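A minimal sketch of why an index matters, assuming a tiny in-memory dataset (the records and field names are hypothetical, not from the deck). A plain scan reads every record; an index maps a value straight to the matching ids.

```python
# Hypothetical records; the field names are illustrative only.
records = [
    {"id": 1, "name": "alice", "city": "Paris"},
    {"id": 2, "name": "bob", "city": "Lima"},
    {"id": 3, "name": "carol", "city": "Paris"},
]


def scan_by_city(rows, city):
    """Disk-style lookup: read every record and filter (O(n))."""
    return [r["id"] for r in rows if r["city"] == city]


def build_index(rows, field):
    """Database-style index: map each field value to the matching ids."""
    index = {}
    for r in rows:
        index.setdefault(r[field], []).append(r["id"])
    return index


city_index = build_index(records, "city")
paris_ids = city_index.get("Paris", [])  # direct lookup, no full scan
```

Both approaches return the same ids; the index trades extra storage and build time for faster lookups.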
Benefits of NoSQL databases? What form do they take?
Flexible, scalable, high-performance, and highly functional
JSON
What is the benefit of using AWS databases?
AWS manages them:
- Quick Provisioning, High Availability, Vertical and Horizontal Scaling
- Automated Backup & Restore, Operations, Upgrades
- Operating System Patching is handled by AWS
- Monitoring, alerting
What are relational databases?
RDS & Aurora (SQL)
In-memory Database
ElastiCache
Key/Value Database? Can it be used serverless? What’s its caching mechanism?
DynamoDB (serverless)
DAX (cache for DynamoDB)
Data warehouse
Redshift (SQL)
Hadoop Cluster
EMR
How can you query data on Amazon S3?
What can it handle? (options: servers, serverless, SQL, NoSQL)
Athena
Serverless and can handle SQL
What is the managed service for Hyperledger Fabric & Ethereum blockchains?
Amazon Managed Blockchain
Managed ETL (Extract Transform Load) and Data Catalog service
Glue
DMS & traits?
Database Migration Service: allows intervals; quickly and securely migrates databases to AWS; resilient and self-healing; the source database remains available during the migration
Graph database
Neptune
Dashboards on your data (serverless)
QuickSight
“Aurora for MongoDB” (JSON – NoSQL database)
DocumentDB
What allows you to create databases managed by AWS for the following engines: Postgres, MySQL, MariaDB, Oracle, Microsoft SQL Server, Aurora (AWS proprietary database)?
RDS
What are the pros & cons of RDS vs. running a DB on EC2?
RDS is a managed service:
• Automated provisioning, OS patching
• Continuous backups and restore to specific timestamp (Point in Time Restore)!
• Monitoring dashboards
• Read replicas for improved read performance
• Multi AZ setup for DR (Disaster Recovery)
• Maintenance windows for upgrades
• Scaling capability (vertical and horizontal)
• Storage backed by EBS (gp2 or io1)
• BUT you can’t SSH into your instances
Which database engines are supported by Aurora (which is cloud-optimized)?
PostgreSQL and MySQL
What are 3 unique features of RDS deployments?
- Read replicas: scale the read workload of your DB (one app can read from 3 RDS instances: 1 main + 2 replicas)
- Multi-AZ: failover in case of an AZ outage (high availability); reads and writes go to the main instance, which replicates across AZs to one other RDS instance
- Multi-region: local performance for global reads and disaster recovery in case of region issues, but cross-region replication incurs a cost
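A toy sketch of the read-replica idea above, assuming plain dictionaries as stand-ins for database instances (the class and method names are hypothetical, not an AWS API). Writes go to the primary; reads rotate across replicas.

```python
import itertools


class ReplicatedDatabase:
    """Toy model: one primary for writes, replicas for reads."""

    def __init__(self, replica_count=2):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_reader = itertools.cycle(range(replica_count))

    def write(self, key, value):
        self.primary[key] = value
        # Simplification: real read replicas are updated asynchronously,
        # so they can lag behind the primary. Here the copy is immediate.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Round-robin across replicas to spread the read load.
        replica_id = next(self._next_reader)
        return replica_id, self.replicas[replica_id].get(key)


db = ReplicatedDatabase(replica_count=2)
db.write("user:1", "alice")
first = db.read("user:1")
second = db.read("user:1")
```

In this sketch the two reads are served by different replicas, so the primary handles only the write.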
What do you use ElastiCache for?
To get managed Redis or Memcached; caches are in-memory databases that reduce the load on the DB for read-intensive workloads and allow quicker reads/writes from the cache
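A minimal cache-aside sketch, assuming dictionaries as stand-ins for both the cache (Redis/Memcached) and the database (the class name and keys are hypothetical). On a miss the value is read from the database and stored in the cache; later reads skip the database.

```python
class CacheAside:
    def __init__(self, database):
        self.database = database  # stand-in for the backing database
        self.cache = {}           # stand-in for Redis or Memcached
        self.db_reads = 0         # counts how often the database is hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: no database load
        self.db_reads += 1          # cache miss: read from the database
        value = self.database[key]
        self.cache[key] = value     # populate the cache for next time
        return value


store = CacheAside({"product:42": "keyboard"})
store.get("product:42")  # miss, reads the database
store.get("product:42")  # hit, served from the cache
```

A real cache also needs an invalidation or expiry strategy so stale values are not served after the database changes; that part is omitted here.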
What is a distributed, serverless NoSQL DB that can scale to massive workloads with single-digit millisecond latency?
DynamoDB
What is a fully managed in-memory cache for DynamoDB that improves performance by 10X?
DAX
What is a PostgreSQL-based OLAP data warehouse that uses columnar storage of data?
Redshift
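A small sketch of what columnar storage means, assuming hypothetical order data (this illustrates the layout only, not how Redshift is implemented). An analytic query such as a sum only needs to touch one column instead of every full row.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 20.0},
    {"order_id": 2, "region": "US", "amount": 35.0},
    {"order_id": 3, "region": "EU", "amount": 15.0},
]


def to_columns(rows):
    """Pivot rows into a column-oriented layout: one list per field."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)
# The aggregate reads only the "amount" column.
total_amount = sum(columns["amount"])
```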
What helps create Hadoop clusters (Big Data) to analyze and process vast amount of data, where clusters could be hundreds of EC2 instances; and is used for data processing, machine learning, web indexing, and big data?
EMR (Elastic MapReduce)
What is a fully Serverless database with SQL capabilities?
Athena
What are the use cases of Athena?
one-time SQL queries, serverless queries on S3, log analytics
What is the relationship between S3 and Athena?
Query data in S3 and get output to S3
What is the pricing of Athena?
Pay per query (billed by the amount of data scanned)
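A rough cost sketch for pay-per-query pricing, assuming billing is proportional to data scanned. The default price per terabyte is an assumed placeholder, and minimum charges are ignored; check current AWS pricing for your region before relying on any figure.

```python
def estimate_query_cost(bytes_scanned, price_per_tb=5.0):
    """Estimate the cost of one query from the bytes it scans.

    price_per_tb is an assumed placeholder, not an authoritative price.
    """
    terabytes = bytes_scanned / (1024 ** 4)
    return terabytes * price_per_tb


one_tb = 1024 ** 4
full_scan_cost = estimate_query_cost(one_tb)
# Scanning a tenth of the data (for example via partitioning or a
# columnar format) costs a tenth as much under this model.
partial_scan_cost = estimate_query_cost(one_tb // 10)
```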
What is a serverless machine learning-powered business intelligence service to create interactive dashboards?
Amazon QuickSight
What is AWS version of a NoSQL database for storing, querying, and indexing JSON data?
DocumentDB
What is a fully managed graph database (e.g. for a social network) that is highly available (across 3 AZs) and supports up to 15 read replicas?
Amazon Neptune
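A tiny sketch of the kind of relationship query a graph database is built for, assuming a hypothetical social network stored as an adjacency map (plain Python, not a Neptune query language). It suggests friends of friends who are not already direct friends.

```python
# Hypothetical social graph: person -> set of direct friends.
friends = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee"},
    "cy": {"ana", "dee", "eli"},
    "dee": {"ben", "cy"},
    "eli": {"cy"},
}


def friends_of_friends(graph, person):
    """Return people two hops away, excluding direct friends and self."""
    direct = graph.get(person, set())
    result = set()
    for friend in direct:
        result |= graph.get(friend, set())
    return result - direct - {person}


suggestions = friends_of_friends(friends, "ana")
```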
What can be used to review history of all the changes made to your application data over time, and is immutable?
Amazon QLDB (Quantum Ledger Database)
How is QLDB different from Amazon Managed Blockchain?
No decentralization component, in accordance with financial regulation rules
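A minimal sketch of an immutable, verifiable change history, assuming a simple SHA-256 hash chain over JSON entries (the function names and entry format are hypothetical and much simpler than a real ledger database). Each entry's hash covers the previous hash, so editing an old entry breaks verification.

```python
import hashlib
import json


def entry_hash(previous_hash, change):
    """Hash the change together with the previous entry's hash."""
    payload = json.dumps(change, sort_keys=True)
    data = (previous_hash + payload).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def append(journal, change):
    """Append-only write: link the new entry to the last one."""
    previous_hash = journal[-1]["hash"] if journal else ""
    journal.append({"change": change, "hash": entry_hash(previous_hash, change)})


def verify(journal):
    """Recompute the chain; any tampered entry makes this return False."""
    previous_hash = ""
    for entry in journal:
        if entry["hash"] != entry_hash(previous_hash, entry["change"]):
            return False
        previous_hash = entry["hash"]
    return True


journal = []
append(journal, {"account": "A", "balance": 100})
append(journal, {"account": "A", "balance": 80})
```

Here a single central owner keeps the journal, which matches the "no decentralization" point: the history is verifiable without a blockchain network.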
What is a serverless AWS service used to manage extract, transform, and load (ETL), i.e. useful for preparing and transforming data for analytics?
AWS Glue
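A minimal sketch of the three ETL stages, assuming a hypothetical CSV input and JSON Lines output (plain standard-library Python, not the Glue API). It extracts raw rows, transforms them into clean typed records, and loads them into an analytics-friendly format.

```python
import csv
import io
import json

# Hypothetical raw input with untidy spacing and lowercase currency codes.
raw_csv = "id,amount,currency\n1, 10.50 ,usd\n2, 3.00 ,eur\n"


def extract(text):
    """Extract: parse the raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cast types and normalize values."""
    return [
        {
            "id": int(r["id"]),
            "amount": float(r["amount"]),
            "currency": r["currency"].strip().upper(),
        }
        for r in rows
    ]


def load(rows):
    """Load: serialize as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r) for r in rows)


clean_rows = transform(extract(raw_csv))
output = load(clean_rows)
```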
Which services can use the Glue Data Catalog?
Athena, Redshift, EMR (Elastic MapReduce)
What service can be used for quickly and securely migrating databases to AWS, and is resilient and self-healing?
Database Migration Service (DMS)