Databases & Analytics Flashcards
What is AWS’s responsibility with databases?
AWS offers managed databases, meaning it handles operations, upgrades, patching, monitoring, alerting, and backups
What does RDS stand for?
Relational Database Service
What query language does AWS RDS use?
SQL
What types of databases can you create using AWS RDS?
- PostgreSQL
- MySQL
- MariaDB
- Oracle
- Microsoft SQL Server
- Aurora (AWS Proprietary database)
Advantages of using RDS versus deploying a DB on EC2?
- Automated provisioning, OS patching
- Continuous backups and restore to specific timestamp (Point In Time Restore)
- Monitoring dashboards
- Read replicas for improved read performance
- Multi-AZ setup for DR (Disaster Recovery)
- Maintenance windows for upgrades
- Scaling capability (vertical and horizontal)
- Storage backed by EBS (gp2 or io1)
What is the only disadvantage of using RDS?
You can’t SSH into your instance
What is Amazon Aurora?
A proprietary database technology from AWS. It is optimized for the AWS cloud, and AWS claims 5x the performance of MySQL on RDS and over 3x the performance of PostgreSQL on RDS
Tell me a bit about Aurora storage
Aurora storage grows automatically in increments of 10 GB, up to 64 TB (newer engine versions raise the limit to 128 TiB)
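The growth rule on this card is simple arithmetic, so it can be sketched in a few lines. This is only an illustration: the function name, the 1 TB = 1000 GB conversion, and the assumption that at least one increment is always allocated are mine, and the 10 GB and 64 TB figures are the ones quoted on the card.

```python
import math

# Figures quoted on the card; passed as defaults so they are easy to change.
INCREMENT_GB = 10
MAX_STORAGE_GB = 64_000  # 64 TB, assuming 1 TB = 1000 GB for simplicity


def allocated_storage_gb(data_gb: float,
                         increment_gb: int = INCREMENT_GB,
                         max_gb: int = MAX_STORAGE_GB) -> int:
    """Round the data size up to the next storage increment."""
    if data_gb < 0:
        raise ValueError("data size cannot be negative")
    # Assumption: at least one increment is always allocated.
    increments = max(1, math.ceil(data_gb / increment_gb))
    allocated = increments * increment_gb
    if allocated > max_gb:
        raise ValueError("data does not fit within the maximum storage size")
    return allocated
```

For example, 10.5 GB of data would occupy two increments (20 GB) under these assumptions.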
How does Aurora cost compare to RDS?
Aurora costs about 20% more than RDS, but it is more efficient
Is RDS in the AWS free tier?
Yes
Is Aurora in the AWS free tier?
No
What are RDS read replicas?
Copies of your DB that scale the read workload. Data is only written to the main DB
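A toy model can make the read/write split concrete. The sketch below is not how RDS is implemented; the class and method names are hypothetical, and replication is an explicit step here so that the lag between a write and its appearance on the replicas is visible.

```python
import itertools
from typing import Dict, List, Optional


class ReplicatedDatabase:
    """Toy model: one main DB takes writes, read replicas serve reads."""

    def __init__(self, replica_count: int) -> None:
        if replica_count < 1:
            raise ValueError("need at least one read replica")
        self.main: Dict[str, str] = {}
        self.replicas: List[Dict[str, str]] = [{} for _ in range(replica_count)]
        # Round-robin over replica indexes to spread the read traffic.
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key: str, value: str) -> None:
        # Writes only ever go to the main DB.
        self.main[key] = value

    def replicate(self) -> None:
        # Stand-in for asynchronous replication: copy main to each replica.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.main)

    def read(self, key: str) -> Optional[str]:
        # Reads are served by a replica, never by the main DB in this model.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)
```

In this model a read issued right after a write returns nothing until `replicate()` has run, which mirrors the fact that replicas can briefly lag behind the main DB.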
How many read replicas can you have on AWS RDS?
15
What is RDS Multi-AZ?
Replication to a standby in another AZ, which takes over (failover) if the main DB has issues. Reads and writes both go to the main DB only
How many Multi-AZ replicas can you have?
1
What are RDS Multi-Region read replicas?
Read replicas in other regions. They allow for disaster recovery and faster reads for global users. Writes still go to the main DB.
What is the extra cost associated with RDS Multi-Region replicas?
There is a replication cost since it is cross region
What is Amazon ElastiCache?
A service for running managed Redis or Memcached databases.
What are caches?
In-memory databases with high performance and low latency. They help take load off databases for read-intensive workloads
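One common way a cache takes read load off a database is the cache-aside pattern: check the cache first and only query the database on a miss. The card does not name this pattern, so treat the sketch below as an assumption about typical usage; `CacheAside`, `load_from_db`, and the TTL value are hypothetical names and settings, and a plain dictionary stands in for Redis or Memcached.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Minimal in-process cache-aside wrapper around a slow lookup."""

    def __init__(self, load_from_db: Callable[[str], str],
                 ttl_seconds: float = 60.0) -> None:
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        # key -> (expiry time, cached value)
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.db_reads = 0  # counts how often the database was actually hit

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is not touched
        # Cache miss (or expired entry): read from the database, then cache.
        value = self._load_from_db(key)
        self.db_reads += 1
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        # Call this after a write so the next read sees fresh data.
        self._entries.pop(key, None)
```

Repeated reads of the same key within the TTL hit the database only once, which is the load reduction the card describes.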
Which service should you use if you need high-performance, low-latency, in-memory databases?
Amazon ElastiCache
Which is faster, RDS or ElastiCache?
ElastiCache is faster
What is DynamoDB?
A fully managed, highly available NoSQL database with replication across 3 AZs. It is a distributed ‘serverless’ database that scales to massive workloads: millions of requests per second, trillions of rows, 10s of TB of storage
How fast is DynamoDB?
Fast and consistent performance. Single-digit millisecond latency - low latency retrieval
What type of data goes in DynamoDB?
Key/value
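Key/value access means every item is stored and fetched by its key, and items do not need to share the same attributes. The sketch below illustrates the idea with a plain dictionary. It is not the DynamoDB API: `KeyValueTable` is a hypothetical class, and its method names only loosely echo DynamoDB's PutItem and GetItem operations.

```python
from typing import Any, Dict, Optional


class KeyValueTable:
    """Toy key/value table: items are looked up by a single key attribute."""

    def __init__(self, key_name: str) -> None:
        self.key_name = key_name
        self._items: Dict[Any, Dict[str, Any]] = {}

    def put_item(self, item: Dict[str, Any]) -> None:
        if self.key_name not in item:
            raise KeyError(f"item is missing the key attribute {self.key_name!r}")
        # Storing under an existing key replaces the previous item.
        self._items[item[self.key_name]] = dict(item)

    def get_item(self, key: Any) -> Optional[Dict[str, Any]]:
        item = self._items.get(key)
        return dict(item) if item is not None else None


users = KeyValueTable(key_name="user_id")
# Items only have to share the key attribute; other attributes may differ.
users.put_item({"user_id": "u1", "name": "Ada"})
users.put_item({"user_id": "u2", "name": "Linus", "plan": "pro"})
```

Note that there is no query language involved: a lookup is a direct fetch by key, as in `users.get_item("u2")`.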
What is DynamoDB Accelerator (DAX)?
Fully managed in-memory cache for DynamoDB.
How fast is DynamoDB Accelerator (DAX)?
A 10x performance improvement: latency goes from single-digit milliseconds to microseconds when accessing your DynamoDB tables
What is the difference between ElastiCache and DAX at CCP level?
DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
What is Redshift?
A data warehouse service. Redshift is based on PostgreSQL, but it is used for OLAP, not OLTP
Columnar storage of data (instead of row-based)
Massively Parallel Query Execution (MPP) runs computations very quickly. The service is also highly available
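To see why columnar storage suits analytics, compare the two layouts on a small table. This is a simplified in-memory illustration, not Redshift's actual storage format, and the sample orders and the `to_columns` helper are made up for the example.

```python
from typing import Any, Dict, List

# Row-based layout: each record keeps all of its fields together.
rows: List[Dict[str, Any]] = [
    {"order_id": 1, "region": "eu", "amount": 20.0},
    {"order_id": 2, "region": "us", "amount": 35.5},
    {"order_id": 3, "region": "eu", "amount": 12.5},
]


def to_columns(records: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row records into one list of values per column."""
    columns: Dict[str, List[Any]] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row-based aggregate: every whole record is visited to reach one field.
total_from_rows = sum(record["amount"] for record in rows)

# Columnar aggregate: only the "amount" column is read.
total_from_columns = sum(columns["amount"])
```

Both totals are the same, but the columnar version never touches `order_id` or `region`, which is the advantage for analytical queries that aggregate a few columns over many rows.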
What is OLTP?
Online Transaction Processing
What is OLAP?
Online Analytical Processing (analytics and data warehousing)
How often does Redshift load data?
Typically about once every hour, in batches, rather than every second
What type of performance does Redshift have over other data warehouses?
A claimed 10x better performance
What scale does Redshift scale to?
Petabytes (PBs) of data
How do you pay for Redshift?
Pay as you go
Does Redshift have a SQL interface for performing the queries?
Yes
Does Redshift integrate with BI tools such as Amazon QuickSight and Tableau?
Yes
What is Amazon EMR?
Helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
What does EMR stand for?
Elastic MapReduce
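MapReduce is the programming model associated with Hadoop: a map step emits key/value pairs, the pairs are grouped by key, and a reduce step combines each group. The word-count sketch below shows the idea in a single process. On a real cluster these steps would run in parallel across many machines, and the function names here are illustrative rather than part of any Hadoop or EMR API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Combine each group of values into a single total per word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))
```

Because each document is mapped independently and each word is reduced independently, the work can be split across a cluster, which is what makes the model suitable for vast amounts of data.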