Databases Flashcards
What are the benefits of databases?
- structure the data
- build indexes to efficiently query/search through the data
- define relationships between your datasets
- flexible, scalable, high-performance, highly functional
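To make the indexing point above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table and the `idx_users_email` index are hypothetical names chosen for illustration. The sketch asks SQLite for its query plan to check that the lookup goes through the index rather than a full table scan.

```python
import sqlite3

# In-memory SQLite database; table and index names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

# Build an index so lookups by email do not have to scan every row.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# Ask SQLite how it plans to run the query; the last column of each
# plan row is a human-readable description of that step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM users WHERE email = ?",
    ("user42@example.com",),
).fetchall()
plan_detail = " ".join(row[-1] for row in plan)

# Run the actual lookup.
name = conn.execute(
    "SELECT name FROM users WHERE email = ?", ("user42@example.com",)
).fetchone()[0]

print(plan_detail)
print(name)
```

The exact wording of the plan text varies between SQLite versions, but it should name the index when the index is used.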
Name a few examples of databases
key-value, document, graph, in-memory, search database
What do the AWS database-services provide?
- quick provisioning, high availability, vertical and horizontal scaling
- automated backup & restore, Operations, Upgrades
- Operating System Patching is handled by AWS
- Monitoring, alerting
Can you run a database on EC2 without using the database-services?
Yes, but you must handle resiliency, backups, patching, high availability, fault tolerance, scaling, etc. yourself.
Name the six engines of Amazon RDS
- Aurora
- PostgreSQL
- MySQL
- MariaDB
- Oracle Database
- SQL Server
Name three benefits of RDS
- automated backups
- database snapshots
- automatic host replacement
What does RDS provide?
- automated provisioning, OS patching
- Continuous backups and restore to a specific timestamp (Point in Time Restore)
- Monitoring dashboards
- Read replicas for improved read performance
- Multi AZ setup for DR (Disaster Recovery)
- Maintenance window for upgrades
- Scaling capability (vertical and horizontal)
- Storage backed by EBS (gp2 or io1)
Can you SSH into your RDS database instance?
No
What do you know about Amazon Aurora?
- PostgreSQL and MySQL are supported
- up to 5x performance improvement over MySQL on RDS
- costs about 20% more than RDS but is more efficient
What do you know about Amazon DynamoDB?
- NoSQL database (key-value store)
- millions of requests per second
- low latency
- integrated with IAM for security, authorization and administration
- low cost and auto scaling capabilities
What do you know about DynamoDB Accelerator - DAX?
- fully managed in-memory cache for DynamoDB
- up to 10x performance improvement: from single-digit millisecond latency to microsecond latency
- secure, highly scalable & highly available
- difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
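The idea behind an in-memory cache in front of a table can be sketched in plain Python. This is a conceptual read-through cache only, not the DAX client API. `SlowTable`, `ReadThroughCache`, the `ttl_seconds` parameter and the `user#1` key are all hypothetical names invented for this illustration.

```python
import time


class SlowTable:
    """Stand-in for a database table; counts how often it is actually read."""

    def __init__(self, items):
        self._items = dict(items)
        self.reads = 0

    def get_item(self, key):
        self.reads += 1
        return self._items.get(key)


class ReadThroughCache:
    """Serves repeated reads from memory and falls back to the table on a miss."""

    def __init__(self, table, ttl_seconds=300.0):
        self._table = table
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (value, expiry time)

    def get_item(self, key):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: the table is not touched
        value = self._table.get_item(key)  # cache miss: read through
        self._entries[key] = (value, now + self._ttl)
        return value


table = SlowTable({"user#1": {"name": "Ada"}})
cache = ReadThroughCache(table)

first = cache.get_item("user#1")   # miss, goes to the table
second = cache.get_item("user#1")  # hit, served from memory

print(first, second, table.reads)
```

Only the first read reaches the table. A managed cache applies the same principle, so that repeated reads avoid the slower backing store.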
What do you know about Redshift?
- OLAP - online analytical processing (analytics and data warehousing)
- load data once every hour, not every second
- 10x better performance than other data warehouses, scales to PBs of data
- columnar storage of data (instead of row based)
- massively parallel query execution (MPP), highly available
- pay as you go based on the instances provisioned
- has a sql interface for performing the queries
- BI tools such as Amazon QuickSight or Tableau integrate with it
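The row-based versus columnar distinction above can be illustrated with a small sketch in plain Python. It does not reflect how Redshift stores data internally. The order records and field names are made up for the example.

```python
# Row-based layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Columnar layout: all values of one column are stored together.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Row-based aggregation walks every full record to reach one field.
total_from_rows = sum(row["amount"] for row in rows)

# Columnar aggregation reads only the "amount" column.
total_from_columns = sum(columns["amount"])

# An analytical query touching two columns still skips "order_id" entirely.
eu_total = sum(
    amount
    for region, amount in zip(columns["region"], columns["amount"])
    if region == "EU"
)

print(total_from_rows, total_from_columns, eu_total)
```

Both layouts give the same totals. The columnar one simply reads less data for analytical queries that aggregate a few columns over many rows.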
What do you know about Amazon Elastic MapReduce (EMR)?
- helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
- the clusters can be made of hundreds of EC2 instances
- takes care of all the provisioning and configuration
- use cases: data processing, machine learning, web indexing, big data…
What do you know about Athena?
- fully serverless query service with SQL capabilities
- used to query data in S3
- pay per query
- output results back to S3
- secured through IAM
- use cases: one-time SQL queries, serverless queries on S3, log analytics
What do you know about Amazon QuickSight?
- serverless machine learning-powered business intelligence service to create interactive dashboards
- fast, automatically scalable, embeddable, with per-session pricing
- integrated with RDS, Aurora, Athena, Redshift, S3