Relational Database Service (RDS) Flashcards
What are the six different RDS engines?
- Microsoft SQL Server
- Oracle
- MySQL
- PostgreSQL
- MariaDB
- Amazon Aurora
What are the advantages of RDS?
- You can provision and have a database up and running in minutes
- Multi-AZ
- Automatic failover capability
- Automated backups
When would you use an RDS database?
Generally used for online transaction processing (OLTP) workloads, not analyzing large amounts of data (OLAP)
What is the difference between online transaction processing (OLTP) and online analytical processing (OLAP)?
OLTP processes data from transactions in real time (e.g. customer orders, banking transactions, payments, booking systems, etc.). All about data processing and completing a large number of small transactions in real time.
OLAP processes complex queries to analyze historical data (e.g. analyzing net profit figures from the last 3 years and sales forecasting). All about data analysis using large amounts of data, as well as complex queries that take a long time to complete.
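The contrast can be sketched in a few lines of Python. This is only an illustration: the in-memory SQLite database stands in for any relational engine, and the `orders` table and both function names are made up for the example.

```python
import sqlite3

# In-memory database standing in for any relational engine (illustrative only).
shop = sqlite3.connect(":memory:")
shop.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, year INTEGER, amount REAL)"
)

def place_order(customer, year, amount):
    """OLTP-style work: one small write, committed immediately."""
    with shop:  # commits on success, rolls back on error
        cur = shop.execute(
            "INSERT INTO orders (customer, year, amount) VALUES (?, ?, ?)",
            (customer, year, amount),
        )
    return cur.lastrowid

def revenue_by_year():
    """OLAP-style work: one query that scans and aggregates the whole history."""
    rows = shop.execute(
        "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
    ).fetchall()
    return dict(rows)

place_order("alice", 2022, 120.0)
place_order("bob", 2022, 80.0)
place_order("alice", 2023, 50.0)
print(revenue_by_year())  # {2022: 200.0, 2023: 50.0}
```

With three rows the two look equally cheap; the difference shows up at scale, where the aggregate has to read every row while each order still touches only one.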
If you are given a scenario which asks which service would be recommended for an OLTP application, what would you recommend?
RDS
If you are given a scenario which asks which service would be recommended for an OLAP application, what would you recommend?
Redshift
Explain how RDS handles multi-AZ.
RDS creates an exact copy of your database in another AZ (and continuously replicates the data as you write to the production database).
Which RDS types can be configured as Multi-AZ?
- Microsoft SQL Server
- Oracle
- MySQL
- PostgreSQL
- MariaDB
Which RDS type is always configured as Multi-AZ?
Aurora
How does RDS handle unplanned failure or maintenance in a Multi-AZ configuration?
Applications connect to the database via a connection string (the DNS address of the database, plus a username and password). Amazon manages the DNS for that address. If the primary database has an unplanned outage, Amazon detects the failure, promotes the standby database in another AZ to primary and points the address at it (automated DNS failover handled by AWS).
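A toy model of that behaviour, assuming nothing about how AWS implements it: the class, the hostname and the AZ names are all hypothetical, and swapping the roles after failover is a simplification. The point it shows is that the endpoint name stays the same while the instance behind it changes.

```python
class MultiAZDatabase:
    """Toy model of a Multi-AZ deployment: one DNS name, two instances."""

    def __init__(self, endpoint, primary_az, standby_az):
        self.endpoint = endpoint      # the name applications connect to
        self.primary_az = primary_az
        self.standby_az = standby_az

    def resolve(self):
        """Return the AZ the endpoint currently points at."""
        return self.primary_az

    def fail_over(self):
        """Promote the standby; the endpoint name itself never changes."""
        self.primary_az, self.standby_az = self.standby_az, self.primary_az

db = MultiAZDatabase("mydb.example.internal", "az-a", "az-b")
before = db.resolve()
db.fail_over()
after = db.resolve()
print(db.endpoint, before, "->", after)  # mydb.example.internal az-a -> az-b
```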
What is the main purpose for enabling Multi-AZ in RDS?
For disaster recovery, not performance (you cannot connect to the secondary database when the primary is active).
What is a read replica in RDS?
A read-only copy of your primary database.
Why would you use a read replica in RDS?
When you have read-heavy workloads and want to take load off your primary database to boost performance. It is not used for disaster recovery.
Where can read replicas in RDS be stored?
In the same AZ, in a different AZ or even in a different region.
How do you address a read replica in RDS (as opposed to the primary database)?
A read replica has its own unique DNS endpoint, separate from the primary database.
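Because each replica has its own endpoint, the application (or a proxy in front of it) decides where each statement goes. A minimal sketch of that routing, assuming hypothetical hostnames and a deliberately naive "starts with SELECT" test:

```python
import itertools

class EndpointRouter:
    """Send writes to the primary endpoint and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas exist.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary

router = EndpointRouter(
    primary="primary.example.internal",  # hypothetical hostnames
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))           # replica-1.example.internal
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))  # primary.example.internal
```

Round-robin is just one possible policy; the sketch ignores replication lag, so a read issued right after a write may not see it.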
Can you promote a read replica in RDS into its own independent database?
Yes, but this breaks the replication.
In what scenario would you promote a read replica in RDS to its own database?
When you are doing online analytical processing (OLAP) and are about to run a massive query against your database.
What feature in RDS must be enabled in order to deploy a read replica?
Automated backups
How many read replicas in RDS are supported per database?
Up to 5
What is Amazon Aurora?
It is a MySQL and PostgreSQL compatible relational database engine that combines the speed and availability of commercial databases with the cost-effectiveness of open-source databases.
How much better performance can you expect from Amazon Aurora than MySQL or PostgreSQL?
Up to 5x better performance than MySQL and up to 3x better performance than PostgreSQL
What is the minimum storage size of Amazon Aurora and the maximum?
It starts with 10 GB and scales in 10 GB increments up to 128 TB (with storage auto-scaling).
How is the data replicated in Amazon Aurora?
2 copies of your data are contained in each AZ, with a minimum of 3 AZs (at least 6 copies of data)
What does it mean that Aurora storage is self-healing?
Data blocks and disks are continuously scanned for errors and repaired automatically.
How many copies of data can be lost without affecting database write availability in Amazon Aurora?
2 copies of data can be lost out of at least 6
How many copies of data can be lost without affecting database read availability in Amazon Aurora?
3 copies of data can be lost out of at least 6
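The two numbers follow from quorum arithmetic. The sketch below assumes the 4-of-6 write quorum and 3-of-6 read quorum that AWS describes for Aurora storage; the constant and function names are made up for the example.

```python
COPIES = 6        # 2 copies in each of 3 AZs
WRITE_QUORUM = 4  # assumption: 4 of 6 copies must acknowledge a write
READ_QUORUM = 3   # assumption: 3 of 6 copies must be reachable for a read

def can_write(lost):
    return COPIES - lost >= WRITE_QUORUM

def can_read(lost):
    return COPIES - lost >= READ_QUORUM

max_lost_for_writes = max(n for n in range(COPIES + 1) if can_write(n))
max_lost_for_reads = max(n for n in range(COPIES + 1) if can_read(n))
print(max_lost_for_writes, max_lost_for_reads)  # 2 3
```

Losing 2 copies is exactly what happens when one whole AZ goes down, so writes survive the loss of an AZ.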
What are the three types of Amazon Aurora read replicas available?
- Aurora replicas (up to 15 read replicas)
- MySQL replicas (up to 5 read replicas)
- PostgreSQL replicas (up to 5 read replicas)
What are the benefits of Aurora read replicas over MySQL and PostgreSQL read replicas?
- You can have up to 15 read replicas vs only 5
- They replicate faster
- They have a low performance impact on the primary database
- They can act as a failover target with no data loss
Does Amazon Aurora support automated backups and snapshots?
Yes, automated backups are always enabled and do not impact performance. Snapshots are also available and do not impact performance and may be shared across accounts.
What is Amazon Aurora Serverless?
An on-demand, auto-scaling configuration for the MySQL-compatible and PostgreSQL-compatible editions of Amazon Aurora. It automatically starts up, shuts down, and scales capacity up or down based on your application’s needs.
If you are given a scenario where you need the performance of Aurora, but you’re going to have spiky workloads, what RDS service would you recommend?
Amazon Aurora Serverless
What are the use cases for Aurora Serverless?
Relatively simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.
What is DynamoDB?
- Fast, flexible NoSQL database for all applications that need consistent, single-digit millisecond latency at any scale.
- Fully managed
- Supports both document and key-value data models
Name some scenarios where DynamoDB would be a good fit.
Mobile, web, gaming, ad-tech, IoT, etc.
What type of storage is used for DynamoDB?
SSD
What type of read consistency is supported by DynamoDB by default?
Eventually consistent reads (gives the best read performance), but you can opt in for strongly consistent reads, if needed.
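The difference can be pictured with a toy model. Nothing here reflects DynamoDB internals: the class and its methods are hypothetical, and the `consistent_read` flag only mirrors the idea of the `ConsistentRead` option on a real read request.

```python
class ReplicatedItem:
    """Toy model: one leader copy plus followers that catch up later."""

    def __init__(self, value, followers=2):
        self.leader = value
        self.followers = [value] * followers

    def write(self, value):
        self.leader = value  # acknowledged before followers catch up

    def replicate(self):
        self.followers = [self.leader] * len(self.followers)

    def read(self, consistent_read=False):
        # Strongly consistent reads go to the leader; eventually consistent
        # reads may be served by a follower that has not caught up yet.
        return self.leader if consistent_read else self.followers[0]

item = ReplicatedItem("v1")
item.write("v2")
stale = item.read()
fresh = item.read(consistent_read=True)
item.replicate()
caught_up = item.read()
print(stale, fresh, caught_up)  # v1 v2 v2
```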
How is availability handled in DynamoDB?
It is spread across three geographically distinct data centers.
What is DynamoDB Accelerator (DAX)?
Fully managed, highly available, in-memory cache with up to 10x performance improvement and no need for developers to manage caching logic.
How does DynamoDB Accelerator work?
Your application connects directly to DAX and if DAX determines there isn’t a cache hit, it queries DynamoDB for you.
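This is the read-through cache pattern. A minimal sketch, assuming a plain dict as a stand-in for the DynamoDB table; the class name and counter are hypothetical and this is not the DAX client API.

```python
class ReadThroughCache:
    """The caller only talks to the cache; misses are fetched on its behalf."""

    def __init__(self, backing_store):
        self.backing_store = backing_store  # stands in for the DynamoDB table
        self.cache = {}
        self.backend_reads = 0

    def get(self, key):
        if key in self.cache:               # cache hit
            return self.cache[key]
        self.backend_reads += 1             # cache miss: query the table
        value = self.backing_store.get(key)
        self.cache[key] = value             # remember it for next time
        return value

table = {"user#1": {"name": "Alice"}}
dax = ReadThroughCache(table)
dax.get("user#1")  # miss, fetched from the table
dax.get("user#1")  # hit, served from memory
print(dax.backend_reads)  # 1
```

The sketch has no expiry or invalidation, which a real cache needs so that stale entries eventually disappear.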
What is the pricing scheme for DynamoDB on-demand capacity mode?
Pay-per-request pricing, with no minimum capacity. (DynamoDB Accelerator itself is billed per node-hour.)
When would you consider using DynamoDB on-demand capacity mode?
For unknown or unpredictable traffic, for example new product line launches.
What security features are available with DynamoDB?
- Encryption at rest using KMS
- Site-to-site VPN
- Direct Connect (DX)
- IAM policies and roles
- Fine-grained access control
- Integrates with CloudWatch and CloudTrail
- Integrates with VPC endpoints
What does ACID mean in terms of database transactions?
All or nothing
- Atomic (all changes to the data must be performed successfully or not at all)
- Consistent (data must be in a consistent state both before and after the transaction)
- Isolated (no other process can change the data while the transaction is running)
- Durable (changes from a transaction must persist)
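Atomicity is the easiest of the four to see in code. The sketch below uses an in-memory SQLite database purely as a stand-in; the `accounts` table and the `transfer` helper are made up for the example. The credit is applied first so that the failing debit has something to undo.

```python
import sqlite3

bank = sqlite3.connect(":memory:")
bank.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
with bank:
    bank.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])

def transfer(src, dst, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    try:
        with bank:  # commits on success, rolls back if anything raises
            bank.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
            bank.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft

def balances():
    return dict(bank.execute("SELECT name, balance FROM accounts"))

print(transfer("alice", "bob", 150), balances())  # False {'alice': 100, 'bob': 0}
print(transfer("alice", "bob", 30), balances())   # True {'alice': 70, 'bob': 30}
```

In the failed transfer bob's credit had already been applied when the debit was rejected, yet his balance is still 0 afterwards: the whole transaction was rolled back.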
Can you use ACID methodology with DynamoDB?
Yes, by using DynamoDB transactions. Use them when building applications that require coordinated inserts, deletes or updates to multiple items as part of a single logical business operation (across one or more tables within a single AWS account and region).
If you are given a scenario where they are talking about having atomicity, consistency, isolation and durability and you want to do this with DynamoDB, what feature should be enabled?
DynamoDB transactions
List some use cases for DynamoDB transactions.
- Processing financial transactions
- Fulfilling and managing orders
- Building multiplayer game engines
- Coordinating actions across distributed components and services
What are the three options for reads in DynamoDB?
- Eventual consistency
- Strong consistency
- Transactional
What are the two options for writes in DynamoDB?
- Standard
- Transactional
What is the maximum number of items that can be updated in one transaction in DynamoDB?
100 items (or 4 MB of data). Older material cites the previous limit of 25 items.
What is DynamoDB On-Demand Backup and Restore?
It does full backups at any time with zero impact on table performance and availability, in the same region as the source table
How long are DynamoDB On-Demand Backups stored?
They are retained until they are deleted.
What is Point-in-Time Recovery in DynamoDB?
It protects against accidental writes or deletes and can restore a table to any point in the last 35 days. It is done using incremental backups.
Is Point-in-Time Recovery in DynamoDB enabled by default?
No
What is the latest restorable time using Point-in-Time Recovery in DynamoDB?
5 minutes in the past
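Taken together, the 35-day retention and the 5-minute lag define a restore window. A small sketch of that arithmetic, with hypothetical constant and function names:

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=35)     # earliest restorable point, per the card above
LATEST_LAG = timedelta(minutes=5)  # latest restorable point lags "now" by 5 minutes

def restore_window(now):
    """Return (earliest, latest) restorable timestamps for a given 'now'."""
    return now - RETENTION, now - LATEST_LAG

def is_restorable(target, now):
    earliest, latest = restore_window(now)
    return earliest <= target <= latest

now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
print(is_restorable(now - timedelta(days=10), now))    # True
print(is_restorable(now - timedelta(minutes=1), now))  # False, too recent
print(is_restorable(now - timedelta(days=40), now))    # False, too old
```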
What are DynamoDB Streams?
Time ordered sequence of item-level changes (inserts/updates/deletes) in a table
How long does data get stored in DynamoDB Streams?
24 hours
How can you achieve functionality like stored procedures in DynamoDB?
Using DynamoDB Streams combined with Lambda functions (may not be an exam topic)
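A rough idea of what such a function looks like. The handler below is a hypothetical sketch: it assumes the usual stream event shape (a `Records` list whose entries carry `eventName` and a `dynamodb` section with `Keys`), and the sample event is hand-written for the example rather than captured from a real table.

```python
def handler(event, context):
    """Hypothetical Lambda entry point invoked with a batch of stream records."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    audit_log = []
    for record in event.get("Records", []):
        name = record["eventName"]         # INSERT, MODIFY or REMOVE
        keys = record["dynamodb"]["Keys"]  # primary key of the changed item
        counts[name] += 1
        audit_log.append((name, keys))     # stand-in for the "procedure" logic
    return {"counts": counts, "audit_log": audit_log}

sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"id": {"S": "1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"id": {"S": "1"}}}},
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"id": {"S": "2"}}}},
    ]
}
result = handler(sample_event, None)
print(result["counts"])  # {'INSERT': 1, 'MODIFY': 1, 'REMOVE': 1}
```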
What are global tables in DynamoDB?
They are managed multi-master, multi-region replication (for disaster recovery or high availability), based on DynamoDB Streams.
List a use case for global tables in DynamoDB.
When you have a globally distributed application
What feature in DynamoDB do you need enabled to enable global tables?
DynamoDB Streams
What is the replication latency for global tables in DynamoDB?
Typically under 1 second
What is MongoDB?
A document database that allows for scalability and flexibility with your data as well as robust querying and indexing features.
What is Amazon DocumentDB?
A managed, MongoDB-compatible document database service that allows you to run MongoDB workloads in the AWS cloud. It scales with your workload and safely stores your information.
If you are given a scenario where you have an existing MongoDB database and you don’t want to have to refactor to move to AWS cloud, what service would you recommend?
Amazon DocumentDB
How can you move an existing MongoDB database on-premises to AWS cloud?
Use AWS Database Migration Service (DMS) to automate your database migration to Amazon DocumentDB
What is Cassandra?
A distributed database (i.e. it runs on many machines) that uses NoSQL. It’s primarily used for big data solutions. Enterprises, such as Netflix, use Cassandra on their backend.
What is Amazon Keyspaces?
A fully managed serverless service that allows you to run Cassandra in the AWS cloud.
If you are given a scenario where you have an existing Cassandra database and you don’t want to have to refactor to move to AWS cloud, what service would you recommend?
Amazon Keyspaces
What is a Graph database?
Data is stored just like you might sketch ideas on a sketch board. It stores nodes and relationships instead of tables or documents.
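A minimal sketch of the idea, using made-up people and a `FOLLOWS` relationship. It is plain Python rather than a graph query language, but it shows the typical question a graph database answers: who is reachable within a few hops?

```python
from collections import deque

# Each relationship is stored directly as (source node, type, target node).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
]

def neighbors(node):
    return [dst for src, _rel, dst in edges if src == node]

def within_hops(start, max_hops):
    """Breadth-first traversal: nodes reachable in at most max_hops steps."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, depth + 1))
    return found

print(within_hops("alice", 2))  # ['bob', 'carol']
```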
What is Amazon Neptune?
A fully managed graph database service
Name some use cases for using Amazon Neptune.
- Build connections between identities (e.g. social graphs and accelerate updates for ad targeting, personalization and analytics)
- Build knowledge graph applications (e.g. add topical data to product catalogs, and help users quickly navigate highly connected datasets)
- Detect fraud patterns (in financial and purchase transactions)
- Security graphs to improve IT security (proactively detect and investigate IT infrastructure using the layered security approach; visualize all infrastructure to plan, predict and mitigate risk)
If you are given a scenario where you have a need for a graph database, what service would you recommend?
Amazon Neptune
What is a ledger database?
It is a NoSQL database that is immutable, transparent, and has a cryptographically verifiable transaction log that is owned by one authority. You cannot update a record; instead, an update adds a new record to the database.
What are the most common use cases for ledger databases?
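"Cryptographically verifiable" can be pictured as a hash chain: each entry stores a hash that covers its own content and the previous entry's hash, so changing any old record breaks every hash after it. The sketch below is a toy illustration with a hypothetical `Ledger` class, not a description of how any particular ledger service is built.

```python
import hashlib
import json

class Ledger:
    """Append-only log in which every entry is chained to the one before it."""

    GENESIS = "0" * 64  # placeholder hash for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(prev_hash, record):
        payload = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    def append(self, record):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        self.entries.append(
            {"record": record, "prev": prev_hash, "hash": self._digest(prev_hash, record)}
        )

    def verify(self):
        """Recompute the chain; any edited record makes this return False."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash or entry["hash"] != self._digest(prev_hash, entry["record"]):
                return False
            prev_hash = entry["hash"]
        return True

ledger = Ledger()
ledger.append({"account": "alice", "balance": 100})
ledger.append({"account": "alice", "balance": 70})  # an "update" is a new record
print(ledger.verify())  # True
```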
- It is used for cryptocurrencies such as Bitcoin, Ethereum, etc. (for transactions on the blockchain)
- Shipping companies use it to track items, boxes, shipping containers, deliveries, etc.
- Pharmaceutical companies use it to track creation and distribution of drugs and ensure no counterfeits are produced
What is Amazon Quantum Ledger Database (QLDB)?
It is a fully managed ledger database that provides a transparent, immutable and cryptographically verifiable transaction log.
What are common use cases for Amazon Quantum Ledger Database (QLDB)?
- Store financial transactions
- Reconcile supply chain systems
- Maintain a claims history
- Centralize digital records
If you are given a scenario where you have an immutable database, what service would you recommend?
Amazon Quantum Ledger Database (QLDB)
What is Time-Series Data?
Data points that are logged over a series of time, allowing you to track your data. Examples could include temperature readings from weather stations around the world.
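A typical operation on such data is rolling raw readings up into fixed time buckets. A small sketch with made-up temperature readings, where timestamps are seconds since an arbitrary start:

```python
from collections import defaultdict
from statistics import mean

# (timestamp in seconds, temperature) pairs, invented for the example
readings = [(0, 20.0), (30, 22.0), (60, 21.0), (90, 23.0), (150, 19.0)]

def bucket_average(points, bucket_seconds):
    """Average the values that fall into each fixed-width time bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)  # bucket start time
    return {start: mean(vals) for start, vals in sorted(buckets.items())}

print(bucket_average(readings, 60))  # {0: 21.0, 60: 22.0, 120: 19.0}
```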
What are some examples of Time-Series Data?
- IoT sensors relay thousands, millions and billions of points of information depending on the setup. One use case is for agriculture.
- Analytics (large websites such as Netflix need to analyze incoming and outgoing web traffic)
- DevOps applications (applications that change in response to users' needs may need to be monitored continuously so they can scale correctly)
What is Amazon Timestream?
A serverless, fully managed database service for time-series data.
If you are given a scenario where you need to store a large amount of time-series data for analysis, what service would you recommend?
Amazon Timestream
If you are given a scenario where it is talking about scaling issues with your database or bad read performance, what service would you recommend?
Read Replicas
If you are given a scenario where it is talking about disaster recovery with your database, what service would you recommend?
Multi-AZ
If you are given a scenario where it talks about needing a serverless relational database, what service would you recommend?
Aurora Serverless