6. Databases Flashcards

1
Q

What is RDS and its 6 engines?

A

Relational Database Service. Its engines are SQL Server, Oracle, MySQL, PostgreSQL, MariaDB and Amazon Aurora.

2
Q

What is and is not RDS good for?

A

Good for Online Transaction Processing (OLTP), e.g. payments
Not good for Online Analytical Processing (OLAP) -> use Redshift instead

3
Q
  1. What would you use for RDS disaster recovery?
  2. Does it apply to all engines?
  3. How does it work?
A
  1. Multi-AZ. It is a deployment option chosen when creating (or later modifying) the DB instance.
  2. All except Amazon Aurora, which does not need it because its storage is already replicated across AZs by design.
  3. The primary is synchronously replicated to a standby in another AZ. The DB connection string is a DNS endpoint rather than an IP address, so when AWS detects that the primary is down it automatically repoints the endpoint to the standby.
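The DNS-based failover can be pictured with a toy simulation. This is not AWS code: the `Endpoint` class, its `fail_over` method and the host names are hypothetical, and in real RDS the switch is performed for you.

```python
class Endpoint:
    """Toy stand-in for an RDS endpoint: one DNS name, two possible targets."""

    def __init__(self, dns_name, primary, standby):
        self.dns_name = dns_name
        self.primary = primary   # instance currently serving traffic
        self.standby = standby   # replicated copy in another AZ

    def resolve(self):
        # Clients only ever use dns_name; it resolves to the current primary.
        return self.primary

    def fail_over(self):
        # RDS triggers this itself when the primary becomes unavailable.
        self.primary, self.standby = self.standby, self.primary


endpoint = Endpoint("mydb.example.internal", "db-az-a", "db-az-b")
before = endpoint.resolve()
endpoint.fail_over()
after = endpoint.resolve()
print(before, "->", after)
```

The application keeps the same connection string throughout; only the target behind the name changes.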
4
Q
  1. What would you use for RDS High Availability?
  2. How does it work?
  3. What do you also have to do to use it?
A
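  1. Amazon RDS Read Replicas (asynchronous copies used mainly to scale reads; a replica can also be promoted to a standalone primary)
  2. Up to 5 replicas can be created from the AWS console, in different AZs or even different Regions
  3. Enable automatic backups on the source instance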
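As a rough sketch, the backup prerequisite can be checked before asking for a replica. The helper name `read_replica_request` and the instance identifiers are made up; the returned keys are the parameter names of boto3's `create_db_instance_read_replica` call.

```python
def read_replica_request(source_id, replica_id, backup_retention_days):
    """Build keyword arguments for boto3's rds.create_db_instance_read_replica.

    Raises ValueError when automated backups are off on the source
    (a retention period of 0 days), because the replica needs them.
    """
    if backup_retention_days < 1:
        raise ValueError("enable automated backups on the source first")
    return {
        "DBInstanceIdentifier": replica_id,
        "SourceDBInstanceIdentifier": source_id,
    }


params = read_replica_request("orders-db", "orders-db-replica-1", backup_retention_days=7)
print(params)
# With credentials configured, this could be passed on as:
#   boto3.client("rds").create_db_instance_read_replica(**params)
```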
5
Q
  1. What is Amazon Aurora?
  2. What are its important qualities?
A
  1. Amazon-proprietary relational database, compatible with MySQL and PostgreSQL
  2. Important qualities:
    • a mix of high performance, scalability and availability
    • 6 copies of data: 2 copies in each of at least 3 AZs
    • Storage starts at 10 GB and scales automatically in 10 GB increments up to 128 TB
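The storage scaling rule is simple arithmetic. The sketch below is a simplified model of it (the function name is hypothetical, and it ignores that the data is physically stored six times).

```python
import math

INCREMENT_GB = 10
MAX_GB = 128 * 1024  # 128 TB expressed in GB


def allocated_storage_gb(data_gb):
    """Smallest multiple of 10 GB that holds data_gb, starting at 10 GB."""
    if data_gb > MAX_GB:
        raise ValueError("exceeds the 128 TB limit")
    return max(INCREMENT_GB, math.ceil(data_gb / INCREMENT_GB) * INCREMENT_GB)


for size in (0, 10, 10.5, 95):
    print(size, "GB of data ->", allocated_storage_gb(size), "GB allocated")
```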
6
Q

What are Amazon Aurora’s Replica types?

A
  • Aurora Replica: up to 15 read replicas. ALSO: automated failover; self-healing storage that tolerates losing up to 2 copies without affecting writes and up to 3 without affecting reads
  • MySQL Replica: up to 5 read replicas, slower replication, cross-Region, manual failover. ALSO: supports user-defined replication delay and different data schemas between primary and secondary
  • PostgreSQL Replica: up to 5 read replicas
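The "2 write / 3 read" figures follow from quorum arithmetic over the six storage copies. The sketch assumes the 4-of-6 write quorum and 3-of-6 read quorum from Amazon's published Aurora design, which this card does not state; the function name is made up.

```python
COPIES = 6
WRITE_QUORUM = 4  # copies that must acknowledge a write
READ_QUORUM = 3   # copies needed to serve a read


def availability(lost_copies):
    """Return (can_write, can_read) after losing some of the six copies."""
    healthy = COPIES - lost_copies
    return healthy >= WRITE_QUORUM, healthy >= READ_QUORUM


for lost in range(5):
    can_write, can_read = availability(lost)
    print(f"lost={lost} write={can_write} read={can_read}")
```

Losing a whole AZ removes two copies, which still leaves the write quorum intact.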
7
Q

What about Amazon Aurora’s backups and snapshots?

A

Backups are automated, and always enabled; snapshots can be shared with other AWS accounts.

8
Q

What would you do if you needed Aurora, but you have intermittent, infrequent or unpredictable workloads?

A

Use Amazon Aurora Serverless - on-demand, auto-scaling configuration. DB cluster automatically starts up, shuts down and scales.

9
Q

What are the main qualities of DynamoDB?

A
  • NoSQL database; supports Document and Key-Value models
  • Stored on SSD, spread across 3 geographically distinct data centres
  • Can be eventually consistent (~1sec) OR strongly consistent
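The consistency choice is made per read. A minimal sketch, assuming a hypothetical `Scores` table keyed by `PlayerId`; `ConsistentRead` is the real parameter of the DynamoDB `GetItem` call.

```python
def get_item_request(table, key, strongly_consistent=False):
    """Keyword arguments for the boto3 DynamoDB client's get_item call."""
    return {
        "TableName": table,
        "Key": key,
        # False (the default) requests an eventually consistent read.
        "ConsistentRead": strongly_consistent,
    }


key = {"PlayerId": {"S": "p-42"}}
eventual = get_item_request("Scores", key)
strong = get_item_request("Scores", key, strongly_consistent=True)
print(strong)
# boto3.client("dynamodb").get_item(**strong)
```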
10
Q

When would you use
1. Aurora?
2. DynamoDB?
3. DAX?

A
  1. You need a relational database compatible with MySQL or PostgreSQL
  2. You need a document or key-value database for mobile, web, gaming or IoT workloads
  3. You need to improve the read performance of DynamoDB
13
Q
  1. What is DAX?
  2. What are its qualities?
A
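  1. DynamoDB Accelerator. Fully managed, highly available in-memory cache.
    Up to 10x read performance improvement; response times drop from milliseconds to microseconds
    The application only needs to connect to the DAX endpoint instead of DynamoDB
  2. Runs as a cluster that you provision. Pay-per-request pricing is a DynamoDB on-demand table option, BUT you pay more per request than with provisioned capacity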
14
Q
  1. How do you deliver ACID in DynamoDB?
  2. Why would you need it?
A
  1. Use DynamoDB Transactions
  2. Financial transactions; fulfilling orders; multiplayer game engines; coordinating distributed processes
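A financial transfer is the classic all-or-nothing case. The sketch below builds the `TransactItems` payload for the DynamoDB `TransactWriteItems` call; the `Accounts` table, its `AccountId` key and `Balance` attribute, and the helper name are all hypothetical.

```python
def transfer_request(table, from_id, to_id, amount):
    """Build a TransactItems list that moves `amount` between two accounts.

    Both updates succeed together or fail together; the condition
    cancels the whole transaction if the payer's balance is too low.
    """
    names = {"#bal": "Balance"}
    values = {":amt": {"N": str(amount)}}
    debit = {
        "Update": {
            "TableName": table,
            "Key": {"AccountId": {"S": from_id}},
            "UpdateExpression": "SET #bal = #bal - :amt",
            "ConditionExpression": "#bal >= :amt",
            "ExpressionAttributeNames": names,
            "ExpressionAttributeValues": values,
        }
    }
    credit = {
        "Update": {
            "TableName": table,
            "Key": {"AccountId": {"S": to_id}},
            "UpdateExpression": "SET #bal = #bal + :amt",
            "ExpressionAttributeNames": names,
            "ExpressionAttributeValues": values,
        }
    }
    return [debit, credit]


items = transfer_request("Accounts", "alice", "bob", 25)
print(len(items), "operations in one transaction")
# boto3.client("dynamodb").transact_write_items(TransactItems=items)
```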
15
Q

What are DynamoDB's read and write options, including transactional ones?

A

Read: Eventual Consistency, Strong Consistency and Transactional
Write: Standard and Transactional

16
Q

What are the concerns/trade-offs of DynamoDB Transactional Consistency?

A

A transaction is limited to 100 items or 4 MB of data
Twice the cost, because DynamoDB performs two underlying operations, “prepare” and “commit”, for every item
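The cost difference can be checked with capacity-unit arithmetic. The multipliers below are the ones AWS documents for provisioned capacity (reads counted in 4 KB blocks, writes in 1 KB blocks); the function names are made up for this sketch.

```python
import math

READ_MULTIPLIER = {"eventual": 0.5, "strong": 1, "transactional": 2}
WRITE_MULTIPLIER = {"standard": 1, "transactional": 2}


def read_units(item_kb, mode):
    """Read capacity units for one read of an item of item_kb kilobytes."""
    return math.ceil(item_kb / 4) * READ_MULTIPLIER[mode]


def write_units(item_kb, mode):
    """Write capacity units for one write of an item of item_kb kilobytes."""
    return math.ceil(item_kb) * WRITE_MULTIPLIER[mode]


for mode in READ_MULTIPLIER:
    print("read 6 KB,", mode, "->", read_units(6, mode), "RCU")
for mode in WRITE_MULTIPLIER:
    print("write 3 KB,", mode, "->", write_units(3, mode), "WCU")
```

In both directions the transactional mode costs exactly twice the strongly consistent or standard equivalent.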

17
Q

How do you secure DynamoDB?

A

Encryption at rest using KMS
Site-to-site VPN
Direct Connect
IAM Policies and Roles
Fine-grained access
Integrates with CloudWatch and CloudTrail

18
Q

How do you ensure the durability of DynamoDB data?

A
  • DynamoDB backups: on-demand (same Region), consistent within seconds and retained until deleted
  • Point-in-time recovery (PITR): protects against accidental writes and deletes; restore to any point in the last 35 days, up to about 5 minutes before the present (incremental backups); not enabled by default, so it must be turned on per table
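Both protections are requested per table. A small sketch of the request payloads, assuming a hypothetical `Orders` table; `create_backup` and `update_continuous_backups` are the corresponding boto3 DynamoDB client calls, while the helper names are made up.

```python
def on_demand_backup_request(table, backup_name):
    """Keyword arguments for the DynamoDB client's create_backup call."""
    return {"TableName": table, "BackupName": backup_name}


def enable_pitr_request(table):
    """Keyword arguments for the DynamoDB client's update_continuous_backups call."""
    return {
        "TableName": table,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


backup = on_demand_backup_request("Orders", "orders-before-migration")
pitr = enable_pitr_request("Orders")
print(backup)
print(pitr)
# client = boto3.client("dynamodb")
# client.create_backup(**backup)
# client.update_continuous_backups(**pitr)
```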
19
Q

How do we log DynamoDB data changes?

A

DynamoDB Streams:
- time-ordered (FIFO) sequence of item-level changes
- streams broken into shards
- retained for 24 hours
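Stream records are often consumed by a Lambda function. The sketch below walks a batch in the order received and extracts the change type and key; the handler name, the `OrderId` key and the sample event are illustrative, while `Records`, `eventName`, `dynamodb`, `Keys` and `NewImage` are fields of the real stream record format.

```python
def summarize_changes(event):
    """Turn a DynamoDB Streams batch into (event_name, order_id) pairs, in order."""
    changes = []
    for record in event["Records"]:
        keys = record["dynamodb"]["Keys"]
        changes.append((record["eventName"], keys["OrderId"]["S"]))
    return changes


sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"OrderId": {"S": "o-1"}},
                "NewImage": {"OrderId": {"S": "o-1"}, "Status": {"S": "NEW"}},
            },
        },
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"OrderId": {"S": "o-1"}},
                "NewImage": {"OrderId": {"S": "o-1"}, "Status": {"S": "PAID"}},
            },
        },
        {"eventName": "REMOVE", "dynamodb": {"Keys": {"OrderId": {"S": "o-1"}}}},
    ]
}

changes = summarize_changes(sample_event)
print(changes)
```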

20
Q
  1. How do we do multi-Region replication on DynamoDB?
  2. What do we need it for?
A
  1. Global Tables:
    - based on DynamoDB Streams
    - Can be enabled from the AWS console: Table -> Global Tables -> Create Replica
    - any Region
    - Replication latency under 1 sec
  2. If we have globally distributed applications; DR or HA
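The console step has an API equivalent. This sketch assumes the current global tables version (2019.11.21), where a replica is added through the `ReplicaUpdates` parameter of `UpdateTable`; the table name, Region and helper name are placeholders.

```python
def add_replica_request(table, region):
    """Keyword arguments for the DynamoDB client's update_table call."""
    return {
        "TableName": table,
        "ReplicaUpdates": [{"Create": {"RegionName": region}}],
    }


request = add_replica_request("Orders", "eu-west-1")
print(request)
# boto3.client("dynamodb", region_name="us-east-1").update_table(**request)
```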
21
Q
  1. How to run managed MongoDB cluster in AWS?
  2. How to run managed Apache Cassandra in AWS?
  3. How to migrate the above?
A
  1. Use Amazon DocumentDB
  2. Use Amazon Keyspaces
  3. Use AWS Database Migration Service
22
Q

How to implement a Graph database in AWS and why?

A

Use Amazon Neptune for:
- identity graphs: social graphs, targeting, personalization, analytics
- knowledge graph applications: add topical data to product catalogues
- detect fraud patterns
- security graphs: visual infrastructure to plan, predict and mitigate risk

23
Q
  1. What is time-series data?
  2. Why would you need it?
  3. How to store it on Amazon?
A
  1. Data points logged over a period of time.
  2. Need to store large amounts of data for analysis. Examples:
    - temperature sensors
    - web traffic analytics
    - DevOps application monitoring
  3. Amazon Timestream
    - trillions of events per day
    - up to 1,000x faster and 1/10th of the cost of relational databases
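A time-series data point is a measurement plus a timestamp and some identifying dimensions. The sketch shows one temperature reading in the record shape accepted by Timestream's `WriteRecords` call; the `sensor_id` dimension, the measure name and the helper are illustrative.

```python
import time


def temperature_record(sensor_id, celsius, timestamp_ms=None):
    """One record in the shape Timestream's write_records call accepts."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {
        "Dimensions": [{"Name": "sensor_id", "Value": sensor_id}],
        "MeasureName": "temperature",
        "MeasureValue": str(celsius),
        "MeasureValueType": "DOUBLE",
        "Time": str(timestamp_ms),
        "TimeUnit": "MILLISECONDS",
    }


record = temperature_record("greenhouse-1", 21.5, timestamp_ms=1700000000000)
print(record)
# boto3.client("timestream-write").write_records(
#     DatabaseName="sensors", TableName="readings", Records=[record])
```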
24
Q
  1. Why would you need Ledger Database?
  2. How would you implement it in AWS?
A

  1. To keep an immutable, cryptographically verifiable transaction log owned by ONE central authority (not distributed across parties), for example to:
    - store financial transactions
    - reconcile supply chain systems
    - maintain claims history
    - centralise digital records
  2. Amazon Quantum Ledger Database (QLDB)
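"Cryptographically verifiable" can be illustrated with a plain hash chain. This is a teaching sketch of the general idea, not how QLDB is implemented; all names are made up.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first block


def append(ledger, entry):
    """Append an entry, chaining it to the hash of the previous block."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    payload = json.dumps(entry, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    ledger.append({"entry": entry, "prev": prev_hash, "hash": digest})


def verify(ledger):
    """Recompute every hash; any edited entry breaks the chain."""
    prev_hash = GENESIS
    for block in ledger:
        payload = json.dumps(block["entry"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if block["prev"] != prev_hash or block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append(ledger, {"claim": 1, "amount": 100})
append(ledger, {"claim": 2, "amount": 250})
ok_before = verify(ledger)
ledger[0]["entry"]["amount"] = 999  # tamper with history
ok_after = verify(ledger)
print(ok_before, ok_after)  # prints: True False
```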