Databases and Analytics Flashcards

1
Q

What does RDS stand for?

A

Relational Database Service

2
Q

What is the primary language of RDS?

A

SQL

3
Q

What is the name of Amazon’s proprietary database service?

A

Aurora

4
Q

Give 3 reasons why you might use RDS instead of deploying on EC2 yourself?

A
  • AWS will manage provisioning and patching
  • It backs up continuously and can restore to a specific timestamp (point-in-time restore)
  • You will have a multi-AZ setup
  • It will be easy to scale both horizontally and vertically
  • You get monitoring dashboards
5
Q

Why might you use Aurora over RDS? Why not?

A

Use:
* AWS claims up to 5x the performance of MySQL on RDS
* AWS claims up to 3x the performance of PostgreSQL on RDS

Not:
* Aurora costs about 20% more than standard RDS

6
Q

What benefits could having a serverless version of Aurora bring (depending on use case of course)?

A
  • No capacity planning required
  • Little management overhead (all automatic)
  • Pay per second (could be more efficient depending on use case)
7
Q

What is a read replica?

A

A copy of a database created to scale reads: applications can read from multiple sources instead of only the main database.

8
Q

Applications constantly read from read replicas - can they also write to them directly?

A

No - read replicas are read-only. All writes go to the main database.
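
The read/write split can be pictured with a small sketch. This is plain Python, not an AWS API: `ReplicaRouter`, its methods and the dictionaries standing in for databases are all hypothetical names chosen for illustration, and the explicit `replicate()` call only imitates the asynchronous replication that happens in the background.

```python
import itertools


class ReplicaRouter:
    """Toy router: writes go to the primary, reads rotate across replicas."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # All writes land on the primary only.
        self.primary[key] = value

    def replicate(self):
        # Stand-in for background replication from primary to replicas.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key):
        # Reads are served by the next replica in round-robin order.
        index = next(self._next_replica)
        return index, self.replicas[index].get(key)


router = ReplicaRouter(replica_count=2)
router.write("user:1", "Ada")
stale = router.read("user:1")  # served by replica 0 before replication
router.replicate()
fresh = router.read("user:1")  # served by replica 1 after replication
```

The first read returns nothing because the write has not reached the replica yet, which is why a replica can briefly lag behind the main database.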

9
Q

What are the positives and negatives of a multi-region read replica system?

A

Multi-region read replicas allow multi-region applications to maintain fast read speeds, as they can read from a database that is closer to them. They also provide a disaster recovery option if a region goes down.
However, there is a significant cost associated with this system (including cross-region data transfer) that must be considered before adopting it.

10
Q

What is ElastiCache? Why would you use it?

A

A managed in-memory database with high performance and low latency.
Helps take load off databases for read-intensive workloads.
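
One common way a cache takes load off a database is the cache-aside pattern, sketched below. This is an assumption about usage, not something the card states: the class `CacheAsideStore` is hypothetical, and plain dictionaries stand in for both the in-memory cache and the backing database.

```python
class CacheAsideStore:
    """Toy cache-aside lookup: check the cache first, fall back to the database."""

    def __init__(self, database):
        self.database = database  # stand-in for the slower backing database
        self.cache = {}           # stand-in for the in-memory cache
        self.database_reads = 0   # counts how often the database is hit

    def get(self, key):
        if key in self.cache:
            # Cache hit: the database is not touched.
            return self.cache[key]
        # Cache miss: read from the database and remember the result.
        self.database_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = CacheAsideStore({"product:42": "keyboard"})
first = store.get("product:42")   # miss, reads the database
second = store.get("product:42")  # hit, served from the cache
```

Both lookups return the same value, but only the first one reaches the database, which is the effect that matters for read-heavy workloads.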

11
Q

What is DynamoDB?

A

A fully managed, highly available database.
Serverless and thus hugely scalable.
Key-value and non-relational.

12
Q

What tiering option is available for DynamoDB that is similar to S3 storage?

A

Standard and Standard-Infrequent Access (Standard-IA) table classes for cost saving

13
Q

If you need an in-memory cache for DynamoDB, what service would you use?

A

DynamoDB Accelerator (DAX). NOT ElastiCache - DAX is well integrated with DynamoDB

14
Q

What are DynamoDB global tables?

A

DynamoDB tables that are accessible in multiple regions with low latency, achieved by setting up two-way replication for the table. You can read and write to the table in any region.

15
Q

What is Redshift?

A

A database service based on PostgreSQL used for online analytical processing and data warehousing

16
Q

What is the Redshift pricing model?

A

Pay as you go

17
Q

What is Elastic MapReduce (EMR)? What would you use it for?

A

A service that lets you create Hadoop clusters and use them to process vast amounts of data. Used for ML, big data, data processing, etc.

18
Q

What is Athena?

A

A serverless query service to perform analytics against S3 objects using SQL

19
Q

What is QuickSight?

A

A business intelligence service used to create interactive dashboards

20
Q

What is DocumentDB?

A

A MongoDB-compatible database service used to store, query and index JSON data.

21
Q

What is Neptune?

A

A graph database service used to build and run applications working with highly connected datasets

22
Q

What is Quantum Ledger Database (QLDB)?

A

A database used to review the history of all the changes made to your application data over time (a ledger!). Data can be queried with SQL.

23
Q

What does the ETL in Glue ETL stand for?

A

Extract, transform and load

24
Q

What does Glue ETL do?

A

Allows us to prepare and transform data so that it is ready for analytics processing
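
The three ETL steps can be sketched in a few lines of plain Python. Nothing here uses Glue itself: the sample CSV, the function names and the list acting as the analytics target are all made up for illustration.

```python
import csv
import io

# Hypothetical raw input with messy spacing, a missing amount and mixed case.
RAW_CSV = "order_id,amount,country\n1, 19.99 ,gb\n2,,us\n3,5.00,US\n"


def extract(text):
    # Extract: read the raw rows from the source.
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    # Transform: drop rows without an amount, fix types and normalise case.
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        cleaned.append({
            "order_id": int(row["order_id"]),
            "amount": float(amount),
            "country": row["country"].strip().upper(),
        })
    return cleaned


def load(rows, target):
    # Load: append the cleaned rows to the analytics target.
    target.extend(rows)
    return len(rows)


warehouse = []  # stand-in for the analytics store
loaded = load(transform(extract(RAW_CSV)), warehouse)
```

After the run, `warehouse` holds only the two rows with a valid amount, typed and normalised so they are ready to query.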

25
Q

What is the Glue data catalog?

A

A catalog of datasets that can be brought in and used with Athena, Redshift, EMR etc.

26
Q

What is the Database Migration Service?

A

A service that runs on EC2 instances and moves a database from one place to another.
Allows the source database to remain available throughout the migration process.

27
Q

What database system is Redshift based on?

A

PostgreSQL

28
Q

What type of database is DynamoDB?

A

Key-value database