Databases 101 Flashcards

1
Q

What is a relational database comparable to?

A

A traditional spreadsheet

2
Q

What does RDS stand for?

A

Relational Database Service

3
Q

What are the 6 flavors of relational database engines available on Amazon RDS?

A
  • SQL Server
  • MySQL
  • Aurora
  • Oracle
  • PostgreSQL
  • MariaDB
4
Q

What are the 2 key features of RDS and what are they for?

A
  • Multi-AZ (for disaster recovery)
  • Read Replicas (for performance)
5
Q

What happens when you lose the connection to your primary database?

Does RDS Multi-AZ require changing the connection string?

A

If you lose access to your primary database, AWS can automatically point incoming requests to your secondary (standby) database, so you do not need to change the connection string.

6
Q

How do Read Replicas work in RDS?

A
  • There is no automatic failover. If your primary instance goes down, you’ll need to point your application at the read replica’s own URL (endpoint) yourself
  • All writes to the primary instance are copied over to the read replica(s)
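
One way to picture how an application uses read replicas is a small router that sends writes to the primary and spreads reads across the replicas. This is only a sketch: the `ReplicaRouter` class and all endpoint names are hypothetical, and treating every `SELECT` as a read is a simplifying assumption.

```python
import itertools


class ReplicaRouter:
    """Routes writes to the primary endpoint and spreads reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary for reads when no replicas exist.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Simplification: only SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReplicaRouter(
    primary="primary.example.internal",        # hypothetical endpoint
    replicas=["replica-1.example.internal",    # hypothetical endpoints
              "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Note that the router never fails over on its own, which mirrors the first point above: if the primary is lost, the application has to be repointed by hand.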
7
Q

Suppose you want to be able to handle a surge of incoming traffic to your RDS instance and scale out your application. What key feature of RDS would you use to do this?

A

Read Replicas

8
Q

What is the maximum number of read replicas that can be made from a single primary RDS instance?

A

Up to 5 read replicas (the limit varies by database engine and is higher for some engines)

9
Q

What is a non-relational database most analogous to?

A

A JSON Object
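
To make the two analogies concrete, here is a small sketch of the same record in both shapes. The field names and values are made up for illustration.

```python
import json

# Relational style: fixed columns, one flat row per record (like a spreadsheet).
columns = ("id", "name", "email")
row = (1, "Ada", "ada@example.com")

# Non-relational style: a self-describing document that can nest and vary per record.
document = {
    "id": 1,
    "name": "Ada",
    "contacts": {"email": "ada@example.com", "phones": ["555-0100"]},
}

print(dict(zip(columns, row)))
print(json.dumps(document, indent=2))
```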

10
Q

What does OLTP stand for?

A

Online Transaction Processing

11
Q

What does OLAP stand for?

A

Online Analytical Processing

12
Q

What is the difference between OLTP and OLAP?

A

The main difference is in the types of queries you run.

Online transaction processing (OLTP) captures, stores, and processes data from transactions in real time.

Online analytical processing (OLAP) uses complex queries to analyze aggregated historical data from OLTP systems.
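
The difference in query shape can be sketched with Python's built-in `sqlite3` module. SQLite stands in for both systems here purely for illustration, and the `orders` table and its values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style: small, frequent writes, each committed as its own transaction.
for region, amount in [("east", 20.0), ("west", 35.0), ("east", 5.0)]:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )

# OLAP-style: one query that aggregates over the accumulated history.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)  # [('east', 25.0), ('west', 35.0)]
```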

13
Q

What is Amazon’s data warehousing solution?

A

Redshift

14
Q

What is ElastiCache?

A

ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud.

15
Q

What is the benefit of using ElastiCache?

A
  • ElastiCache is used to speed up the performance of existing databases
  • It helps most with frequent, identical queries
  • ElastiCache improves the performance of web applications by allowing you to retrieve information from fast, managed in-memory caches, instead of relying entirely on slower, disk-based databases.
16
Q

What are the two open-source in-memory caching engines that ElastiCache supports?

A
  • Memcached
  • Redis
17
Q

Does RDS run on virtual machines?

A

Yes

18
Q

How can you log in to the OS of an RDS instance?

A

You can’t

19
Q

How can you patch the OS or database of an RDS instance?

A

This is Amazon’s responsibility (you can’t)

20
Q

Is RDS Serverless?

A

No (except for Aurora Serverless)

RDS runs on virtual machines

21
Q

What is the use case for ElastiCache?

A

Use ElastiCache to increase DB and web application performance

22
Q

Does Memcached support Multi-AZ?

A

No

23
Q

Does Memcached support Backup and Restore?

A

No

24
Q

Does Memcached support multi-threaded performance?

A

Yes

25
Q

Does Redis support multi-threaded performance?

A

No

26
Q

Does Redis support Multi-AZ?

A

Yes

27
Q

Does Redis support Backup and Restore?

A

Yes

28
Q

What is AWS DMS used for?

A

DMS (Database Migration Service) allows you to migrate databases from a source to AWS

29
Q

What is the difference between a homogeneous migration and a heterogeneous migration?

A
  • Homogeneous migrations go between the same database engines
  • Heterogeneous migrations go between two different database engines
30
Q

What types of infrastructures can be used as a DMS Source?

A

The source can be:

  • On-premises
  • inside AWS itself
  • another cloud provider (like Azure)
31
Q

What does SCT stand for?

A

Schema Conversion Tool

32
Q

Do you need AWS SCT for a homogeneous migration?

A

No

33
Q

Do you need AWS SCT for a heterogeneous migration?

A

Yes
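
The SCT rule from the last two cards can be written as a one-line check. This is a simplified sketch: the function name is hypothetical, and it compares engine names only, so it does not model cases where differently named but compatible engines are treated as the same engine.

```python
def needs_schema_conversion(source_engine: str, target_engine: str) -> bool:
    """Return True when the migration is heterogeneous (the engines differ)."""
    return source_engine.strip().lower() != target_engine.strip().lower()


print(needs_schema_conversion("MySQL", "MySQL"))        # False: homogeneous, no SCT
print(needs_schema_conversion("Oracle", "PostgreSQL"))  # True: heterogeneous, use SCT
```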

34
Q

What concerns need to be balanced when caching on AWS?

A

Caching is a balancing act between up-to-date information and latency
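
The trade-off shows up clearly in a time-to-live (TTL) cache: a longer TTL means fewer slow trips to the origin but a longer window of stale data. The `TTLCache` class below is a hypothetical sketch, not an AWS API, and it takes an injected clock so the behavior is easy to follow.

```python
class TTLCache:
    """Tiny time-to-live cache; a shorter TTL means fresher data but more misses."""

    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock   # callable returning the current time in seconds
        self._store = {}     # key -> (value, expiry time)

    def get(self, key, loader):
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and now < entry[1]:
            return entry[0]  # fast path: low latency, but possibly stale
        value = loader(key)  # slow path: fetch fresh data from the origin
        self._store[key] = (value, now + self.ttl)
        return value


now = [0.0]                # fake clock we can move forward by hand
origin = {"price": 10}     # stand-in for the slow source of truth
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])

first = cache.get("price", origin.get)  # miss: loads 10 from the origin
origin["price"] = 12                    # the origin changes
stale = cache.get("price", origin.get)  # hit: still 10 within the TTL
now[0] = 61.0                           # move past the TTL
fresh = cache.get("price", origin.get)  # expired: reloads 12
print(first, stale, fresh)  # 10 10 12
```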

35
Q

What are the services on AWS that use caching?

A
  • CloudFront
  • API Gateway
  • ElastiCache (Memcached and Redis)
  • DynamoDB Accelerator (DAX)
36
Q

What does EMR stand for?

A

Elastic MapReduce

37
Q

What is AWS EMR?

A

EMR is the industry-leading cloud big data platform for processing vast amounts of data using open-source tools like Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto

38
Q

At what scale can AWS EMR run analysis?

A

EMR can run petabyte-scale analysis

39
Q

What is the cost-saving estimate for EMR over traditional on-premises solutions?

A

EMR offers analysis at less than half the cost of traditional on-premises solutions

40
Q

What is the estimated speed increase of EMR over standard Apache Spark?

A

Over 3x faster

41
Q

What is a cluster in AWS EMR?

A

A collection of EC2 instances, each of which is called a node

42
Q

What are the three node types in EMR?

A
  • Master Node
  • Core Node
  • Task Node
43
Q

What is the purpose of a master node in EMR?

A

The master node manages the cluster and tracks the status of tasks.

Every cluster has a master node

44
Q

What is the purpose of a core node in EMR?

A

A core node runs tasks and stores data in HDFS.

Multi-node clusters have at least one core node

45
Q

What does HDFS stand for?

A

Hadoop Distributed File System

46
Q

What is the purpose of a task node in AWS EMR?

A

A task node runs tasks, but does NOT store data in HDFS

Task nodes are optional for your cluster.

47
Q

Suppose you want to be able to view your EMR cluster's log files even after the master node terminates. Is this possible?

A

Yes, you can configure a cluster to periodically archive the log files stored on the master node to S3.

This ensures log files are available after the cluster terminates, whether through a normal shutdown or due to an error.

Note that log archiving can only be set up when you first create the cluster
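
As a rough sketch of what "configure at creation time" looks like, the function below builds a request shaped like the keyword arguments of boto3's EMR `run_job_flow` call, with `LogUri` pointing at S3 and one instance group per node type. Nothing is sent to AWS here. The function name, bucket name, instance types, node counts, and release label are illustrative assumptions, and a real request also needs IAM roles.

```python
def build_cluster_request(name, log_bucket, task_nodes=0):
    """Build keyword arguments shaped like boto3's EMR run_job_flow request."""
    groups = [
        # Every cluster has a master node.
        {"Name": "master", "InstanceRole": "MASTER",
         "InstanceType": "m5.xlarge", "InstanceCount": 1},
        # Core nodes run tasks and store data in HDFS.
        {"Name": "core", "InstanceRole": "CORE",
         "InstanceType": "m5.xlarge", "InstanceCount": 2},
    ]
    if task_nodes:
        # Task nodes are optional: they run tasks but hold no HDFS data.
        groups.append({"Name": "task", "InstanceRole": "TASK",
                       "InstanceType": "m5.xlarge", "InstanceCount": task_nodes})
    return {
        "Name": name,
        # Log archiving to S3 is requested here, when the cluster is created.
        "LogUri": f"s3://{log_bucket}/emr-logs/",
        "ReleaseLabel": "emr-6.15.0",  # illustrative release label
        "Instances": {
            "InstanceGroups": groups,
            "KeepJobFlowAliveWhenNoSteps": False,
        },
    }


request = build_cluster_request("flashcard-demo", "my-example-bucket", task_nodes=3)
print(request["LogUri"])
print([g["InstanceRole"] for g in request["Instances"]["InstanceGroups"]])
```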

48
Q

How often does EMR archive log files to S3?

A

5-minute intervals

49
Q

By default, how is log data stored in EMR?

A

By default, log data is stored on the master node

50
Q

Can ElastiCache be used for Pub/Sub type systems?

A

Yes (with the Redis engine)
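
For readers new to the pattern, here is a minimal in-process sketch of Pub/Sub: publishers send to a named channel and every subscriber on that channel receives the message. The `Broker` class is hypothetical and only illustrates the idea; it is not the ElastiCache or Redis client API.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a pub/sub channel service."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Deliver the message to every subscriber of this channel.
        for callback in self._subscribers[channel]:
            callback(message)
        return len(self._subscribers[channel])  # number of receivers


broker = Broker()
inbox_a, inbox_b = [], []
broker.subscribe("orders", inbox_a.append)
broker.subscribe("orders", inbox_b.append)
delivered = broker.publish("orders", "order 42 created")
print(delivered, inbox_a, inbox_b)
```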

51
Q

Can you encrypt an unencrypted database snapshot?

A

Not directly. You can copy the unencrypted snapshot and encrypt the copy.