Databases 101 Flashcards

1
Q

What is a relational database analogous to?

A

A traditional spreadsheet

2
Q

What does RDS stand for?

A

Relational Database Service

3
Q

What are the 6 flavors of relational database engines that Amazon RDS supports?

A
  • SQL Server
  • MySQL
  • Aurora
  • Oracle
  • PostgreSQL
  • MariaDB
4
Q

What are the 2 key features of RDS? What are they primarily for?

A
  • Multi-AZ (for disaster recovery)
  • Read Replicas (for performance)
5
Q

Does RDS Multi-AZ require changing the connection string?

A

No. If you lose access to your primary database, AWS can automatically point incoming requests to your secondary database without any change to the connection string.
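
To see why the connection string can stay the same, here is a minimal sketch in plain Python. It does not call AWS; the `Endpoint` class, the host name, and the instance names are all hypothetical stand-ins for the endpoint that RDS manages on your behalf.

```python
# Hypothetical stand-in for the endpoint name that RDS manages for you.
class Endpoint:
    def __init__(self, name, primary, standby):
        self.name = name        # what the application's connection string uses
        self.primary = primary  # instance currently receiving traffic
        self.standby = standby  # copy kept in another Availability Zone

    def resolve(self):
        """Return the instance the endpoint currently points at."""
        return self.primary

    def fail_over(self):
        """Swap roles; the endpoint name itself does not change."""
        self.primary, self.standby = self.standby, self.primary


endpoint = Endpoint("mydb.example.internal", "instance-az-a", "instance-az-b")
connection_string = f"postgresql://app@{endpoint.name}:5432/shop"

before = endpoint.resolve()  # traffic goes to the primary
endpoint.fail_over()         # simulated loss of the primary
after = endpoint.resolve()   # same name, different instance behind it
```

The application only ever sees `connection_string`, which is built from the endpoint name and is untouched by the failover.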

6
Q

How do Read Replicas work in RDS?

A
  • There is no automatic failover. Each read replica has its own endpoint, so if your primary instance goes down you'll need to point your application at the replica's URL yourself
  • All writes to the primary instance are copied over to the read replica(s)
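
As a rough illustration of how an application might use read replicas, the sketch below routes writes to the primary and spreads reads across the replicas. `ReplicaRouter` and the host names are hypothetical, and checking for a leading `SELECT` is a deliberate simplification.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread reads across the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # simple round-robin

    def endpoint_for(self, sql):
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


router = ReplicaRouter(
    "primary.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
```

Because every replica has its own endpoint, the routing decision lives in the application (or a proxy in front of it), not in RDS itself.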
7
Q

Suppose you want to be able to handle a surge of incoming traffic to your RDS instance and scale out your application. What key feature of RDS would you use to do this?

A

Read Replicas

8
Q

What is the maximum number of read replicas that can be made from a single primary RDS instance?

A

Up to 5 read replicas

9
Q

What is a non-relational database best analogous to?

A

A JSON Object

10
Q

What does OLTP stand for?

A

Online Transaction Processing

11
Q

What does OLAP stand for?

A

Online Analytical Processing

12
Q

What is the difference between OLTP and OLAP?

A
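  • The big difference is in the types of queries you will run
  • OLTP queries deal with a specific transaction (for example, looking up one order)
  • OLAP queries pull in a lot of data for analysis (for example, total sales per region)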
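
The two query styles can be compared side by side with Python's built-in `sqlite3` module. This is only a sketch: the `orders` table and its rows are invented for illustration, and a real OLAP workload would run on a data warehouse rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, region, amount) VALUES (?, ?, ?)",
    [(1, "east", 20.0), (2, "west", 35.0), (3, "east", 15.0), (4, "west", 30.0)],
)

# OLTP-style query: touch one specific row.
one_order = conn.execute(
    "SELECT id, region, amount FROM orders WHERE id = ?", (3,)
).fetchone()

# OLAP-style query: scan and aggregate many rows.
sales_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
```

The first query reads a single row by its key; the second has to read every row in the table to produce its totals.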
13
Q

What is Amazon’s data warehousing solution?

A

Redshift

14
Q

What is ElastiCache?

A

ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory cache in the cloud.

15
Q

What is the benefit of using ElastiCache?

A
  • ElastiCache is used to speed up the performance of existing databases
  • It helps most when the same queries are run frequently
  • ElastiCache improves the performance of web applications by allowing you to retrieve information from fast, managed in-memory caches, instead of relying entirely on slower, disk-based databases.
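
The idea of answering repeated queries from memory can be sketched with a simple cache-aside wrapper. Nothing here talks to ElastiCache: `SlowDatabase` and `CacheAside` are hypothetical classes, and a plain dictionary stands in for the in-memory cache.

```python
class SlowDatabase:
    """Stand-in for a disk-based database; counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.query_count = 0

    def query(self, key):
        self.query_count += 1
        return self.rows[key]


class CacheAside:
    """Check the in-memory cache first and fall back to the database on a miss."""

    def __init__(self, database):
        self.database = database
        self.cache = {}  # a dict stands in for Memcached or Redis

    def get(self, key):
        if key in self.cache:             # cache hit: no database work
            return self.cache[key]
        value = self.database.query(key)  # cache miss: go to the database
        self.cache[key] = value           # remember the answer for next time
        return value


db = SlowDatabase({"user:1": "Ada", "user:2": "Grace"})
store = CacheAside(db)
results = [store.get("user:1") for _ in range(3)]
```

Three identical reads reach the database only once; the other two are served from memory.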
16
Q

What are the two types of open-source in-memory caching engines that ElastiCache supports?

A
  • Memcached
  • Redis
17
Q

Does RDS run on virtual machines?

A

Yes

18
Q

How can you log in to the OS of an RDS instance?

A

You can’t

19
Q

How can you patch the OS or database of an RDS instance?

A

This is Amazon’s responsibility (you can’t)

20
Q

Is RDS Serverless?

A

No (except for Aurora Serverless)

RDS runs on virtual machines

21
Q

What is the use case for ElastiCache?

A

Use ElastiCache to increase DB and web application performance

22
Q

Does Memcached support Multi-AZ?

A

No

23
Q

Does Memcached support Backup and Restore?

A

No

24
Q

Does Memcached support multi-threaded performance?

A

Yes

25
Q

Does Redis support multi-threaded performance?

A

No

26
Q

Does Redis support Multi-AZ?

A

Yes

27
Q

Does Redis support Backup and Restore?

A

Yes

28
Q

What is AWS DMS used for?

A

DMS (Database Migration Service) allows you to migrate databases from a source into AWS

29
Q

What is the difference between a homogeneous migration and a heterogeneous migration?

A
  • Homogeneous migrations go between the same database engines
  • Heterogeneous migrations go between two different database engines
30
Q

What types of infrastructures can be used as a DMS Source?

A

The source can be:

  • On-premises
  • inside AWS itself
  • another cloud provider (like Azure)
31
Q

What does SCT stand for?

A

Schema Conversion Tool

32
Q

Do you need AWS SCT for a homogeneous migration?

A

No

33
Q

Do you need AWS SCT for a heterogeneous migration?

A

Yes
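
The rule from the last few cards can be written as a tiny helper. Both functions are hypothetical and not part of any AWS SDK; comparing engine names as strings is a simplification, since engines that are compatible with each other may also count as the same engine in practice.

```python
def migration_type(source_engine, target_engine):
    """Classify a migration by comparing engine names (simplified)."""
    same = source_engine.strip().lower() == target_engine.strip().lower()
    return "homogeneous" if same else "heterogeneous"


def needs_sct(source_engine, target_engine):
    """SCT is only needed when the schema must be converted between engines."""
    return migration_type(source_engine, target_engine) == "heterogeneous"
```

For example, `needs_sct("MySQL", "MySQL")` is false, while `needs_sct("Oracle", "PostgreSQL")` is true.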

34
Q

What concerns need to be balanced when caching on AWS?

A

Caching is a balancing act between up-to-date information and latency
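
One common way to tune this balance is a time-to-live (TTL) on each cached entry. The sketch below is an illustration only: `TTLCache` is a hypothetical class, and the current time is passed in as a plain number so the behaviour is easy to follow.

```python
class TTLCache:
    """Serve cached values until they are older than ttl_seconds."""

    def __init__(self, ttl_seconds, load):
        self.ttl_seconds = ttl_seconds
        self.load = load    # function that fetches the value from the source
        self.entries = {}   # key -> (value, time it was stored)

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] < self.ttl_seconds:
            return entry[0]            # fast, but possibly out of date
        value = self.load(key)         # slow, but up to date
        self.entries[key] = (value, now)
        return value


source = {"price": 10}
cache = TTLCache(ttl_seconds=60, load=lambda key: source[key])

first = cache.get("price", now=0)   # miss: loaded from the source
source["price"] = 12                # the source changes afterwards
stale = cache.get("price", now=30)  # still inside the TTL: old value
fresh = cache.get("price", now=61)  # TTL expired: reloaded from the source
```

A longer TTL means fewer slow trips to the source but a longer window in which readers can see the old value; a shorter TTL reverses that trade.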

35
Q

What are the services on AWS that use caching?

A
  • CloudFront
  • API Gateway
  • ElastiCache (Memcached and Redis)
  • DynamoDB Accelerator (DAX)
36
Q

What does EMR stand for?

A

Elastic MapReduce

37
Q

What is AWS EMR?

A

EMR is the industry-leading cloud big data platform for processing vast amounts of data using open-source tools like Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto

38
Q

At what scale can AWS EMR run analysis?

A

EMR can run petabyte-scale analysis

39
Q

What is the cost-saving estimate for EMR over traditional on-premises solutions?

A

EMR offers analysis at less than half the cost of traditional on-premises solutions

40
Q

What is the estimated speed increase of EMR over standard Apache Spark?

A

3x faster

41
Q

What is a cluster in AWS EMR?

A

A collection of EC2 instances, each of which is called a node

42
Q

What are the three node types in EMR?

A
  • Master Node
  • Core Node
  • Task Node
43
Q

What is the purpose of a master node in EMR?

A

The master node manages the cluster and tracks the status of tasks.

Every cluster has a master node

44
Q

What is the purpose of a core node in EMR?

A

A core node runs tasks and stores data in HDFS.

Multi-node clusters have at least one core node

45
Q

What does HDFS stand for?

A

Hadoop Distributed File System

46
Q

What is the purpose of a task node in AWS EMR?

A

A task node runs tasks but does NOT store data in HDFS.

Task nodes are optional for your cluster.

47
Q

Suppose you want to be able to view your EMR cluster's log files even after the master node terminates. Is this possible?

A

Yes, you can configure a cluster to periodically archive the log files stored on the master node to S3.

This ensures log files are available after the cluster terminates, whether through a normal shutdown or due to an error.

Note that log archiving can only be set up when you first create the cluster

48
Q

How often does EMR archive log files to S3?

A

At 5-minute intervals

49
Q

By default, how is log data stored in EMR?

A

By default, log data is stored on the master node