Databases Flashcards

1
Q

You can use ___________ to get information on the most accessed and most throttled items on a DynamoDB table or Global Secondary Index

A

CloudWatch Contributor Insights for Amazon DynamoDB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

True or False: RDS can launch Multi-AZ databases, where multiple identical databases are spun in various AZs and when a request for data comes RDS dynamically determines which DB will be accessed

A

False, on the RDS Multi-AZ database you have one primary DB and multiple secondary DBs, and if your primary DB fails you can automatically switch over to the second database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Whats the use-case difference between Read Replicas and Multi-AZ

A

Read Replicas are supposed to reduce the load on your main DB and help stop throttling and latency and are Asyncronous, while Multi-AZ is supposed to be used to mantain the reliability of the DB, reducing the chance it is off the air, and are synchronous. Additionally, read replicas can be Cross-Region, while Multi-AZ cannot

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What RDS databases can have Cross-Region Replicas? Which ones can’t?

A

Can: Aurora, Oracle, MariaDB, PostgreSQL and MySQL
Can’t: SQL Server

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is a DynamoDB Global Table?

A

A global table is a group of DB replicas spread across multiple regions that are treated as a single DB, with all their data being automatically sinchronized between each other. All tables have the same primary key schema and table name.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is RPO and RTO?

A

RPO: Recovery Point Objective, how much data you lose in case of failure
RTO: Recovery Time Objective, how much time it takes to come back online after failure

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are the 4 most common disaster recovery strategies? Rank them by their cost and how much time it takes for them to act

A

Backup: Cheapest and slowest (few hours RTO). You create a snapshot if the data in a moment in time and recreate the DB based on this snapshot.
Pilot Light: 2nd Cheapest and slowest (10s of minutes RTO). Create a copy of your core workload infrastructure on another region, mantaining only essential replication services on. In case of disaster you switch to that solution.
Warm Standby: 2nd most expensive and fastest (minutes of RTO). Similar to pilot light, but you mantain a fully functional copy of the DB in another region.
Multi-site active: Fastest and most expensive. You run your DB in many regions simultaneously, making it have the fastest RTO but the most complex, as it creates issues of data corruption and synchronization.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are some examples of Pilot Light services?

A

-AWS DynamoDB Global Tables
-AWS Aurora DB Global Tables
-Amazon RDS Read Replicas

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

AWS Auto Scaling is compatible with which AWS Database services?

A

DynamoDB and AuroraDB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What are the differences between OLTP and OLAP databases?

A

OLTP (Online Transaction Processing): Are databases focused on recording transactional operations. This involves lots of smal Insert, Delete and Update operations, and queries made to it are simple and short;
OLAP (Online Analytical Processing): Are databased focused on the analysis of the transations stored in OLTP DBs. They must be able to perform larger, analytical queries to create different summaries of multidimentional data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

List some aspects of Relational Database Management that RDS solves when compared to local DB solutions.

A
  • Hardware purchasing
  • OS and DB management and setup
  • Patching
  • Backups (Performed automatically every 5 min adn stored for 35 days)
  • Multi-AZ for reliability
  • Scaling Capacity
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

When creating an RDS DB what are the 2 main factors you need to decide for your DB?

A
  • The type of instance the DB will run on (gp2 or io1);
  • The DB engine to be used
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

How does the Billing for RDS work?

A

There are 3 factors to RDS pricing:
- The cost of the underlying instance (affected by it being On-Demand or Reserved);
- The cost of storage (per gb) and I/o (per number of operations);
- The amount of data transfered to another region or to the internet.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are disadvantages Aurora has compared to RDS?

A
  • Accepts only PostgreSQL and MySQL
  • Is more expensive
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

True or False:
Amazon Aurora automatically backups your data to S3, allwoing you to revert to any point in time in the last 35 days.

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What are some advantages of Aurora when compared with RDS?

A
  • Over 5x performance increase on MySQL and 3x performance increase on PostgreSQL
  • Highly Available by default, with 6 replicas across 3 AZs, with 4 copies being needed for write and 3 for reading and possessing “self-healing” property where corrupted data is fixed by reading from the copies
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

True or False: Both RDS and Aurora accept unlimited read-replicas, as long as you pay for them

A

False, max number of read replicas for both are 15

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

True or False: The billing for Aurora is the same as for RDS, with the only difference being that when paying for the underlying instance, Aurora also has the Serverless option, which makes it so you pay for the capacity of the DB instead.

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Classify the following statements regarding DynamoDB as True or False:
-DynamoDB is a Document based NoSQL database
-It’s max object size is 400kb
-It can be implemented in a provisioned strategy or on demand
-Read operations performed on it are always consistent
-Performs backups automatically and offers point-in-time restoration
-Has 2 table classes: Standard and infrequent access (IA)

A

-False: It’s a key-value NoSQL database
-True
-True
-False, reads performed are eventually consistent (Strong consistency)
-False, DynamoDB offers backups, but they aren’t performed automatically
-True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What the difference between a DynamoDB LSI and a GSI?

A
  • A Local Secondary Index (LSI) allows you to change your sort key, and must be created along with the table. It costs nothing.
  • A Global Secondary Index allows you to change bot your partitition key and your sort key, and can be defined after the table is created. It costs extra (counts as a new table over the same table in regards to provisioning reads ans writes)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

True or False: To use DynamoDB Global Tables you must enable DynamoDB Streams

A

True

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

How can you make it so a new DynamoDB entry is immediately deleted after a predefined amount of time?

A

You can define a TTL for your table entries

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

True or False: DynamoDB Streams can store data for up to a Week

A

False, up to 24 hours

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What are the use cases for DynamoDB DAX?

A

DAX is useful as a cache to DynamoDB for storing reads or queries requested recently (TTL 5min), solving the hot key problem. It is also Multi-AZ and secure (Encryption at rest with KMS)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

List the RDS database engine that can be accessed using IAM

A

MariaDB, MySQL and PostgreSQL

26
Q

What is the best way to convert an unencrypted RDS table into an encrypted one?

A

-Create an unencrypted snaphot
-Copy it enabling encryption
-Restore the encrypted snapshot

27
Q

Signal true or false for each statement below:
- You can use mysqldump to migrate an RDS MySQL DB to a non RDS MySQL DB.
-RDS Custom is available only for Oracle
-You can use either RDS Backups o Oracle RMAN (Recovery Manager) to restore a table to Amazon RDS for Oracle
-Both Oracle and SQL Server support Transparent Data Encryption (TDE) to encrypt data before it’s written to storage

A

-True
-False, it’s available to both Oracle and SQL Server
-False, Oracle RMAN can be used only to restore tables to non-RDS DBs
-True

28
Q

What service if RDS Proxy is conventionally used alongside to manage connections to RDS?

A

AWS Lambda. SInce each Lambda can spin up a connection to RDS it can leave a DB overloaded. RDS Proxy helps with this, managing the connections opened by Lambda without overloading the RDS. It also supports IAM authentication, DB authentication and autoscaling, and must be acessible py the lambda instance to be used.

29
Q

Assign True or False to the following statements:
-Aurora DB storage space scales automatically in increments of 1GB up to a maximum of 256TB
- You can configure each Aurora node to be either a Master node, responsible for writing data, or a Reader node, responsible for reading data
- There are 4 types of endpoints: Master/writer endpoints, which connect to master nodes, reader endpoints, which connect to and balance the connection to all reader nodes at the same time , custom endpoints, which connect to a specific subset of reader nodes and instance endpoints, which connect to a single instance
- Common uses for custom endpoints are running more demanding workloads on nodes that have the instance type necessary to support them

A

-False, it scales in increments of 10GB up to a max of 128TB
-False, by default Aurora has only on Master node and the rest are reading replica, with one replica being promoted to Master automaticaly in case of failure (about 30 secs of RTO)
-True
-True

30
Q

Explain Aurora Multi-Master mode

A

Aurora Multi-Master mode is a functionality of Aurora that allows you to convert all nodes into master nodes in case you need more availability.

31
Q

How does Aurora Global Database differ from normal Aurora?

A
  • 1 primary region for read/write with up to 5 secondary regions for only read (data replication lag of about 1 second)
  • Up to 16 read replicas in each secondary region
  • Promotion of another regeion to main region takes about 1 minute
32
Q

How can you convert an RDS DB to Aurora?

A

There are 2 ways:
- Create a Snapshot of the DB the restore it as an Aurora instance
- Create an Aurora read replica of a RDS db and promote the replica to a main instance

33
Q

If you want to convert from using Apache Cassandra to an AWS Cloud solution with minimal effort you should migrate to _________

A

AWS Keyspaces

34
Q

True or False: All AWS Database solutions have auto-scaling and high availability as a feature

A

True?

35
Q

Assign True or False to the following statements:
-AWS Keyspaces is Serverless
-AWS Keyspaces is compatible with CQL (Cassandra Query Language)
-Tables in AWS Keyspaces are encrypted by deault and stored in 3 AZs
-You can restore your data to any point in time in the last 35 days

A

-True
-True
-True
-True

36
Q

In which situations is it recommended to use DocumentDB over DynamoDB?

A

-When you are migrating from an architecture that uses MongoDB
-When you need to perform queries in on complex nested documents.

37
Q

True or False: The design principles of DocumentDB makes it akin to MongoDB as Aurora is to MySQL/PostgreSQL

A

True

38
Q

If you want to implement a Graph Database on AWS your service choice should be __________

A

Amazon Neptune

39
Q

Assign True or False to the following Statements:
-AWS Timeseries is a Serverless solution for storing IoT and Timeseries data
-1000x faster then a relational databases, but more expensive
-Use of tiering of storage of data to store historical data
-Possesses built-in Timeseries analytics functions
-Scales automatically and encrypts data at rest and in transit

A

-True
-False, its about 1/10 of the price of standard relational databases
-True
-True
-True

40
Q

What is an AWS Service that offers a serverless ledger service whose integrity and immutability is assured through blockchain?

A

Amazon Quantum Ledger Database (QLDB)

41
Q

Elasticache has two types of cache engines: _______, which is multi-AZ, supports ranking data with Sorted Sets and supports backups, and _______, which have no persistency or high-availability, are multi-node for data partitioning (sharding) and are good for Big Data thanks to Multi-Threading

A

-Redis
-Memcached

42
Q

Assign True or False to the following statements:
- Elasticsearch is an In-Memory database with high performance and low latency, and it’s useful at making applications stateless
- Some common uses for Elasticache are using it as a cache for databases and user session store for writing user session data
- Some common use strategies for updating the cache is using Write Through, where every time the cached data is updated at its source it’s also updated on Elasticache, and Lazy Loading, where data is only written to Elasticache on a cache miss
- Elasticsearch is plug and work, needing minimal code adjustement to work
- Is to managed Redis an Memcached as RDS is to managed DBs

A

-True
-True
-True
-False, Elasticsearch demands significante code rewrite for its usage
-True

43
Q

True or False: Elasticache supports authentication with IAM or user/pw for both Redis and Memcached

A

False, Memcached accepts only user/pw authentication and SASL authentication

44
Q

If a customer wants to use Redis not as a cache but as an In-Memory DB, is recommended he uses __________

A

Amazon MemoryDB for Redis

45
Q

Assign True or False to the following statements:
-The way amazon Redshift works is that when its leader node receives a query, it splits the query into jobs and assigns it to the compute nodes
- You can use Redshift Spectrum to easily transfer S3 data to Redshift without any cost
- Accepts a max of 10 nodes, with each supporting up to 10 TB of data
- Redshift is serverless, so you pay only by the amount of operations performed and the amount of data scanned by Redshift Spectrum
- By default, Redshift generates backups every 5 minutes or every 100MB of data stored, but this can be changed to happen on a Schedule

A
  • True
  • False, Redshift Spectrum is used to query data on Redshift and S3, and you pay by the ammount of scanned data
  • False, a Redshift Cluster can have over 100 nodes and each node can store up to 16TB of data
  • False, Redshift is not serverless, you have to provision the instances running in the cluster and you pay for them
  • False, by default redshift generates backups every 8 hours or 5 GB of data stored
46
Q

What are the Instance Billing Types for Amazon Redshift?

A
  • On-demand: Pay as you go
  • Reserved: Get a discount buy pledging to use the instances by at least 1 or 3 years
  • Concurrent Scaling: The price to be payed varies each instant with the amount of concurrent scaling added to the cluster.
47
Q

How does Redshift concurrent scaling work?

A

AWS automatically increases the number of instances on the clusters if it detects a large number of requests, billing the user per second. After the demand decreases, the extra instances go away.

48
Q

What is Redshift Workload Management (WLM)?

A

It’s a Redshift feature that allows you to manipulate query priority through the use of multiple query queues. Can be done through Automatic WLM, where the queue and resource management is done by Redshift, or Manual WLM, where it is done by the user

49
Q

What file types are supported by Amazon Athena?

A

-Csv
-Avro
-Parquet
-ORC
-Json

50
Q

What are some recommendations for increasing Athena performance?

A
  • Using columnar files to reduce scan time and costs (ORC, Parquet)
  • Compress data for smaller retrievals (snappy, gzip, etc)
  • Partition data on S3
  • Use larger files (>128MB) to reduce overhead
51
Q

Whats an Athena federated query?

A

It’s a query that allows you to search for data in multiple different sources (Dynamo, Redshift, RDS, etc) and store the results in S3

52
Q

How can you use Amazon Schema Conversion Tool (SCT) and Amazon Database Migration Service (DMS) together to perform a heterogeneous (different engines) database migration.

A

You use SCT to convert the schema from the original db format to the new db format and pass it to the new db created. After that, you just use DMS to import the data to the new db.

53
Q

Assign True or False to the following statements:
-When using DMS for a migration, the source Database becomes unavailable until the migration end
-The usage of SCT is unneded for homogeneous migrations
- DMS accepts all RDS database engines as source and more, such as DocumentDB, S3, MongoDB, SAP and DB2, on premise or not
- The DMS accepted targets are all RDS DBs, Redshift, S3, DynamoDB, Opensearch, Kinesis, Document DB and SAP, on premise or not
-Supports full load migrations, change data capture migrations (CDC) or both at the same time

A

-False, even while performing the migration the source DB remains available
-True
-True
-True
-True

54
Q

What is Redshift’s S3 Auto-Copy?

A

It’s a Redshift feature when new data stored on S3 is automatically copied into Redshift

55
Q

True or False: DMS can only perform Online data migrations

A

False, it can perform both Online and Offline

56
Q

What is the most recommended use case for DMS use?

A

Heterogeneous Migrations

57
Q

True or False: When you run a DMS job you need to create an EC2 instance for it to run on

A

True

58
Q

True or false: Snowball Edge and DMS are both services that can be used to transfer data, but they cannot be used with each other

A

False, DMS has Snowball Edge compatibility to speed up migration

59
Q

True or False: It is possible to create an RDS Proxy for Aurora, which connects only to Aurora Read Replicas

A

True

60
Q

What are the difference between Redis and Memcached on ElastiCache?

A

-Redis is Multi-AZ with auto-failover, Memcached is not
-Redis supports IAM authentication, Memcached not
-Redis can have read replicas for high availability, Memcached has no backup or restore and is not persistent
-Redis is good for Ranked Sets, Memcached is good for Big Data (Multithreaded and sharded)

61
Q

True or False: You ca change Elasticache Memcached to Redis without significant application code changes

A

False

62
Q

True or False: You can set Aurora RPO for MySQL DBs

A

False, you can set RPO for Aurora PostgreSQL DBs