Databases Flashcards

Question 1

Q

Does DynamoDB have query join capability?

Answer

A

No. DynamoDB does not support joins.

Question 2

Q

Does DynamoDB support aggregations such as SUM or AVG?

Question 3

Q

How do NoSQL databases scale?

Answer

A

Horizontally

Question 4

Q

Instead of rows and columns, what terminology does DynamoDB use?

Answer

A

Items = Rows
Attributes = columns

Question 5

Q

What is the maximum size for an item in DynamoDB?

Answer

A

400KB per item.

Question 6

Q

Using DynamoDB, what is the accepted solution to store BLOB data?

Answer

A

Store the object in S3 and store its metadata (such as the S3 bucket and object key) in DynamoDB.
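
A minimal sketch of the pointer pattern described above, assuming hypothetical attribute names (ObjectId, S3Bucket, S3Key, SizeBytes, ContentType). It only builds the small metadata item that would be written to DynamoDB; uploading the object to S3 and writing the item are left out.

```python
def blob_pointer_item(object_id, bucket, key, size_bytes, content_type):
    """Build the metadata item that points at a large object kept in S3.

    All attribute names here are hypothetical; choose whatever fits your table.
    """
    return {
        "ObjectId": object_id,        # primary key of the metadata item
        "S3Bucket": bucket,           # where the BLOB actually lives
        "S3Key": key,
        "SizeBytes": size_bytes,      # size of the S3 object, not of this item
        "ContentType": content_type,
    }


item = blob_pointer_item("video-42", "my-media-bucket", "videos/42.mp4", 750_000_000, "video/mp4")
```

The item stays tiny no matter how large the object is, which keeps it well under the DynamoDB item size limit.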

Question 7

Q

What is provisioned mode in DynamoDB?

Answer

A

You specify the number of reads and writes per second and pay for the provisioned capacity regardless of whether it is all used.

Question 8

Q

Can you auto-scale using DynamoDB provisioned capacity mode?

Answer

A

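Yes. Provisioned mode supports auto scaling, which adjusts the provisioned RCUs and WCUs between a minimum and maximum that you set. On-demand mode scales without any capacity planning.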

Question 9

Q

Which is cheaper, DynamoDB provisioned or on-demand mode?

Answer

A

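Provisioned mode, as long as the workload is steady and predictable enough to use most of the capacity you provision.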

Question 10

Q

What can be used in DynamoDB provisioned capacity to temporarily exceed throughput?

Answer

A

Burst capacity. Once exceeded, a ProvisionedThroughputExceededException will be thrown.

Question 11

Q

When you receive a ProvisionedThroughputExceededException error in DynamoDB, what is the preferred solution?

Answer

A

An exponential backoff retry
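
A minimal sketch of exponential backoff with jitter. `ThrottledError` is a hypothetical stand-in for ProvisionedThroughputExceededException, and `operation` is any callable you supply; the delays and attempt count are assumptions, not recommended values.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying throttled calls with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles each attempt (base, 2x, 4x, ...),
            # capped at max_delay. A random delay below the ceiling (jitter)
            # keeps many clients from retrying at the same instant.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

The AWS SDKs already retry throttled requests with backoff on their own, so a wrapper like this is mainly useful for understanding the idea or for adding retries on top.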

Question 12

Q

In DynamoDB, how large is 1 WCU?

Answer

A

1 WCU = 1 write up to 1KB in size per second.
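
The rule above can be turned into a small calculation. This is a sketch that assumes item sizes are rounded up to the next whole KB; the function name is made up for illustration.

```python
import math


def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed: each write costs 1 WCU per 1KB, rounded up."""
    return math.ceil(item_size_kb) * writes_per_second


print(write_capacity_units(2, 10))    # 10 writes/s of 2KB items -> 20
print(write_capacity_units(4.5, 6))   # 4.5KB rounds up to 5KB -> 30
```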

Question 13

Q

What are the two types of reads in DynamoDB?

Answer

A

Strongly consistent and eventually consistent reads.

Question 14

Q

What is an eventually consistent read in DynamoDB?

Answer

A

A read that has the potential to get stale data because the write has not replicated to all the servers on the backend yet.

Question 15

Q

What is a strongly consistent read in DynamoDB?

Answer

A

It returns the most up-to-date data, reflecting every write that succeeded before the read was issued.

Question 16

Q

How do you set a strongly consistent read in DynamoDB?

Answer

A

Set the ConsistentRead parameter to true in API calls.

Question 17

Q

How many RCUs is a strongly consistent read?

Answer

A

It is twice the cost of an eventually consistent read.

Question 18

Q

In DynamoDB, how large is 1 RCU?

Answer

A

One strongly consistent read or two eventually consistent reads per second for items up to 4KB in size.
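
The rule above as a small calculation. This sketch assumes item sizes are rounded up to the next multiple of 4KB; the function name is made up for illustration.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed: 1 RCU per 4KB per strongly consistent read, half that if eventual."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


print(read_capacity_units(4, 10))                              # -> 10
print(read_capacity_units(6, 10))                              # 6KB rounds up to 8KB -> 20
print(read_capacity_units(12, 16, strongly_consistent=False))  # -> 24
```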

Question 19

Q

In DynamoDB, how are WCUs and RCUs spread?

Answer

A

Evenly among all partitions.

Question 20

Q

In DynamoDB, what is a ProjectionExpression?

Answer

A

It can be specified to only retrieve certain attributes.
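
A sketch of how the request parameters might look for a read that fetches only certain attributes. The table name, key, and attribute names are hypothetical; the dictionary follows the shape of a GetItem request but nothing is sent to AWS here.

```python
def build_get_item_request(table_name, key, attributes, strongly_consistent=False):
    """Build GetItem parameters that return only the listed attributes."""
    return {
        "TableName": table_name,
        "Key": key,
        # Comma-separated list of the attributes to return.
        "ProjectionExpression": ", ".join(attributes),
        # False (the default) means an eventually consistent read.
        "ConsistentRead": strongly_consistent,
    }


request = build_get_item_request(
    "Orders",                        # hypothetical table
    {"OrderId": {"S": "123"}},       # hypothetical key
    ["CustomerId", "OrderTotal"],
    strongly_consistent=True,
)
```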

Question 21

Q

Can a FilterExpression be used with key attributes in DynamoDB?

Answer

A

Not in a Query. A FilterExpression cannot reference the HASH (partition key) or RANGE (sort key) attributes; those belong in the key condition.

Question 22

Q

In DynamoDB, what is the max value of data allowed to be returned in a Query API call?

Question 23

Q

How does the SCAN operation work in DynamoDB?

Answer

A

It reads every item in the table and then filters out the data you do not want. This is inefficient because you pay for reading the whole table.

Question 24

Q

How can you improve DynamoDB scan performance if you must use it?

Answer

A

Parallel Scans
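
A sketch of how a parallel scan divides the work, assuming a hypothetical table name. Each worker issues its own Scan with a distinct Segment and the same TotalSegments; the code below only builds those request parameters and does not call AWS.

```python
def parallel_scan_requests(table_name, total_segments):
    """Build one Scan request per segment; each worker scans only its segment."""
    return [
        {"TableName": table_name, "Segment": segment, "TotalSegments": total_segments}
        for segment in range(total_segments)
    ]


for request in parallel_scan_requests("Orders", 4):  # "Orders" is hypothetical
    print(request)
```

Parallel scans finish sooner but consume read capacity faster, so they can throttle other traffic on the same table.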

Question 25

Q

Can you retry Items that fail in Batch operations in DynamoDB?

Question 26

Q

How many PutItem or DeleteItem requests can be present in one BatchWriteItem API request?

Answer

A

25 items or 16MB of data. Still has a 400KB per item maximum
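
A sketch of splitting a larger list of write requests into groups that respect the 25-item limit. The function name is made up, and the 16MB and 400KB limits are not checked here.

```python
def chunk_for_batch_write(requests, batch_size=25):
    """Yield consecutive slices of at most batch_size requests."""
    for start in range(0, len(requests), batch_size):
        yield requests[start:start + batch_size]


batches = list(chunk_for_batch_write(list(range(60))))
print([len(batch) for batch in batches])  # [25, 25, 10]
```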

Question 27

Q

What is PartiQL in DynamoDB?

Answer

A

It is a SQL-compatible query language for DynamoDB. It can only handle CRUD operations (SELECT, INSERT, UPDATE, DELETE). Joins are still not possible.

Question 28

Q

Are DynamoDB filters performed on the server or client side?

Answer

A

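Server side, but only after the items have been read. The filter reduces the data returned to the client, not the read capacity consumed.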

Question 29

Q

What is an LSI (Local Secondary Index) in DynamoDB?

Answer

A

It uses the same partition key as the base table, but you get an additional sort key.

Up to 5 LSI per table

Must be defined at table creation

Question 30

Q

What is a GSI (Global Secondary Index) in DynamoDB?

Answer

A

It uses an alternative primary key.

Requires WCU and RCU

Can be added AFTER table creation

Question 31

Q

What happens in DynamoDB when the GSI is throttled?

Answer

A

The main table will also be throttled.

Question 32

Q

When it comes to WCU and RCU with LSI and GSI, what is the big difference?

Answer

A

GSI has provisioned capacity

LSI uses the RCU AND WCU of the main table

Question 33

Q

What is DynamoDB DAX?

Answer

A

Seamless in-memory cache for DynamoDB. Solves the hot key problem.

Question 34

Q

How many DynamoDB DAX nodes can be in a cluster?

Question 35

Q

How many AZs should a DAX cluster span in production?

Answer

A

A minimum of three

Question 36

Q

When would you use ElastiCache instead of DAX with DynamoDB?

Answer

A

When you need to cache aggregated (computed) results rather than individual items or query results.

Question 37

Q

What is DynamoDB Streams?

Answer

A

It is an ordered stream of item-level modifications.

Question 38

Q

What AWS services can consume a DynamoDB stream?

Answer

A

Lambda, Kinesis Data Streams, Kinesis Client Library

Question 39

Q

What is the max retention for DynamoDB Streams?

Question 40

Q

What are the DynamoDB stream types?

Answer

A

KEYS_ONLY - Key attributes of the modified item

NEW_IMAGE - Entire Item as it appears after it was modified

OLD_IMAGE - Entire Item as it appears before it was modified

NEW_AND_OLD_IMAGES - Both new and old images of the item
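
A simplified sketch of what each stream view type carries for one modification. The field names Keys, NewImage, and OldImage mirror the stream record layout, but the items here are plain dictionaries and the function itself is hypothetical.

```python
VIEW_TYPES = ("KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES")


def stream_record(view_type, keys, old_item, new_item):
    """Return the parts of a modification that the given view type exposes."""
    if view_type not in VIEW_TYPES:
        raise ValueError(f"unknown stream view type: {view_type}")
    record = {"Keys": keys}  # key attributes are present for every view type
    if view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["NewImage"] = new_item
    if view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES"):
        record["OldImage"] = old_item
    return record


keys = {"Id": "1"}
old = {"Id": "1", "Qty": 1}
new = {"Id": "1", "Qty": 2}
print(stream_record("NEW_AND_OLD_IMAGES", keys, old, new))
```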

Question 41

Q

What is the max time an expired item is held in DynamoDB?

Question 42

Q

Does DynamoDB have backup and restore capability?

Answer

A

Yes. It offers on-demand backups and point-in-time recovery (PITR), like RDS.

Question 43

Q

What are DynamoDB Global Tables?

Answer

A

Multi-Region, Multi-Active, fully replicated tables.

Question 44

Q

What is DynamoDB Local?

Answer

A

Allows you to develop and test apps locally without accessing the web service.

Question 45

Q

Does DynamoDB support Federated Logins?

Question 46

Q

Is RDS ACID compliant?

Answer

A

Yes. The relational database engines that RDS runs support ACID transactions.

Question 47

Q

What is the maximum database volume size in Aurora?

Question 48

Q

What is the maximum amount of read replicas in Aurora?

Question 49

Q

Can Aurora backup to S3?

Answer

A

Yes, continuous backup to S3 is available.

Question 50

Q

Which Aurora option must you use if you want automatic scaling?

Answer

A

Aurora Serverless

Question 51

Q

What are the two types of LOCKS in RDS?

Answer

A

Shared locks - Allow reads and prevent writes

Exclusive locks - Prevent all reads and writes to a resource

Question 52

Q

What should the TTL on your DB instance DNS be to support failover?

Answer

A

30 seconds or less.

Question 53

Q

What is DocumentDB?

Answer

A

It is a MongoDB-compatible NoSQL document database. It stores JSON documents.

Question 54

Q

In what increments does DocumentDB grow?

Answer

A

10GB Increments

Question 55

Q

What is Amazon MemoryDB for Redis?

Answer

A

It is a Redis compatible, durable, in-memory database service.

Question 56

Q

What is Amazon Keyspaces?

Answer

A

A managed, Apache Cassandra-compatible, distributed NoSQL database.

Question 57

Q

What language do you use to query Amazon Keyspaces?

Answer

A

CQL - Cassandra Query Language

Question 58

Q

What is Amazon Neptune?

Answer

A

A fully managed graph database.

Question 59

Q

Is Redshift geared for OLAP or OLTP?

Answer

A

OLAP. Redshift is a data warehouse built for analytics.

Question 60

Q

What connection types does Redshift support?

Answer

A

ODBC, JDBC

Question 61

Q

What are the types of nodes used in a Redshift Cluster?

Answer

A

Leader node and compute nodes. The leader node creates execution plans and delegates the work to the compute nodes.

Question 62

Q

What is the maximum amount of compute nodes you can have in a Redshift Cluster?

Answer

A

128 compute nodes

Question 63

Q

Does each compute node in Redshift have its own compute, memory, and storage?

Answer

A

Yes. This is dependent on the type you choose though.

Question 64

Q

What node type would you use if you want to optimize for storage capacity in Redshift?

Answer

A

Dense Storage nodes. These use HDD volumes

Answer 53

A

Dense compute nodes. These use SSD volumes

Answer 54

A

xlarge or 8xlarge

Answer 55

A

Node slices, these use a portion of the resources assigned to the compute node to perform a task.

Answer 56

A

It allows you to query data in S3 (your data lake) and join it to your Redshift tables.

Answer 57

A

Gzip and Snappy

Answer 58

A

It uses MPP (massively parallel processing), columnar data storage, and column compression.

Answer 59

A

It replicates in the cluster, has automated snapshots, and replicates to S3

Answer 60

A

They are automatically replaced.

Answer 61

A

No. There is nowhere to replicate to.

Answer 62

A

You must use an RA3 cluster.

Answer 63

A

Vertically and horizontally. A new cluster is created while your old one is available for reads. The CNAME is flipped to the new cluster and data is moved in parallel to new compute nodes.

Answer 64

A

Auto - Based on the size of your data
Even - Distributed across slices in round-robin
Key - Rows are distributed based on a single column
All - The entire table is copied to every node
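
A toy simulation of the EVEN, KEY, and ALL styles above, to make the placement rules concrete. The buckets stand in for slices or nodes, and the CRC32 hash is an assumption for illustration only; it is not the hash Redshift uses. AUTO is left out because it simply picks one of the other styles.

```python
import zlib


def distribute(rows, style, buckets, key=None):
    """Place rows into numbered buckets according to a distribution style."""
    placement = {b: [] for b in range(buckets)}
    for index, row in enumerate(rows):
        if style == "EVEN":
            # Round-robin, regardless of row contents.
            placement[index % buckets].append(row)
        elif style == "KEY":
            # Rows with the same key value always land in the same bucket.
            target = zlib.crc32(str(row[key]).encode()) % buckets
            placement[target].append(row)
        elif style == "ALL":
            # Every bucket gets a full copy of the table.
            for bucket in placement.values():
                bucket.append(row)
        else:
            raise ValueError(f"unsupported style: {style}")
    return placement


rows = [{"id": i, "region": r} for i, r in enumerate(["eu", "us", "eu", "us"])]
print(distribute(rows, "EVEN", 2))
```

KEY distribution helps joins on the chosen column because matching rows sit together, while ALL suits small tables that are joined often.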

Answer 65

A

They are like indexes and make for fast range queries.

Answer 66

A

A single column to sort the data

Answer 67

A

Made of all columns in the sort key definition. Order is important. Default sort type.

Answer 68

A

Gives equal weight to each sort key in the list.

Answer 69

A

Use the COPY command. When loading from S3, a manifest file can list exactly which files to load.

Answer 70

A

The UNLOAD command.

Answer 71

A

When it is enabled, traffic uses the AWS backbone. When it is not enabled, it routes through the internet.

Answer 72

A

It automatically copies files loaded into S3 into Redshift.

Answer 73

A

It automatically replicates from Aurora to Redshift.

Answer 74

A

It loads data from a Kinesis Data Stream or MSK

Answer 75

A

INSERT INTO or CREATE TABLE AS

Answer 76

A

Yes, it can do this as it is being loaded into Redshift.

Answer 77

A

Load it using a single COPY command. Do not break this up.

Answer 78

A

You use a copy grant. You create a KMS key, provide a unique name, and specify the KMS ID in the destination region. You enable copying in the source region.

Answer 79

A

It connects Redshift to PostgreSQL

Answer 80

A

Yes. You can load tables using COPY

Answer 81

A

It prioritizes short fast queries vs. long and slow queries.

Answer 82

A

It automatically adds cluster capacity to handle increases in concurrent read queries.

Answer 83

A

Short queries get more concurrency and long queries get less. This is configurable, though, and concurrency scaling can be used.

Answer 84

A

They monitor your queries and will abort them if they are running longer than allowed.

Answer 85

A

No. This must be done in manual mode. Hopping means that you are sending the query to a different queue because it timed out.

Answer 86

A

Short Query Acceleration. This can be used in place of WLM if you only want to accelerate short queries. Can be used with Create Table As and read only queries. You configure the value of what “short” is.

Answer 87

A

Recovers space from deleted rows

Answer 88

A

FULL - Default. Re-sorts rows and reclaims space
DELETE ONLY - Only reclaims space
SORT ONLY - Only re-sorts rows
REINDEX - Re-analyzes interleaved sort keys

Answer 89

A

It allows you to add or remove nodes of the same type.

The cluster goes down for a few minutes, but it tries to keep connections open.

Answer 90

A

Allows you to change node types or number of nodes

Answer 91

A

Snapshot, restore, and then resize before changing the primary.

Answer 92

A

It UNLOADs your query to S3 in Apache Parquet format. 2x faster and 6x more compressed

Answer 93

A

Redshift Data Lake Export

Answer 94

A

It has managed storage. SSD based. Can be independently scaled.

Answer 95

A

Geometry and Geography

Answer 96

A

The RA3 node type is required. Allows you to share live data.

Answer 97

A

It accelerates the processing of data from S3. Only available on RA3 types.

Answer 98

A

GRANT or REVOKE

Answer 99

A

No. You must use Redshift Serverless.

Answer 100

A

Ad-hoc business analysis and lower environments.

Answer 101

A

By RPU + Storage fee. RPU = Redshift Processing Units.

Answer 102

A

It stores the results of the query. Not the query itself like a view. Good for performance.

Answer 103

A

Yes, using Redshift data sharing.

Answer 104

A

You can use Lambda inside your SQL queries.

Answer 105

A

CREATE EXTERNAL FUNCTION

Answer 106

A

It ties Redshift into RDS and Aurora for PostgreSQL and MySQL. Allows access to live data and removes the ETL process. THIS IS READ ONLY.

Answer 107

A

No. This is one way.