Databases Flashcards

1
Q

Does DynamoDB have query join capability?

A

No

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
1
Q

Does DynamoDB support aggregations such as SUM or AVG?

A

No

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How do NoSQL databases scale?

A

Horizontally

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Instead of rows and columns, what is the lexicon used in Dynamo DB?

A

Items = Rows
Attributes = columns

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the maximum size for an item in DynamoDB?

A

400KB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Using DynamoDB, what is the accepted solution to store BLOB data?

A

Store the object in S3 and store the metadata

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is provisioned mode in DynamoDB?

A

You specify the number of reads and writes and pay for the provisioned capacity regardless if it is all used.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Can you auto-scale using DynamoDB provisioned capacity mode?

A

No. You must use on-demand mode.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is cheaper, DynamoDB provisioned or on-Demand mode?

A

Provisioned mode.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What can be used in DynamoDB provisioned capacity to temporarily exceed throughput?

A

Burst capacity. Once exceeded, a ProvisionedThroughputExceededException will be thrown.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

When reaching a ProvisionedThroughputExceededException error in DynamoDB, what is the prferred solution?

A

An exponential backoff retry

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

In DynamodDB, how large is 1 WCU?

A

1 WCU = 1 write up to 1KB in size per second.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What are the two types of reads in DynamoDb?

A

Strongly consistent and eventually consistent reads.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is an eventually consistent read in DynamoDB?

A

A read that has the potential to get stale data because the write has not replicated to all the servers on the backend yet.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a strongly consistent read in dynamoDB?

A

It returns data only after it has been fully replicated on the backend.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How do you set a strongly consistent read in DynamoDB?

A

Set the consistentRead parameter to TRUE in API calls.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How many RCUs is a strongly consistent read?

A

It is twice the cost of an eventually consistent read.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

In DynamoDB, how large is 1RCU?

A

One strongly consistent read or two eventually consistent reads per second for items up to 4KB in size.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

In DynamoDB, how are WCUs and RCUs spread?

A

evenly amongst all partitions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

In DynamoDB, what is a projectedExpression?

A

It can be specified to only retrieve certain attributes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Can a FilterExpression be used with key attributes in DynamoDB?

A

No, You cannot use this with HASH or Range Attributes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

In DynamoDB, what is the max value of data allowed to be returned in a Query API call?

A

1MB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

How does the SCAN operation work in DynamoDB?

A

It loads the entire table and then filters out the data you want. This is inefficient.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

How can you improve DynamoDB scan performance if you must use it?

A

Parallel Scans

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Can you retry Items that fail in Batch operations in DynamoDB?

A

Yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

How many PutItem or DeleteItems can be present in 1 batchWriteItem API request

A

25 items or 16MB of data. Still has a 400KB per item maximum

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

What is PartiQL in DynamoDB?

A

It is a SQL compatible query language for dynamodDB. It can only handle CRUD. Joins are still not possible.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Are dynamoDB filters performed on the server or client side.

A

Client side.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What is an LSI in Dynamo DB?

A

It uses the same partition key as the base table, but you get an additional sort key.

Up to 5 LSI per table

Must be defined at table creation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What is an GSI in Dynamo DB?

A

It uses an alternative primary key.

Requires WCU and RCU

Can be added AFTER table creation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

What happens in DynamoDB when the GSI is throttled?

A

The main table will also be throttled.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

When it comes to WCU and RCU with LSI and GSI, what is the big difference?

A

GSI has provisioned capacity

LSI uses the RCU AND WCU of the main table

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is DynamoDB DAX?

A

Seamless in-memory cache for DynamoDB.. Solves HOT key problem.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

How many DynamoDB DAX nodes can be in a cluster?

A

10

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

How many AZs should you use for production?

A

A minimum of three

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

When would you use Elasticache instead of DAX with DynamoDB?

A

When you have an aggregate cached result.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

What is DynamoDB Streams?

A

It is an ordered stream of item-level modifications.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What AWS services can consume a DynamoDB stream?

A

Lambda, Kinesis Data Streams, Kinesis Client Library

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

What is the max retention for DynamoDB Streams?

A

24 hours

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

What are the DynamoDB stream types?

A

KEYS_ONLY - Key attributes of the modified item

NEW_IMAGE - Entire Item as it appears after it was modified

OLD_IMAGE -Entire Item as it appears before it was modified

NEW_AND_OLD_IMAGES - Both new and old images of the item

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

What is the max time an expired item is held in DynamoDB?

A

48 hours

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

Does DynamoDB have backup and restore capability?

A

Yes. It has PITR like RDS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

What are DynamoDB Global Tables?

A

Multi-Region, Multi-Active, fully replicated tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

What is DynamoDB Local?

A

Allows you to develop and test apps locally without accessing the web service.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

Does DynamoDB support Federated Logins?

A

Yes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

Is RDS ACID compliant?

A

Yes. All database services are compliant

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

What is the maximum database volume size in Aurora?

A

128TB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

What is the maximum amount of read replicas in Aurora?

A

15

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

Can Aurora backup to S3?

A

Yes, continuous backup to S3 is available.

49
Q

Using Aurora, if you want automatic scaling; what version must you use for automatic scaling?

A

Aurora Serverless

50
Q

What are the two types of LOCKS in RDS?

A

Shared locks - Allows reads and prevents writes

Exclusive Locks: prevent all reads and writes to a resource.`

51
Q

What should the TTL on your DB instance DNS be to support failover?

A

30 seconds or less.

52
Q

What is DocumentDB?

A

It is a NoSQL database similar to MongoDB. JSON Based.

53
Q

In what increments does DocumentDB grow?

A

10GB Increments

54
Q

What is Amazon MemoryDB for Redis?

A

It is a Redis compatible, durable, in-memory database service.

55
Q

What is Amazon Keyspaces?

A

Managed Apache Cassandra NoSQL distributed database.

56
Q

What language do you use to query Amazon Keyspaces?

A

CQL - Cassandra Query Language

57
Q

What is Amazon Neptune?

A

A fully managed graph database.

58
Q

Is Redshift geared for OLAP or OLTP?

A

OLAP

59
Q

What connection types does Redshift support?

A

ODBC, JDBC

60
Q

What are the types of nodes used in a Redshift Cluster?

A

Leader node and compute nodes. Leader nodes create execution plans and delegate the work to the compute nodes.

61
Q

What is the maximum amount of compute nodes you can have in a Redshift Cluster?

A

128 compute nodes

62
Q

Does each compute node in Redshift have its’ own compute, memory, and storage?

A

Yes. This is dependent on the type you choose though.

63
Q

What node type would you use if your want to optimize for storage capacity in Redshift?

A

Dense Storage nodes. These use HDD volumes

64
Q

What node type would you use if your want to optimize for compute in Redshift?

A

Dense compute nodes. These use SSD volumes

65
Q

What sizes do Redshift nodes come in?

A

xlarge or 8xlarge

66
Q

In Redshift, what are compute nodes divided into?

A

Node slices, these use a portion of the resources assigned to the compute node to perform a task.

67
Q

What is Redshift Spectrum?

A

It allows you to query data in S3 (datalake) and allows you to join to your redshift tables.

68
Q

What compression does Redshift support?

A

Gzip and Snappy

69
Q

Why is RedShift so performant?

A

It uses MPP, massive parallel processing, columnar data storage, and column compression.

70
Q

Where does Redshift backup to?

A

It replicates in the cluster, has automated snapshots, and replicates to S3

71
Q

What happens to failed nodes in RedShift?

A

They are automatically replaced.

72
Q

Do single node clusters support replication in Redshift?

A

No. There is nowhere to replicate to.

73
Q

If you want multi-az Redshift support, what cluster type must you use?

A

You must use an RA3 cluster.

74
Q

How does Redshift scale?

A

Vertically and horizontally. A new cluster is created while your old one is available for reads. The CNAME is flipped to the new cluster and data is moved in parallel to new compute nodes.

75
Q

What are the Redshift Node Distribution types?

A

Auto - Based on the size of your data
Even - Distributed across slices in round-robin
Key - Rows are distributed based on a single column
All - The entire table is copied to every node

76
Q

What are Sort Keys in RedShift?

A

They are like indexes and make for fast range queries.

77
Q

What is a single sort key in RedShift?

A

A single column to sort the data

78
Q

What is a compound sort key in RedShift?

A

Made of all columns in the sort key definition. Order is important. Default sort type.

79
Q

What is a interleaved sort key in RedShift?

A

Gives equal weight to each sort key in the list.

80
Q

What is the most efficient way to get data into RedShift?

A

use the COPY command. When using S3 it will need a manifest file.

81
Q

When you want to move a RedShift table into S3, what command would you use?

A

The UNLOAD command.

82
Q

What does enhanced VPC routing do in RedShift

A

When it is enabled, traffic uses the AWS backbone. When it is not enabled, it routes through the internet.

83
Q

What is auto-copy in RedShift?

A

It automatically copies files loaded into S3 into RedShift

84
Q

What is Aurora zero-etl?

A

It automatically replicates from Aurora to Redshift.

85
Q

What is Redshift Streaming Ingestion?

A

It loads data from a Kinesis Data Stream or MSK

86
Q

When you want to copy data that is already in Redshift, what command should you use?

A

INSERT INTO or CREATE TABLE AS

87
Q

Can the COPY command decrypt data?

A

Yes, it can do this as it is being loaded into RedShift.

88
Q

What is the best practice when loading data that has few columns, but lots of rows (narrow table).

A

Load it using a single COPY command. Do not break this up.

89
Q

How do you copy a Redshift snapshot from region to region when it is encrypted?

A

You use a copy grant. You create a KMS key, procide a unique namem and specify the KMS ID in the destination region. You enable copying in the source region.

90
Q

What is Redshift DBLINK?

A

It connects Redshift to PostgreSQL

91
Q

Can you import data from DynamoDB to Redshift?

A

Yes. You can load tables using COPY

92
Q

Does Redshift work with Data Pipeline?

A

Yes

93
Q

What is Redshift WLM?

A

It prioritizes short fast queries vs. long and slow queries.

94
Q

What is Redshift Concurrency scaling?

A

It automatically adds cluster capacity to handle increase n concurrent read queries.

95
Q

What queries in redshift get the most and least concurrency.

A

short queries get more concurrency and long queries get little. This is configurable though and concurrency scaling can be used.

96
Q

What are Redshift query monitoring rules?

A

They monitor your queries and will abort them if they are running longer than allowed.

97
Q

Can Redshift Automatic WLM support Query hopping?

A

No. This must be done in manual mode. Hopping means that you are sending the query to a different queue because it timed out.

98
Q

What is RedShift SQA

A

Short Query Acceleration. This can be used in place of WLM if you only want to accelerate short queries. Can be used with Create Table As and read only queries. You configure the value of what “short” is.

99
Q

What does Vacuum commands do?

A

Recovers space from deleted rows

100
Q

What are the Vacuum types

A

FULL - Default. Resorts and reclaims space
DELETE ONLY - Only reclaims space
SORT ONLY - Only resorts records
REINDEX - Re-Analyze interleaved

101
Q

What is Elastic Resize in Redshift?

A

It allows you to add or remove nodes of the same type.

The cluster goes down for a few min, but it tries to keep connections open.

102
Q

What is classic resize in RedShift?

A

Allows you to change node types or number of nodes

103
Q

What can you do to minimize downtime during resizing your cluster

A

Snapshot, restore, and then resize before changing the primary.

104
Q

What is Redshift data lake export?

A

It UNLOADs your query to S3 in Apache Parquet format. 2x faster and 6x more compressed

105
Q

What can I use in RedShift to UNLOAD data to EMR or SageMaker?

A

Redshift Data Lake Export

106
Q

What are RA3 Nodes in RedShift?

A

It has managed storage. SSD based. Can be independently scaled.

107
Q

What are the spatial data types in RedShift?

A

Geometry and Geography

108
Q

What is required to use cross-region replication in RedShift?

A

The RA3 node type is required. Allows you to share live data.

109
Q

What is AQUA

A

It accelerates the processing of data from S3. Only available on RA3 types.

110
Q

What commands are used to define access privileges for a user or group in RedShift?

A

GRANT or REVOKE

111
Q

Does Redshift automatically scale?

A

No, You must use RedShift Serverless.

112
Q

What is a good use case for Redshift serverless?

A

Ad-hoc business analysis and lower environments.

113
Q

How is RedShift serverless billed?

A

By RPU + Storage fee. RPU = Redshift Processing Units.

114
Q

Does Redshift serverless support spectrum?

A

No

115
Q

What are materialized views in RedShift?

A

It stores the results of the query. Not the query itself like a view. Good for performance.

116
Q

Can you share data across redshift clusters for read purposes?

A

Yes using RedShift data Sharing

117
Q

What is a Redshift Lambda UDF?

A

You can use Lambda inside your SQL queries.

118
Q

How do you define a Redshift Lambda UDF

A

CREATE EXTERNAL FUNCTION

119
Q

What are RedShift Federated Queries?

A

It ties Redshift into RDS and Aurora for PostgreSQL and MySQL. Allows access to live data and removes the ETL process. THIS IS READ ONLY.

120
Q

Using RedShift Federated Queries, can you use RDS to query RedShift?

A

No. This is one way.

121
Q
A