DynamoDB (ACG) Flashcards

1
Q

What is DynamoDB?

A

DynamoDB is a low latency NoSQL database

2
Q

What data models does DynamoDB support?

A

Supports both document and key-value data models. Supported document formats are JSON, HTML, and XML.

3
Q

What are the different consistency models in DynamoDB?

A
  • Eventually consistent
  • Strongly consistent
  • DynamoDB transactions
4
Q

Name the three core components of DynamoDB

A

DynamoDB consists of:

  1. tables
  2. items
  3. attributes
5
Q

What are the 2 types of primary keys in DynamoDB?

A
  1. Partition Key
  2. Composite Key (partition key + sort key)

6
Q

What is a HASH key?

A

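The partition key was formerly known as the hash key, and the sort key as the range key. The API still uses these names: a key schema marks the partition key as HASH and the sort key as RANGE.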
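
As a rough illustration of where the HASH name still appears, the sketch below builds a KeySchema list in the shape the CreateTable API expects. The helper name `key_schema` and the attribute names `CustomerID` and `OrderDate` are hypothetical; nothing here calls AWS.

```python
def key_schema(partition_key, sort_key=None):
    """Build a KeySchema list in the shape used by CreateTable."""
    # The partition key is always declared with KeyType "HASH".
    schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key is not None:
        # A composite key adds a sort key, declared with KeyType "RANGE".
        schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    return schema


simple = key_schema("CustomerID")                  # partition key only
composite = key_schema("CustomerID", "OrderDate")  # partition key + sort key
print(composite)
```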

7
Q

How do you control access to DynamoDB?

A

IAM (Identity and Access Management).

You can create IAM users within your AWS account with specific permissions to access and create DynamoDB tables.

You can also create IAM roles, enabling temporary access.

8
Q

How do you restrict user access to DynamoDB?

A

You can use a special IAM condition key, "dynamodb:LeadingKeys", to restrict user access to only the items where the partition key matches their User_ID.
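
A minimal sketch of such a policy, written as a Python dict so it can be printed as JSON. The table ARN, the list of allowed actions, and the `${www.amazon.com:user_id}` substitution variable (which applies to Login with Amazon federation) are assumptions for illustration; the right variable depends on your identity provider.

```python
import json


def leading_keys_policy(table_arn):
    """Return an IAM policy that limits access to items keyed by the caller's ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Hypothetical action list; grant only what the app needs.
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": table_arn,
                "Condition": {
                    # Every partition key value touched must equal the user's ID.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"]
                    }
                },
            }
        ],
    }


# Hypothetical account number and table name.
policy = leading_keys_policy("arn:aws:dynamodb:us-east-1:123456789012:table/UserData")
print(json.dumps(policy, indent=2))
```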

9
Q

What is a secondary index?

A

A secondary index allows you to perform more flexible querying within DynamoDB.

Queries can be run on non-primary key attributes using global secondary indexes and local secondary indexes.

A secondary index allows you to perform fast queries on specific attributes in a table, rather than on the entire dataset.

10
Q

A local secondary index can only be created when you are creating your table. True or False?

A

TRUE

It also has the same partition key as your primary table, but a different sort key.

11
Q

A global secondary index can only be created when you create your table. True or False?

A

FALSE

A global secondary index can be created at any time.

It can use a different partition key and a different sort key from your table.
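
To make the contrast with a local secondary index concrete, here is a small sketch of the two index definitions as plain dicts. The helper names and attribute names are hypothetical. The shapes follow the LocalSecondaryIndexes entry of CreateTable and the GlobalSecondaryIndexUpdates entry of UpdateTable; on a provisioned-capacity table the global index would also need a ProvisionedThroughput setting, which is left out here.

```python
def local_index(name, table_partition_key, index_sort_key):
    """LSI definition: same partition key as the table, different sort key."""
    return {
        "IndexName": name,
        "KeySchema": [
            {"AttributeName": table_partition_key, "KeyType": "HASH"},
            {"AttributeName": index_sort_key, "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }


def global_index_create(name, partition_key, sort_key):
    """GSI 'Create' action: its own partition key and sort key."""
    return {
        "Create": {
            "IndexName": name,
            "KeySchema": [
                {"AttributeName": partition_key, "KeyType": "HASH"},
                {"AttributeName": sort_key, "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    }


# Hypothetical Orders table keyed on CustomerID + OrderDate.
lsi = local_index("ByOrderTotal", "CustomerID", "OrderTotal")
gsi = global_index_create("ByProduct", "ProductID", "OrderDate")
print(lsi["KeySchema"][0], gsi["Create"]["KeySchema"][0])
```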

12
Q

Should you use a Query or Scan?

A

Avoid scans!

Queries are much more efficient than scans. A scan reads every item in the table and only then filters out the values you do not want. Scans can easily eat up all your provisioned throughput as the table grows.

13
Q

A query finds items in a table using only the primary key attribute. You provide the primary key name and a distinct value to search for. True or False?

A

TRUE

14
Q

Query results are always sorted by the sort key, if there is one, in ascending order by default. True or False?

A

TRUE

15
Q

For queries and scans, what parameter is used to refine results?

A

ProjectionExpression parameter

16
Q

What parameter can you use to reverse the sort order?

A

Set ScanIndexForward parameter to false to reverse the order of query results.
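
A minimal sketch of the request parameters for a Query that returns the newest items first. The table name `Orders` and the attribute names are hypothetical; the dict uses the low-level API's attribute-value format and would be passed as keyword arguments to an SDK query call, which is not made here.

```python
def latest_orders_query(customer_id, limit=10):
    """Build Query parameters that return a customer's newest orders first."""
    return {
        "TableName": "Orders",  # hypothetical table
        "KeyConditionExpression": "CustomerID = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
        # Return only these attributes instead of whole items.
        "ProjectionExpression": "OrderDate, OrderTotal",
        # False reverses the default ascending sort-key order.
        "ScanIndexForward": False,
        "Limit": limit,
    }


params = latest_orders_query("C123", limit=5)
print(params["ScanIndexForward"])
```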

17
Q

How can you reduce the impact of a query or scan?

A
  • Set a smaller page size, so each request uses fewer read operations
  • For scans, isolate scan operations to specific tables and segregate them from your mission-critical traffic
  • Alternatively, try parallel scans rather than the default sequential scans
  • Prefer a query, which is generally more efficient than a scan
  • Avoid scans where possible, and design tables so that you can use the Query, GetItem or BatchGetItem APIs
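
The page-size and parallel-scan points can be sketched as request parameters. The Scan API takes `Segment` and `TotalSegments` to split a table between workers, and `Limit` to cap the page size; the helper name and the table name `Orders` are hypothetical, and no request is actually sent.

```python
def parallel_scan_requests(table_name, total_segments, page_size=100):
    """Build one set of Scan parameters per worker for a parallel scan."""
    return [
        {
            "TableName": table_name,
            "Segment": segment,              # this worker's slice (0-based)
            "TotalSegments": total_segments,  # number of slices overall
            "Limit": page_size,              # smaller pages, fewer reads per call
        }
        for segment in range(total_segments)
    ]


requests = parallel_scan_requests("Orders", total_segments=4, page_size=50)
for request in requests:
    print(request)
```

Each worker would issue its own Scan with one of these dicts, so the segments can be read concurrently.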
18
Q

Name commonly used DynamoDB CLI commands

A

Be aware that the user must have the correct IAM permissions to run these commands. CLI commands are making calls to a DynamoDB API, e.g. CreateTable, PutItem, etc.

create-table
put-item
get-item
update-item
update-table
list-tables
describe-table
scan
query
delete-item
delete-table
19
Q

What is DynamoDB provisioned throughput measured in?

A

Capacity units:
1 write capacity unit = 1KB write per second

1 read capacity unit =
1 x strongly consistent read of 4KB per second
OR
2 x eventually consistent read of 4KB per second

20
Q

Your app needs to read 80 items per second. Each item is 3KB in size. You need strongly consistent reads. How many read capacity units will you need?

A

1 read capacity unit = 4KB strongly consistent read per second

3KB / 4KB = 0.75, rounded up to the next whole number = 1 read capacity unit per item
1 x 80 items per second = 80 read capacity units

For eventually consistent reads:
1 read capacity unit = 2 x 4KB eventually consistent reads per second

3KB / 4KB = 0.75, rounded up to the next whole number = 1
1 x 80 items per second / 2 = 40 read capacity units
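
The same arithmetic as a small function, offered as a sketch of the rule on this card rather than an official calculator. The function name is made up for illustration.

```python
import math


def read_capacity_units(items_per_second, item_size_kb, strongly_consistent=True):
    """Estimate RCUs: round each item up to 4KB blocks, then scale by read rate."""
    units_per_item = math.ceil(item_size_kb / 4)  # 4KB per read unit
    rcus = units_per_item * items_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        rcus = math.ceil(rcus / 2)
    return rcus


print(read_capacity_units(80, 3))                             # 80
print(read_capacity_units(80, 3, strongly_consistent=False))  # 40
```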

21
Q

You want to write 100 items per second. Each item is 512 bytes in size. How many write capacity units will you need?

A

1 write capacity unit = 1KB write per second

512 bytes / 1024 bytes (1KB) = 0.5, rounded up to the next whole number = 1 write capacity unit per item

1 x 100 write items per second = 100 write capacity units
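
A sketch of the write-side calculation under the same rule (1 write capacity unit per 1KB, rounded up per item). The function name is hypothetical.

```python
import math


def write_capacity_units(items_per_second, item_size_bytes):
    """Estimate WCUs: round each item up to 1KB blocks, then scale by write rate."""
    units_per_item = math.ceil(item_size_bytes / 1024)  # 1KB per write unit
    return units_per_item * items_per_second


print(write_capacity_units(100, 512))  # 100
```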

22
Q

When would you use DynamoDB On-Demand Capacity?

A
  • Unpredictable application traffic
  • Pay-per-use model

23
Q

When would you use DynamoDB Provisioned Capacity?

A
  • Read and write capacity requirements can be forecasted
  • Application traffic is consistent or increases gradually

24
Q

What is DAX?

A

DynamoDB Accelerator (DAX) is a fully managed, clustered in-memory cache for DynamoDB.

Delivers up to a 10x performance improvement for reads only (writes are not accelerated). Microsecond performance for millions of requests per second.

25
Q

DAX is ideal for which types of apps?

A

Read-heavy and bursty workloads like auctions, gaming, and retail sites during Black Friday promotions.

26
Q

DAX is a write-through caching service. What does this mean?

A

Data is written to the cache and to the backend store at the same time.
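
A toy illustration of the write-through idea, using plain dicts in place of the cache and the table. It is only a sketch of the pattern: it does not model DAX itself (no cluster, eviction, or item expiry), and the class name is made up.

```python
class WriteThroughCache:
    """Toy write-through cache in front of a dict-like backing store."""

    def __init__(self, backend):
        self.backend = backend  # stands in for the DynamoDB table
        self.cache = {}         # stands in for the in-memory cache

    def put(self, key, value):
        # Write-through: update the store and the cache in the same operation.
        self.backend[key] = value
        self.cache[key] = value

    def get(self, key):
        if key in self.cache:
            return self.cache[key]   # cache hit
        value = self.backend[key]    # cache miss: fall back to the store
        self.cache[key] = value      # keep it for the next read
        return value


table = {}
dax_like = WriteThroughCache(table)
dax_like.put("session-1", {"user": "alice"})
print(table["session-1"], dax_like.get("session-1"))
```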

27
Q

When is DAX not suitable?

A

Not suitable for apps that require strongly consistent reads, or that are mainly write-intensive.

Caters for eventually consistent reads only.

28
Q

To use DAX, where do you point your API calls?

A

Point your API calls to your DAX cluster instead of your table. If the item is in the cache, DAX returns it; otherwise it performs an eventually consistent GetItem operation against your DynamoDB table.

29
Q

What is DynamoDB TTL?

A

Time To Live (TTL)
Defines an expiry time for your data. Once expired, an item is marked for deletion.

Great for removing irrelevant or old data, e.g. session data, event logs, temporary data.

Helps save you money by reducing the cost of your table by automatically removing data.
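
The expiry time is stored on each item as a Number attribute holding a Unix epoch timestamp in seconds. A minimal sketch of stamping an item before it is written follows; the attribute name `expires_at` and the helper name are hypothetical, and the attribute would have to be the one configured as the table's TTL attribute.

```python
import time


def with_ttl(item, ttl_seconds, now=None, attribute="expires_at"):
    """Return a copy of the item with an epoch-seconds expiry attribute added."""
    now = int(time.time()) if now is None else int(now)
    stamped = dict(item)  # leave the caller's dict untouched
    stamped[attribute] = now + ttl_seconds
    return stamped


# Fixed 'now' so the result is reproducible; omit it to use the current time.
session = with_ttl({"session_id": "abc123"}, ttl_seconds=3600, now=1_700_000_000)
print(session)  # {'session_id': 'abc123', 'expires_at': 1700003600}
```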

30
Q

What can you do with DynamoDB Streams?

A

Every item-level change (create/update/delete) is recorded in an encrypted log that is stored for 24 hours. You can trigger a Lambda function, or have your application react, whenever an item in the DynamoDB table changes.
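
A sketch of what a stream-triggered Lambda handler might look like. It relies on the stream event carrying a `Records` list whose entries have an `eventName` of INSERT, MODIFY or REMOVE; the counting logic and the sample event are illustrative only.

```python
def handler(event, context):
    """Count the item-level changes delivered in one stream batch."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        name = record.get("eventName")
        if name in summary:
            summary[name] += 1
    return summary


# Hand-built sample event with hypothetical keys, for local testing.
sample_event = {
    "Records": [
        {"eventName": "INSERT", "dynamodb": {"Keys": {"CustomerID": {"S": "C1"}}}},
        {"eventName": "MODIFY", "dynamodb": {"Keys": {"CustomerID": {"S": "C1"}}}},
        {"eventName": "INSERT", "dynamodb": {"Keys": {"CustomerID": {"S": "C2"}}}},
    ]
}
print(handler(sample_event, None))
```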

31
Q

What is the type of sequence for DynamoDB streams?

A

DynamoDB Streams is a time-ordered sequence of item level modifications in your DynamoDB tables.

32
Q

How long are DynamoDB Streams stored?

A

Data is encrypted and stored for 24 hours.

33
Q

What happens if your app is making too many requests into the DynamoDB table?

A

ProvisionedThroughputExceededException error

If you are using the AWS SDK, it will automatically retry the request using exponential backoff, up to its configured retry limit.

Otherwise, reduce your request frequency in your app. Use exponential backoff.

34
Q

What is exponential backoff?

A

Progressively longer waits between consecutive retries, for improved flow control in the hope traffic dies down to a normal level.

Every AWS SDK features exponential backoff.
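
For code that does not go through an SDK, the idea can be sketched by hand. `ThrottledError` is a stand-in for a throttling error such as ProvisionedThroughputExceededException, and the base delay, cap and retry count are arbitrary choices for illustration, not the SDK's actual values.

```python
import time


class ThrottledError(Exception):
    """Stand-in for a throttling error from the service."""


def backoff_delays(max_retries, base=0.05, cap=5.0):
    """Delays that double on each retry, limited to a maximum of `cap` seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_backoff(operation, max_retries=5, base=0.05, cap=5.0, sleep=time.sleep):
    """Call operation(), waiting progressively longer after each throttle."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return operation()
        except ThrottledError:
            sleep(delay)  # wait before the next attempt
    return operation()    # final attempt; any error now propagates


print(backoff_delays(5, base=1.0, cap=8.0))  # [1.0, 2.0, 4.0, 8.0, 8.0]
```

The `sleep` parameter is injectable so the retry loop can be exercised in tests without real waiting.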

35
Q

What is the API call to retrieve multiple items from a DynamoDB table?

A

BatchGetItem

36
Q

Which of the following DynamoDB API calls is used to add a new item into a DynamoDB table?

A

PutItem

37
Q

A user would like to use the AWS CLI to get the attributes for an item with a specific primary key stored in a DynamoDB table. In order to do this, they will need IAM permissions for which of the following?

A

GetItem

38
Q

You have an application that needs to read 25 items per second and each item is 13KB in size. Your application uses strongly consistent reads. What should you set the read throughput to?

A

1 RCU = 1 x 4KB strongly consistent read per second
13KB / 4KB = 3.25 round up to 4
4 x 25 = 100

Answer: 100

39
Q

In terms of performance, a scan is more efficient than a query. True or False?

A

FALSE

40
Q

Your DynamoDB table is throwing a ProvisionedThroughputExceededException error. What could be the problem?

A

Your application’s request rate is too high

41
Q

Your low latency web application needs to store its session state in a scalable way so that it can be accessed quickly. Which service do you recommend?

A

DynamoDB

42
Q

Your application is storing customer order data in DynamoDB. Which of the following pairs of attributes would make the best composite key to allow you to query DynamoDB efficiently to find a customer order that was placed on a specific day?

A

CustomerID + OrderDate

43
Q

In order to address performance considerations and achieve faster response times, which of the following best practices can be used for Querying and Scanning Data in Amazon DynamoDB large tables?

A

Reduce the page size to return fewer items per results page.

Run parallel scans if the table size is 20 GB or larger, the table’s provisioned read throughput is not being fully used, and sequential Scan operations are too slow.

44
Q

Which of the following approaches is used in AWS to improve flow control by retrying failed requests using progressively longer waits between retries?

A

Exponential Backoff

45
Q

What is the difference between a Global Secondary Index and a Local Secondary Index?

A

You can create a Global Secondary Index at any time, but you can only create a Local Secondary Index at table creation time.

You can also delete a Global Secondary Index at any time.

46
Q

You are running a query on your Customers table in DynamoDB, however you only want the query to return CustomerID and EmailAddress for each item in the table, how can you refine the query so that it only includes the required attributes?

A

Use the ProjectionExpression parameter

When using Query, or Scan, DynamoDB returns all of the item attributes by default. To get just some, rather than all of the attributes, use a Projection Expression.

47
Q

Your application is storing customer order data in a DynamoDB table. You need to run a query to find all the orders placed by a specific customer in the last month, which attributes would you use in your query?

A

The Partition Key of CustomerID and a Sort Key of OrderDate
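
A sketch of the matching Query parameters, assuming OrderDate is stored as an ISO 8601 date string (so string order matches date order) and treating "last month" as the last 30 days. The table name `Orders` and the helper name are hypothetical, and no request is sent.

```python
from datetime import date, timedelta


def recent_orders_query(customer_id, today, days=30):
    """Build Query parameters for one customer's orders in a date window."""
    start = today - timedelta(days=days)
    return {
        "TableName": "Orders",  # hypothetical table
        # Partition key equality plus a range condition on the sort key.
        "KeyConditionExpression": (
            "CustomerID = :cid AND OrderDate BETWEEN :start AND :end"
        ),
        "ExpressionAttributeValues": {
            ":cid": {"S": customer_id},
            ":start": {"S": start.isoformat()},
            ":end": {"S": today.isoformat()},
        },
    }


params = recent_orders_query("C123", date(2024, 3, 31))
print(params["ExpressionAttributeValues"])
```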

48
Q

You have a motion sensor which writes 600 items of data every minute. Each item consists of 5KB. What should you set the write throughput to?

A

1 WCU = 1KB write per second
5KB / 1KB = 5 WCUs per item
600 writes per min / 60 sec = 10 write items per sec
10 x 5 = 50

Answer: 50

49
Q

By default, a DynamoDB query operation is used for which of the following?

A

Find items in a table based on the partition key attribute

50
Q

DynamoDB is a NoSQL database provided by AWS. True or False?

A

TRUE

51
Q

Which feature can you use to capture a time-ordered sequence of all the activity which occurs on your DynamoDB table - e.g. insert, update, delete?

A

DynamoDB Streams