DynamoDB Flashcards
What is DynamoDB?
- DynamoDB is a fully managed NoSQL database
- It is highly available with replication across 3 AZs
- Low cost and autoscaling capabilities
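- In provisioned capacity mode, a table must have provisioned read and write capacity units (an on-demand mode is also available)
- It supports optimistic locking / optimistic concurrency control (via conditional writes)
How is NoSQL different from the AWS RDS options?
- A NoSQL database is non-relational
- You create table(s), not a database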
- Each table must have a primary key
- Each table can hold an unlimited number of items (aka rows), but each item is limited to 400 KB
What is the maximum size of an item (aka row)?
400 KB
What data types are supported in DynamoDB?
- String
- Number
- Binary
- Boolean
- Null
- List
- Map
- String Set
- Number Set
- Binary Set
- Simply put: DynamoDB supports scalar values (one value at a time), document values (List and Map, which hold multiple values of different types), and sets (multiple values of the same type).
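The types listed above correspond to the type descriptors used in the low-level DynamoDB JSON format (S, N, B, BOOL, NULL, L, M, SS, NS, BS). A minimal sketch of that mapping, assuming a hypothetical helper named `to_attr` that is not part of any SDK:

```python
from decimal import Decimal

def to_attr(value):
    """Convert a Python value to a DynamoDB-JSON style type descriptor (illustrative helper)."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):              # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float, Decimal)):
        return {"N": str(value)}             # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (bytes, bytearray)):
        return {"B": bytes(value)}
    if isinstance(value, list):              # document type: mixed element types allowed
        return {"L": [to_attr(v) for v in value]}
    if isinstance(value, dict):              # document type: nested attributes
        return {"M": {k: to_attr(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:   # sets must be non-empty and homogeneous
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float, Decimal)) and not isinstance(v, bool) for v in value):
            return {"NS": sorted(str(v) for v in value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    raise TypeError(f"unsupported value: {value!r}")
```

For example, `to_attr({"tags": {"x", "y"}})` wraps a String Set inside a Map.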
What are requirements for a Primary Key?
Must be unique
Must be ‘diverse’ (high cardinality) so that the data is distributed evenly across partitions
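Why diversity matters can be sketched by hashing keys into a fixed number of partitions. This is only an illustration: MD5 is a stand-in (DynamoDB's internal hash function is not public), and the partition count and key names are assumptions.

```python
import hashlib
from collections import Counter

def partition_for(key: str, partitions: int) -> int:
    """Map a partition key to a partition index (MD5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions

def spread(keys, partitions=4):
    """Count how many items land on each partition."""
    return Counter(partition_for(k, partitions) for k in keys)

# 1000 distinct user ids: items spread over every partition
diverse = spread(f"user-{i}" for i in range(1000))

# only two distinct key values: at most two partitions ever receive traffic
skewed = spread(["active", "inactive"][i % 2] for i in range(1000))
```

With the low-cardinality key, all reads and writes pile onto one or two partitions no matter how many exist.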
What is a sort key?
A DynamoDB primary key can consist of both a partition key and a sort key
The combination must be unique
Sort key == range key
What are Read Capacity Units (RCU)?
The throughput provisioned for reads
What are Write Capacity Units (WCU)?
The throughput provisioned for writes
What is the formula for calculating WCUs?
One WCU = 1 write per second for an item up to 1 KB in size (larger items consume more WCUs, with the size rounded up to the next 1 KB)
Example:
We write 10 objects per second of 2 KB each.
We need 10 * 2 = 20 WCU
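The WCU arithmetic above can be written as a small helper. This is a sketch for standard (non-transactional) writes; the function name is an assumption made for illustration.

```python
import math

def wcu_needed(writes_per_second: float, item_size_kb: float) -> int:
    """WCUs for standard writes: item size is rounded up to the next 1 KB."""
    return math.ceil(writes_per_second * math.ceil(item_size_kb))

print(wcu_needed(10, 2))     # the example above: 10 writes/s of 2 KB
print(wcu_needed(6, 4.5))    # 4.5 KB rounds up to 5 KB per write
```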
What is the default read consistency for DynamoDB?
DynamoDB uses eventually consistent reads by default, but GetItem, Query & Scan provide a “ConsistentRead” parameter you can set to True
What is the formula for calculating DynamoDB RCUs?
One RCU provides:
Strongly Consistent Reads
- 1 read per second for an item up to 4 KB in size
Eventually Consistent Reads
- 2 reads per second for an item up to 4 KB in size
If the items are larger than 4 KB, more RCUs are consumed, with the size rounded up to the next 4 KB (applies to both eventually and strongly consistent reads)
Example:
10 strongly consistent reads per second of 4 KB each
We need 10 * 4 KB / 4 KB = 10 RCU
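The RCU rules above can be sketched the same way. The function name and signature are assumptions for illustration; the calculation follows the 4 KB rounding and the halving for eventually consistent reads.

```python
import math

def rcu_needed(reads_per_second: float, item_size_kb: float,
               strongly_consistent: bool = True) -> int:
    """RCUs for reads: size rounds up to the next 4 KB; eventual consistency costs half."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = reads_per_second * units_per_read
    return math.ceil(total if strongly_consistent else total / 2)

print(rcu_needed(10, 4))                              # the example above
print(rcu_needed(16, 12, strongly_consistent=False))  # 16 * (12 / 4) / 2
print(rcu_needed(10, 6))                              # 6 KB rounds up to 8 KB
```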
What term is used to describe rows in a DynamoDB table?
Items
What term is used to describe columns in a DynamoDB table?
Attributes
Can throughput be exceeded temporarily in a DynamoDB table?
Yes, by using burst credits
What happens if the burst credits are empty in a DynamoDB table?
You’ll get a ‘ProvisionedThroughputExceededException’ error
How do partitions affect RCUs and WCUs?
RCUs and WCUs are spread evenly across partitions
What happens if the defined RCU or WCU in DynamoDB is exceeded?
You get a ‘ProvisionedThroughputExceededException’ error
What are the most common reasons to receive a ‘ProvisionedThroughputExceededException’ in DynamoDB?
- Hot keys: one partition key is being read too many times (a popular item, for example)
- Hot partitions
- Very large items: remember that RCU and WCU consumption depends on the size of the items
What are some possible solutions to a ‘ProvisionedThroughputExceededException’ in DynamoDB?
- Exponential back-off when the exception is encountered (already built into the SDK)
- Distribute partition keys as much as possible
- If RCU issue, use DynamoDB Accelerator (DAX)
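The exponential back-off idea from the list above can be sketched as follows. The SDKs already do this internally, so this is only an illustration; `ThrottledError` is a local stand-in for the real exception, and the delay values are assumptions.

```python
import random
import time

class ThrottledError(Exception):
    """Local stand-in for ProvisionedThroughputExceededException."""

def call_with_backoff(operation, max_attempts=5, base_delay=0.05, max_delay=2.0,
                      sleep=time.sleep, rng=random.random):
    """Retry `operation` on throttling, doubling the delay cap after each failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise                       # out of attempts: surface the error
            # "full jitter": wait a random time up to the capped exponential delay
            delay = min(max_delay, base_delay * (2 ** attempt)) * rng()
            sleep(delay)
```

The `sleep` and `rng` parameters exist only so the behaviour can be exercised without real waiting.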
What is the API command to write data to DynamoDB?
- PutItem
- Consumes WCU
What is the API command to update data in DynamoDB?
UpdateItem
What are conditional writes in DynamoDB?
A write (put, update, or delete) that only succeeds if the item meets a specified condition; otherwise the write is rejected.
Example:
An item can only be updated if value = x
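As a sketch of what such a condition looks like in a request, the helper below builds parameters in the shape of the low-level UpdateItem API. The helper name, table name, and attribute names are hypothetical; nothing is sent to AWS here.

```python
def conditional_update_params(table, key, attr, expected, new_value):
    """Build UpdateItem-style parameters that only apply if `attr` equals `expected`."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": "SET #a = :new",
        "ConditionExpression": "#a = :expected",
        # a name placeholder avoids clashes with DynamoDB reserved words
        "ExpressionAttributeNames": {"#a": attr},
        "ExpressionAttributeValues": {":new": new_value, ":expected": expected},
    }

params = conditional_update_params(
    table="Orders",                        # hypothetical table
    key={"order_id": {"S": "o-1"}},        # hypothetical key attribute
    attr="status",
    expected={"S": "PENDING"},
    new_value={"S": "SHIPPED"},
)
```

With boto3, parameters like these could be passed as `client.update_item(**params)`; when the condition is not met, the service rejects the write with a ConditionalCheckFailedException.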
What is the API call for deleting an item?
DeleteItem
Deletes an individual row (item)
Can you conditionally perform a delete in DynamoDB?
Yes
What is the API call to delete a table in DynamoDB?
DeleteTable
Much quicker than calling DeleteItem for each item in the table.
What is the benefit of Batch Writing in DynamoDB?
- Batching lets you save on latency by reducing the number of API calls made against DynamoDB
- Operations are done in parallel for better efficiency
What happens if part of a batch fails in DynamoDB?
The failed items are returned in the response (as UnprocessedItems for BatchWriteItem), and you retry them using an exponential back-off algorithm
What is the API call for batching writes in DynamoDB? Describe its properties.
- API call = BatchWriteItem
- Up to 25 PutItem and / or DeleteItem in one call
- Up to 16 MB of data written
- Up to 400 KB of data / item
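The 25-request limit means a larger workload has to be split into batches, and anything the service leaves unprocessed has to be resent. A minimal sketch, where `send_batch` is a hypothetical callable standing in for a BatchWriteItem call and returning the items that were not processed:

```python
def chunks(items, size=25):
    """Yield successive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def batch_write_all(items, send_batch, max_rounds=5):
    """Send all items in batches of 25, resending whatever comes back unprocessed.

    Returns the number of calls made to `send_batch`.
    """
    calls = 0
    for batch in chunks(items):
        pending = list(batch)
        for _ in range(max_rounds):
            calls += 1
            pending = send_batch(pending)
            if not pending:
                break
            # a real client would sleep with exponential back-off before retrying
        if pending:
            raise RuntimeError(f"{len(pending)} items still unprocessed")
    return calls
```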
What is the API call for Reading data in DynamoDB? What are its properties?
- API call = GetItem
- Read based on primary key
- Primary Key = HASH or HASH-RANGE
- Eventually consistent read by default
- Option to use strongly consistent reads (more RCU - might take longer)
- ProjectionExpression can be specified to include only certain attributes (similar to JPA call in Java)
What is the API call for batching reads in DynamoDB? What are its properties?
- API call = BatchGetItem
- Up to 100 items
- Up to 16 MB of data
- Items are retrieved in parallel to minimize latency
What are the 2 main ways to get data from a DynamoDB table?
- Query
- Scan
Describe DynamoDB Queries.
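- Return items based on
- Partition key (required, matched with the = operator)
- Sort key (optional)
- FilterExpression (optional; DynamoDB applies it after the items are read and before they are returned, so it does not reduce the RCU consumed)
- Can query a table, a local secondary index, or a global secondary index
Describe DynamoDB scans.
- The entire table is scanned and then data is filtered out (inefficient)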
- Consumes a lot of RCU
How can you increase the performance of DynamoDB scans?
- Use parallel scans, BUT this increases the throughput and RCU consumed
- Can use a ProjectionExpression + FilterExpression (no change to RCU though)
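A parallel scan splits the table into segments using the Segment and TotalSegments parameters of Scan, with one worker per segment. A minimal sketch, where `scan_segment` is a hypothetical callable that wraps the actual Scan call and returns the items for one segment:

```python
from concurrent.futures import ThreadPoolExecutor

def segment_params(table, total_segments):
    """One set of Scan parameters per segment."""
    return [
        {"TableName": table, "Segment": i, "TotalSegments": total_segments}
        for i in range(total_segments)
    ]

def parallel_scan(table, total_segments, scan_segment):
    """Run `scan_segment` once per segment in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        pages = pool.map(scan_segment, segment_params(table, total_segments))
        return [item for page in pages for item in page]
```

More segments finish sooner but consume the table's read capacity faster, which is the trade-off noted above.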
What is a Local Secondary Index (LSI) in DynamoDB?
- Alternate sort (range) key for the table; the partition key stays the same
- Can have up to 5 local secondary indexes per table
- The sort key consists of exactly one scalar attribute
- The attribute that you choose must be a scalar String, Number, or Binary
- LSI must be defined at table creation time
What is a Global Secondary Index (GSI) in DynamoDB?
- Helps speed up queries on non-key attributes
- GSI = partition key + optional sort key
- Must define RCU / WCU for the index
- You CAN add / modify GSI (unlike LSI)
- The index is a new ‘table’ and you can project attributes on it
(T/F) DynamoDB indexes can cause throttling
True
For Global Secondary Indexes (GSI) in DynamoDB, what happens if the writes are throttled?
- The main table will be throttled as well, even if you HAVE provisioned enough WCUs for the MAIN table
- So choose GSIs carefully and be sure that each GSI has enough WCUs assigned to it. Otherwise, a throttled write on the GSI means a throttled write on the main table.
Are there any special throttle considerations for Local Secondary Indexes (LSI)?
No. An LSI uses the WCU and RCU of the main table (because it's part of the main table, unlike a GSI, which is a separate table)
What is DynamoDB DAX?
- DAX = DynamoDB Accelerator
- Seamless cache for DynamoDB, no application rewrite.
- Writes go through DAX to DynamoDB
- Solves the hot key problem
What is the default TTL for DAX caches?
5 minutes
What is the maximum number of nodes in a DynamoDB DAX cluster?
10
What is the difference between DAX and ElastiCache?
- DAX is good for caching individual objects and query and/or scan results
- ElastiCache is good for aggregated data
What are DynamoDB Streams?
- Changes in DynamoDB (Create, Update, Delete) can end up in a DynamoDB Stream
- Streams can be read by AWS Lambda and EC2 instances
- A stream has 24 hours of data retention.
- Records are not retroactively populated in a stream after enabling it
What types of information can be written to a DynamoDB stream?
- KEYS_ONLY: only key attributes of the modified item
- NEW_IMAGE: The entire item, as it appears after it was modified
- OLD_IMAGE: the entire item, as it appears before it was modified
- NEW_AND_OLD_IMAGES: Both the new and old images of the item
What are DynamoDB streams made of? Do you need to provision them?
- DynamoDB streams are made of shards.
- You do NOT need to provision them, this is automated by AWS.
How do DynamoDB Streams and Lambda work together?
- You need to define an Event Source Mapping to read from a DynamoDB stream
- You need to ensure the Lambda function has appropriate permissions
- Your Lambda function is invoked synchronously
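A Lambda function behind such an event source mapping receives batches of stream records. The handler below is a minimal sketch of reading them; what it does with each record (counting and printing) is an assumption for illustration, and which images are present depends on the stream view type.

```python
def handler(event, context):
    """Summarise a batch of DynamoDB Streams records."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        name = record["eventName"]             # INSERT, MODIFY or REMOVE
        change = record["dynamodb"]
        keys = change["Keys"]                  # key attributes of the modified item
        new_image = change.get("NewImage")     # absent for KEYS_ONLY / OLD_IMAGE streams
        old_image = change.get("OldImage")     # absent for KEYS_ONLY / NEW_IMAGE streams
        counts[name] += 1
        print(name, keys, "new" if new_image else "-", "old" if old_image else "-")
    return counts
```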
What is a DynamoDB transaction?
- Ability to create / update / delete multiple items, in one or more tables, as a single operation.
- It's an all-or-nothing type of operation
What is DynamoDB TTL (Time to Live)?
- It is an attribute on an item that defines an expiry date / time (stored as a Number holding a Unix epoch timestamp in seconds)
- TTL is provided at no extra cost
- It is a background task operated by the DynamoDB service itself.
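The TTL attribute holds the expiry time as a Number in Unix epoch seconds. A small sketch of computing it, where the attribute name `expires_at`, the key name, and the 7-day lifetime are assumptions chosen for illustration:

```python
from datetime import datetime, timedelta, timezone

def ttl_epoch(now: datetime, days: int) -> int:
    """Expiry timestamp in whole seconds since the Unix epoch."""
    return int((now + timedelta(days=days)).timestamp())

created = datetime(2024, 1, 1, tzinfo=timezone.utc)
item = {
    "session_id": {"S": "abc123"},                      # hypothetical key attribute
    "expires_at": {"N": str(ttl_epoch(created, 7))},    # hypothetical TTL attribute
}
```

Using a timezone-aware datetime avoids the timestamp silently depending on the machine's local timezone.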
Do DynamoDB TTL deletions utilize RCUs / WCUs?
No
Why would you use TTL on a DynamoDB item?
- Helps reduce storage and manage the table size over time
- May help adhere to regulatory norms
How long does it take for DynamoDB TTL items to be deleted once they’ve hit their expiry date?
Expired items are typically deleted within 48 hours
Are DynamoDB TTL items also deleted from GSI / LSI?
Yes
(T/F) DynamoDB Streams can help recover expired (TTL) items.
True
What is a DynamoDB --projection-expression?
A list of attributes to retrieve
What is a DynamoDB --filter-expression?
A condition used to filter the results that are returned
In what case(s) would you use a DynamoDB transaction?
When data needs to be created/updated/deleted in related tables
Can DynamoDB be used to store session state data?
Yes - probably the best option to do so
What are the four different write types in DynamoDB? Describe each.
- Concurrent Writes - users can write to an item at the same time. Last update wins.
- Conditional Writes - updates/deletes can only occur on an item if the item meets a specific condition
- Atomic Writes - numeric attributes can be incremented or decremented by a given value; concurrent updates are all applied
- Batch Writes - put / delete many items at once
Do DynamoDB tables scale horizontally or vertically?
Horizontally
You would like to perform an efficient Query on an attribute that is not part of your table’s primary key. What do you recommend?
Create a Global Secondary Index (GSI)
Which feature of DynamoDB allows it to achieve Optimistic Locking?
Conditional Writes
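One common way to get optimistic locking out of conditional writes is a version attribute: a write only succeeds if the version is still the one that was read. The in-memory model below is a sketch of that pattern only; the class and exception names are hypothetical, and no DynamoDB call is made.

```python
class ConditionFailed(Exception):
    """Local stand-in for ConditionalCheckFailedException."""

class VersionedStore:
    """Tiny in-memory model of a table whose items carry a `version` attribute."""

    def __init__(self):
        self.items = {}

    def put(self, key, data):
        self.items[key] = {**data, "version": 1}

    def get(self, key):
        return dict(self.items[key])

    def update(self, key, changes, expected_version):
        current = self.items[key]
        # the condition: write only if nobody changed the item since it was read
        if current["version"] != expected_version:
            raise ConditionFailed(key)
        self.items[key] = {**current, **changes, "version": expected_version + 1}
```

If two clients read version 1 and both try to update, the first write bumps the version to 2 and the second fails, so it must re-read and retry.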
What is the difference between a Global Secondary Index (GSI) and a Local Secondary Index (LSI) in DynamoDB?
Global secondary index
an index with a partition key and a sort key that can be different from those on the base table
Local secondary index
an index that has the same partition key as the base table, but a different sort key
How can the inefficiency of table scans be limited in DynamoDB?
You can limit the impact of table scans by using ‘Limit’ to reduce the size of each result page and pausing between requests.
In DynamoDB, can the primary key be defined or changed after table creation?
No. The primary key must be defined at table creation
What types of primary keys does DynamoDB support?
DynamoDB supports two different kinds of primary keys
Partition key – A simple primary key, composed of one attribute known as the partition key.
Partition key and sort key – Referred to as a composite primary key, this type of key is composed of two attributes. The first attribute is the partition key, and the second attribute is the sort key.
What is another name for a partition key in DynamoDB?
Hash attribute (hash key)
What is an alternative name for the sort key in DynamoDB?
Range key
What is the data limit returned from a DynamoDB query?
Returns up to 1 MB of data, or the number of items specified in Limit
Can the results of a DynamoDB query be paginated?
Yes
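Pagination works by passing the LastEvaluatedKey from one response as the ExclusiveStartKey of the next request, until no LastEvaluatedKey is returned. A minimal sketch of that loop, where `query_page` is a hypothetical callable wrapping a single Query call:

```python
def query_all(query_page, params):
    """Collect the items from every page of a paginated Query."""
    items, start_key = [], None
    while True:
        page_params = dict(params)
        if start_key is not None:
            page_params["ExclusiveStartKey"] = start_key
        page = query_page(page_params)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:          # no key returned: this was the last page
            return items
```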
Can indexes (GSI and LSI) be queried in a DynamoDB query?
Yes. A Query can target the table itself, an LSI, or a GSI.
How many Local Secondary Indexes (LSI) can be added to a DynamoDB table?
Up to 5
What is a Global Table in DynamoDB?
A table that is replicated into multiple regions.
Great for multi-region applications because it reduces latency by giving the user access to the table closest to them.
What are some of the main requirements for global tables?
- The same write capacity is required.
- All tables must have the same name.
- All tables must have the same primary key.
- All tables must be empty.
What happens if an update is made to the same item in a global table at the same time?
The last write wins.
What are some recommendations for choosing a sort key in DynamoDB?
Choose a sort key that fits your access patterns; data can be queried on it with
- begins_with
- between
- >
- <
Are Global Secondary Indexes replicated as part of a Global Table in DynamoDB?
No. They must be manually added to each region where needed.
For DynamoDB on-demand backup and restore, what gets backed up?
- Local and Global Secondary Indexes
- DynamoDB Streams
- Provisioned read and write capacity
For DynamoDB on-demand backup and restore, what does not get backed up and restored?
- Autoscaling policies
- Access policies
- CloudWatch metrics and alarms
- Stream settings
- TTL Settings
What is DynamoDB point-in-time recovery? How long is it available?
- Protects from accidental deletion
- No need to schedule on-demand backups
- Available for 35 days (cannot be extended)
- Restorable to any point up to 5 minutes before the current time
For DynamoDB point-in-time recovery, what gets restored?
- Local and Global secondary indices
- Provisioned Read/Write capacity
- Encryption settings
For DynamoDB point-in-time recovery, what does not get restored?
- Auto Scaling policies
- Access policies
- CloudWatch metrics and alarms
- Stream settings
- TTL settings
- Point-in-time recovery settings
What are the main differences in use cases between on-demand backup and point-in-time recovery for DynamoDB?
On-Demand Backup
- Best for regulatory compliance
- Designed for long-term data archival
Point-In-Time Recovery
- Best for disaster recovery
- Can be used for compliance if data retention time is 35 days or below