DynamoDB Flashcards
What is DynamoDB?
A key-value and document (NoSQL) database that delivers single-digit millisecond performance at any scale. It’s a fully managed, multi-Region, multi-master, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications.
What does DynamoDB use for security administration and auth?
IAM
Is DynamoDB replicated?
Yes, across 3 AZs
What are DynamoDB features?
- Scales to massive workloads, distributed database
- Millions of requests per second, trillions of rows, 100s of TB of storage
- Fast and consistent in performance (low latency on retrieval)
- Enables event driven programming with DynamoDB Streams
- Low cost and auto scaling capabilities
What is DynamoDB made of?
tables
Which is cheaper in DynamoDB: reads or writes?
Reads: 1 WCU costs roughly 5x as much as 1 RCU
What must be defined for throughput on DynamoDB provisioned mode tables?
Read Capacity Units (RCU) and Write Capacity Units (WCU)
What is a DynamoDB Read Capacity Unit?
- 1 RCU = 1 strongly consistent read of 4 KB per second
- 1 RCU = 2 eventually consistent reads of 4 KB per second
What is a DynamoDB Write Capacity Unit?
• 1 WCU = 1 write of 1 KB per second
What do you need if DynamoDB throughput must be temporarily exceeded?
burst credit
What happens if your DynamoDB burst credit is empty?
you’ll get a “ProvisionedThroughputExceededException”
Can you set up auto scaling for DynamoDB RCU and WCU?
yes
What are the read/write capacity modes of DynamoDB?
Provisioned (Free tier)
On demand
What is DynamoDB Streams?
It captures changes to the items of a DynamoDB table (Create, Update, Delete) in a stream so that they can be processed
What can read DynamoDB Streams?
Lambda
What are good scenarios for using DynamoDB Streams?
• React to changes in real time (welcome email to new users)
• Analytics
• Create derivative tables / views
• Insert into ElasticSearch
What can you use to implement cross-region replication (CRR) on DynamoDB?
DynamoDB Streams
What is the retention time for data on DynamoDB Streams?
1 day
Does DynamoDB support transactions?
Yes
Can you access DynamoDB without internet?
Yes, through VPC endpoints
Can you encrypt data in DynamoDB?
yes, at rest (KMS) and in transit (SSL / TLS)
Can you restore tables using DynamoDB?
yes, it has point-in-time recovery (PITR), and restoring does not affect performance
What is AWS DMS?
AWS Database Migration Service (AWS DMS) is a cloud service that makes it easy to migrate relational databases, data warehouses, NoSQL databases, and other types of data stores
What is DynamoDB Global Tables?
It provides a fully managed solution for deploying a multi-Region (CRR), multi-master database, without having to build and maintain your own replication solution
What must you enable for DynamoDB Global Tables to work?
DynamoDB Streams
How can you build applications that react to data modifications in DynamoDB tables?
Amazon DynamoDB is integrated with AWS Lambda so that you can create triggers—pieces of code that automatically respond to events in DynamoDB Streams.
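A minimal sketch of such a trigger, assuming a Lambda function subscribed to the table's stream. The record fields used here (`Records`, `eventName`, `dynamodb`, `NewImage`) follow the DynamoDB Streams event shape; the `user_id` attribute and the "collect new users" behaviour are illustrative assumptions, and `NewImage` is only present when the stream view type includes the new image.

```python
def handler(event, context):
    """Collect the user_id of every newly inserted item in a stream batch."""
    new_user_ids = []
    for record in event.get("Records", []):
        # eventName is INSERT, MODIFY or REMOVE; only new items matter here.
        if record.get("eventName") != "INSERT":
            continue
        new_image = record["dynamodb"].get("NewImage", {})
        # Attribute values arrive in DynamoDB's typed format, e.g. {"S": "abc"}.
        if "user_id" in new_image:
            new_user_ids.append(new_image["user_id"]["S"])
    # A real trigger would act on these (for example, send a welcome email).
    return new_user_ids
```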
What is a NoSQL database?
NoSQL databases are non-relational, distributed databases that scale horizontally
What are DynamoDB tables composed of?
- Each table has a primary key (must be decided at creation time)
- Each table can have an infinite number of rows
- Each row has attributes (can be added over time – can be null)
What is DynamoDB item max size?
The maximum size of an item is 400KB
What are two options to define a DynamoDB primary key?
- Option 1: Partition key only (HASH)
- Option 2: Partition key + Sort key (HASH + RANGE)
What conditions must a DynamoDB Partition Key only primary key meet?
o Partition key must be unique for each item
o Partition key must be “diverse” so that the data is distributed
o Example: user_id for a users table
What conditions must a DynamoDB Partition Key + Sort Key primary key meet?
o The combination must be unique
o Data is grouped by partition key
o Sort key is also known as range key
o Example: users-games table, with user_id for the partition key and game_id for the sort key
What is advised if you get a DynamoDB ProvisionedThroughputExceededException?
o Exponential back-off when exception is encountered (already in SDK)
o Distribute partition keys as much as possible
o If RCU issue, we can use DynamoDB Accelerator (DAX)
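The first piece of advice can be sketched as follows. The SDKs already retry for you, so this is only an illustration of the idea; `ThrottledError` is a hypothetical stand-in for the throttling exception, and the delay values are arbitrary assumptions.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error returned by the service."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Run operation(), retrying throttled calls with exponential back-off."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # The cap doubles on every retry; random jitter spreads out clients.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
```

The `sleep` parameter is injectable only so the function can be exercised without actually waiting.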
Solve these DynamoDB WCU examples:
• Example 1: we write 10 objects per second of 2 KB each.
• Example 2: we write 6 objects per second of 4.5 KB each
• Example 3: we write 120 objects per minute of 2 KB each
o We need 2 * 10 = 20 WCU
o We need 6 * 5 = 30 WCU (4.5 gets rounded to the upper KB)
o We need 120 / 60 * 2 = 4 WCU
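The arithmetic in these three answers can be written as a small helper. This is a sketch of the rule stated in the cards (1 WCU = 1 write of 1 KB per second, item size rounded up to the next KB); the function name is made up for illustration.

```python
import math


def required_wcu(writes_per_second: float, item_size_kb: float) -> int:
    """WCU needed for a steady write rate; item size rounds up to the next 1 KB."""
    return math.ceil(writes_per_second * math.ceil(item_size_kb))


print(required_wcu(10, 2))        # Example 1
print(required_wcu(6, 4.5))       # Example 2: 4.5 KB counts as 5 KB
print(required_wcu(120 / 60, 2))  # Example 3: 120 per minute = 2 per second
```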
What is DynamoDB Eventually Consistent Reads?
If we read just after a write, it’s possible we’ll get stale data because replication has not finished yet
What is DynamoDB Strongly Consistent Reads?
If we read just after a write, we will get the correct data
What Consistent Read option is used by default on DynamoDB?
By default DynamoDB uses Eventually Consistent Reads, but GetItem, Query & Scan provide a “ConsistentRead” parameter you can set to True
Solve these DynamoDB RCU examples:
• Example 1: 10 strongly consistent reads per second of 4 KB each
• Example 2: 16 eventually consistent reads per second of 12 KB each
• Example 3: 10 strongly consistent reads per second of 6 KB each
o We need 10 * 4 KB / 4 KB = 10 RCU
o We need (16 / 2) * ( 12 / 4 ) = 24 RCU
o We need 10 * 8 KB / 4 = 20 RCU (we have to round up 6 KB to 8 KB)
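The same three answers as a helper function. This is a sketch of the rule from the RCU card (item size rounded up to the next 4 KB, eventually consistent reads cost half); the function name is made up for illustration.

```python
import math


def required_rcu(reads_per_second: float, item_size_kb: float,
                 strongly_consistent: bool = True) -> int:
    """RCU needed for a steady read rate; item size rounds up to the next 4 KB."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = reads_per_second * units_per_read
    if not strongly_consistent:
        total /= 2  # 1 RCU covers 2 eventually consistent reads
    return math.ceil(total)


print(required_rcu(10, 4))                             # Example 1
print(required_rcu(16, 12, strongly_consistent=False))  # Example 2
print(required_rcu(10, 6))                             # Example 3: 6 KB counts as 8 KB
```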
How are DynamoDB WCU and RCU spread?
WCU and RCU are spread evenly between partitions (if you have 100 WCU and 10 partitions, you will end up with 10 WCU per partition)
How is data allocated in DynamoDB?
Data is divided into partitions. Each partition key goes through a hashing algorithm that determines which partition the item is stored in
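To picture the idea, here is a toy model of hash-based placement. DynamoDB's real hash function and partition layout are internal, so the use of MD5 and a modulo are purely illustrative assumptions.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition number (toy model, not DynamoDB's algorithm)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# The same key always lands on the same partition; different keys spread out.
for key in ("user_1", "user_2", "user_3"):
    print(key, "->", partition_for(key, 10))
```

This also shows why a "diverse" partition key matters: if most requests use the same key, they all hash to the same partition.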
What are 3 reasons to get DynamoDB ProvisionedThroughputExceededExceptions?
- Hot keys: one partition key is being read too many times (a popular item, for example)
- Hot partitions
- Very large items: remember that RCU and WCU depend on the size of items
What is DynamoDB PutItem API?
Write data to DynamoDB (create data or full replace)
o Consumes WCU
What is DynamoDB UpdateItem API?
Update data in DynamoDB (partial update of attributes)
o Consumes WCU
o Possibility to use Atomic Counters and increase them
What are DynamoDB Conditional Writes?
o Accept a write / update / delete only if conditions are respected, otherwise reject
What is the impact of DynamoDB Conditional Writes in performance?
no impact
What do DynamoDB Conditional Writes help with?
Helps with concurrent access to items
What is DynamoDB DeleteItem API?
Delete an individual row
What is DynamoDB DeleteTable API?
o Delete a whole table and all its items
o Much quicker deletion than calling DeleteItem on all items
What are DynamoDB BatchWriteItem API limits?
o Up to 25 PutItem and / or DeleteItem in one call
o Up to 16 MB of data written
o Up to 400 KB of data per item
How does the DynamoDB BatchWriteItem API improve performance?
- Batching allows you to save in latency by reducing the number of API calls done against DynamoDB
- Operations are done in parallel for better efficiency
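Because one BatchWriteItem call accepts at most 25 put / delete requests, a larger list has to be split before sending. A minimal sketch of that split (the helper name is made up; sending each batch and retrying unprocessed items is left out):

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_WRITE_ITEMS = 25  # BatchWriteItem limit per call


def chunked(items: Sequence[T], size: int = MAX_BATCH_WRITE_ITEMS) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


batches = list(chunked(list(range(60))))
print([len(batch) for batch in batches])  # 60 items become 3 calls instead of 60
```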
What is DynamoDB GetItem API?
o Read based on Primary key
o Primary Key = HASH or HASH-RANGE
What are DynamoDB BatchGetItem API limits?
o Up to 100 items
o Up to 16 MB of data
o Items are retrieved in parallel to minimize latency
What is DynamoDB Query API?
Returns items based on:
o PartitionKey
o SortKey value
o FilterExpression
Able to do pagination on the results
What restriction applies to the partition key in a DynamoDB Query API call?
it must use the = operator
What is particular about the sort key in a DynamoDB Query API call?
it is optional and can use several operators (=, <, <=, >, >=, BETWEEN, begins_with)
How much data can you get by using DynamoDB Query API call?
o Up to 1 MB of data
o Or number of items specified in Limit
What can you query on DynamoDB by using Query API?
Can query table, a local secondary index, or a global secondary index
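A sketch of how the Query rules above translate into request parameters. It only builds the dictionary; the parameter names (`TableName`, `KeyConditionExpression`, `ExpressionAttributeValues`, `Limit`) are those of the low-level Query API, while the table and attribute names are borrowed from the users-games example as an assumption.

```python
def build_query(table_name: str, user_id: str, game_prefix: str, limit=None) -> dict:
    """Build Query parameters: '=' on the partition key, begins_with on the sort key."""
    params = {
        "TableName": table_name,
        # The partition key must use '='; the sort key condition is optional.
        "KeyConditionExpression": "user_id = :u AND begins_with(game_id, :g)",
        "ExpressionAttributeValues": {
            ":u": {"S": user_id},
            ":g": {"S": game_prefix},
        },
    }
    if limit is not None:
        params["Limit"] = limit  # stop after this many items
    return params


params = build_query("users-games", "userID123", "chess", limit=10)
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("dynamodb").query(**params)
```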
What is DynamoDB Scan API?
Scans the entire table and then filters out data, which consumes a lot of RCU (inefficient)
How much data can you get by using DynamoDB Scan API call?
Returns up to 1 MB of data – use pagination to keep on reading
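The "keep on reading" loop can be sketched like this. Query and Scan responses include a `LastEvaluatedKey` when more data remains, and that key is passed back as `ExclusiveStartKey` on the next call. Here `fetch_page` is a hypothetical callable wrapping that call, and `fake_scan` is an in-memory stand-in so the loop can run without AWS.

```python
def read_all_pages(fetch_page):
    """Call fetch_page(start_key) until the response has no LastEvaluatedKey."""
    items = []
    start_key = None
    while True:
        page = fetch_page(start_key)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            return items


def fake_scan(rows, page_size):
    """In-memory stand-in for a paginated Scan (illustration only)."""
    def fetch_page(start_key):
        start = 0 if start_key is None else start_key["index"]
        end = start + page_size
        page = {"Items": rows[start:end]}
        if end < len(rows):
            page["LastEvaluatedKey"] = {"index": end}
        return page
    return fetch_page


print(read_all_pages(fake_scan(["a", "b", "c", "d", "e"], page_size=2)))
```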
What is DynamoDB LSI?
Local Secondary Index is an alternate sort key for your table, local to the hash key
How many DynamoDB LSI can you define?
Up to five local secondary indexes per table.
What rules apply to a DynamoDB LSI?
- The attribute that you choose must be a scalar String, Number, or Binary
- LSI must be defined at table creation time
What is DynamoDB GSI?
To speed up queries on non-key attributes, use a Global Secondary Index
• GSI = partition key + optional sort key
What does a DynamoDB GSI represent?
The index is a new “table” and we can project attributes on it
o The partition key and sort key of the original table are always projected (KEYS_ONLY)
o Can specify extra attributes to project (INCLUDE)
o Can use all attributes from main table (ALL)
What must you define for DynamoDB GSI indexes?
Must define RCU / WCU for the index
What can you do with a DynamoDB GSI that you cannot do with an LSI?
Add or modify it after the table is created (an LSI must be defined at table creation time)
What is DynamoDB in terms of concurrency?
An optimistic locking / concurrency database. That means that you can ensure an item hasn’t changed before altering it
What code changes do you need to do to use DynamoDB DAX?
no application re-write
What happens to writes and reads in DynamoDB DAX?
- Writes go through DAX to DynamoDB
- Microsecond latency for cached reads & queries
What is the TTL cache value by default for DynamoDB DAX?
5 minutes
How many nodes can you set when using DynamoDB DAX?
Up to 10, 3 recommended for Multi-AZ
What is a good scenario for combining DynamoDB DAX + ElastiCache?
- DAX stores Individual Objects, Query, Scan cache
- ElastiCache stores Aggregation Results
What can DynamoDB Streams be used for?
o React to changes in real time (welcome email to new users)
o Analytics
o Create derivative tables / views
o Insert into ElasticSearch
o CRR
What options do you have in terms of what data is written to a DynamoDB stream?
o KEYS_ONLY — Only the key attributes of the modified item.
o NEW_IMAGE —The entire item, as it appears after it was modified.
o OLD_IMAGE —The entire item, as it appeared before it was modified.
o NEW_AND_OLD_IMAGES — Both the new and the old images of the item.
What are DynamoDB streams made of?
of shards, like Kinesis, but you don’t provision shards as you do in Kinesis
What records end up in a DynamoDB stream?
New changes. Records are not retroactively populated in a stream after enabling it
How much WCU and RCU do TTL deletions use in DynamoDB?
None: deletions done by TTL do not consume WCU or RCU
What is DynamoDB TTL?
A background task operated by DynamoDB that automatically deletes an item after an expiry date / time
What is DynamoDB TTL applied to?
TTL is enabled per row (you define a TTL column, and add a timestamp)
Deleted items due to TTL are also deleted in GSI / LSI
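The timestamp in the TTL column is stored as a Number holding the expiry time in Unix epoch seconds. A minimal sketch of computing it; the attribute name `expires_at` and the 7-day lifetime are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def ttl_epoch_seconds(created_at: datetime, lifetime: timedelta) -> int:
    """Expiry time as whole Unix epoch seconds, the format TTL expects."""
    return int((created_at + lifetime).timestamp())


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
item = {
    "user_id": {"S": "userID123"},
    # Hypothetical TTL attribute; numbers are sent as strings in the typed format.
    "expires_at": {"N": str(ttl_epoch_seconds(created, timedelta(days=7)))},
}
```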
When does DynamoDB delete an expired record due to the TTL?
DynamoDB typically deletes expired items within 48 hours of expiration
What can help recover expired items due to DynamoDB TTL?
DynamoDB streams
What is --projection-expression CLI option used for in DynamoDB?
you specify attributes to get from the table:
--projection-expression "user_id, user_name"
What is --filter-expression CLI option used for in DynamoDB?
you can filter the rows to get from the table:
--filter-expression "user_id = :u" --expression-attribute-values '{ ":u": {"S":"userID123"}}'
What is --page-size CLI option used for in DynamoDB?
you get all the data, but behind the scenes TOTAL_ROWS / PAGE_SIZE API calls are executed, each retrieving at most PAGE_SIZE items. This is helpful to avoid timeouts:
--page-size 1
What is --max-items CLI option used for in DynamoDB?
you get just the number of items specified in MAX_ITEMS, and behind the scenes just 1 API call is executed; you can use the returned NextToken to continue retrieving data. Useful for pagination:
--max-items 5
What is --starting-token CLI option used for in DynamoDB?
used in conjunction with --max-items: pass the NextToken from the previous call to keep reading the next MAX_ITEMS rows. Useful for pagination:
--starting-token wNDWJ4rem3FUNWwefewmf2
What new Write and Read mode was added to DynamoDB in 2018?
Transactional mode
How much WCU and RCU does DynamoDB Transactional mode consume?
2x
What is the DynamoDB Write Sharding concept?
• Imagine we have a voting application with two candidates, candidate A and candidate B.
• If we use a partition key of candidate_id, we will run into hot partition issues, as we only have two partition key values
• Solution: add a suffix (usually random suffix, sometimes calculated suffix)
Candidate_A-1
Candidate_A-1
Candidate_A-2
Candidate_B-1
…
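A sketch of the suffix idea, assuming a random suffix between 1 and a chosen shard count. The helper names and the shard count are made up for illustration. Writes pick one suffixed key at random; reading a candidate's total means querying every suffixed key and adding the results.

```python
import random


def sharded_key(base_key: str, shard_count: int, rng=random) -> str:
    """Partition key for a write: base key plus a random suffix in 1..shard_count."""
    return f"{base_key}-{rng.randint(1, shard_count)}"


def all_shard_keys(base_key: str, shard_count: int) -> list:
    """Every suffixed key that must be read to see all writes for base_key."""
    return [f"{base_key}-{n}" for n in range(1, shard_count + 1)]


print(sharded_key("Candidate_A", 10))   # one of Candidate_A-1 .. Candidate_A-10
print(all_shard_keys("Candidate_B", 3))
```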
What is DynamoDB Conditional Write?
If you have concurrent operations, each should include a condition on the value it expects in its write; that guarantees the first operation is accepted and the second fails
What is DynamoDB Atomic Write?
If you have concurrent operations increasing the same item value, both writes will succeed and the item value will be increased twice
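The difference between the two behaviours can be illustrated with a plain dictionary standing in for an item. This is an in-memory model only, not DynamoDB code; `ConditionFailed` and both helper names are hypothetical.

```python
class ConditionFailed(Exception):
    """Hypothetical stand-in for a rejected conditional write."""


def conditional_set(item: dict, attribute: str, expected, new_value) -> None:
    """Write only if the attribute still holds the value the caller last read."""
    if item.get(attribute) != expected:
        raise ConditionFailed(f"{attribute} changed since it was read")
    item[attribute] = new_value


def atomic_add(item: dict, attribute: str, amount: int) -> None:
    """Unconditional increment: every caller's change is applied."""
    item[attribute] = item.get(attribute, 0) + amount


# Conditional write: both writers read price = 10, only the first one wins.
product = {"price": 10}
conditional_set(product, "price", expected=10, new_value=20)
try:
    conditional_set(product, "price", expected=10, new_value=30)
except ConditionFailed:
    pass  # the second writer must re-read and retry

# Atomic write: both increments succeed and the counter goes up twice.
votes = {"count": 0}
atomic_add(votes, "count", 1)
atomic_add(votes, "count", 1)
```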
What pattern can you use in DynamoDB to store large objects?
Send object to S3 and store small metadata in DynamoDB