DynamoDB Flashcards
DynamoDB
- NoSQL serverless - fully managed
- non-relational and distributed - gives you more horizontal scaling than RDS
- all data needed for a query is present in one item (row)
Dynamo primary keys
- Partition Key (HASH)
- unique for each item - e.g. userID
- partition key + sort key (HASH + Range)
- data grouped by partition key
- combination is unique for each item
- e.g. user-games table = userID + gameID
- a user can attend multiple games, so userID alone is not unique
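The composite-key example above can be sketched as a table definition. This is a minimal sketch, not a definitive schema: the table name, attribute names, string key types, and on-demand billing mode are assumptions. The dict mirrors the keyword arguments of boto3's DynamoDB client create_table(), and primary_key is a hypothetical helper that shows which values must be unique together.

```python
# Hypothetical table and attribute names, chosen to match the flashcard example.
# The dict has the shape of the keyword arguments for boto3's
# dynamodb_client.create_table(**user_games_table); nothing is sent to AWS here.
user_games_table = {
    "TableName": "user-games",
    "AttributeDefinitions": [
        {"AttributeName": "userID", "AttributeType": "S"},  # assumed string key
        {"AttributeName": "gameID", "AttributeType": "S"},  # assumed string key
    ],
    "KeySchema": [
        {"AttributeName": "userID", "KeyType": "HASH"},   # partition key
        {"AttributeName": "gameID", "KeyType": "RANGE"},  # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand, so no RCU/WCU here
}


def primary_key(item, key_schema):
    """Hypothetical helper: the tuple of key values that must be unique per item."""
    return tuple(item[key["AttributeName"]] for key in key_schema)
```

Two items with the same userID but different gameID values have different primary keys, so both can live in the same partition.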
Dynamo rows / items
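- each item has attributes; an attribute can be null
- max size of 400 KB per item
- scalar types - string, number, binary, boolean, null
- document types (list, map) and set types (string set, number set, binary set)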
DynamoDB RCU (Read Capacity Units)
- strongly consistent read - set the ‘ConsistentRead’ parameter to true in API calls
- higher latency, twice the RCU cost of an eventually consistent read
- eventually consistent read (default) - can get stale data
- one RCU = one strongly consistent read per second or two eventually consistent reads per second for an item up to 4 KB in size
- e.g. 10 strongly consistent reads/s with item size 4 KB = 10 * 4 / 4 = 10 RCUs
- 16 eventually consistent reads/s with item size 12 KB = 16 / 2 * 12 / 4 = 8 * 3 = 24 RCUs
- 10 strongly consistent reads/s with item size 6 KB = 10 * 8 / 4 = 20 RCUs (round the item size up to the nearest 4 KB!)
- 3 strongly consistent reads/s with item size 6 KB = 3 * 8 / 4 = 6 RCUs
- 5 transactional reads/s with item size 5 KB = 5 * 8 / 4 * 2 (transactional cost) = 20 RCUs
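The RCU arithmetic above can be checked with a small calculator. This is a minimal sketch: the function name and the mode labels are made up for illustration, and the only rules it encodes are the ones on this card (4 KB blocks, half cost for eventually consistent reads, double cost for transactional reads).

```python
import math


def read_capacity_units(reads_per_second, item_size_kb, mode="strong"):
    """Estimate provisioned RCUs; mode is 'eventual', 'strong', or 'transactional'."""
    # Item size is rounded up to the next multiple of 4 KB.
    blocks = math.ceil(item_size_kb / 4)
    # Cost of reading one 4 KB block once per second, by read type.
    cost_per_block = {"eventual": 0.5, "strong": 1, "transactional": 2}[mode]
    return math.ceil(reads_per_second * blocks * cost_per_block)


print(read_capacity_units(10, 4))                    # 10 strong reads/s, 4 KB
print(read_capacity_units(16, 12, "eventual"))       # 16 eventual reads/s, 12 KB
print(read_capacity_units(5, 5, "transactional"))    # 5 transactional reads/s, 5 KB
```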
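Solution to ProvisionedThroughputExceededException
- burst capacity - can temporarily exceed provisioned capacity
- if burst capacity is consumed, you’ll get a ProvisionedThroughputExceededException
- reasons - hot keys, hot partitions, large items
- solution - DAX for read-heavy hot keys; also exponential backoff on retries and well-distributed partition keys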
Dynamo write capacity units (WCU)
- one WCU represents one write per second for an item up to 1 KB in size (larger items consume more WCUs)
- e.g. 10 items per second w/ item size 2 KB = 10 * (2 KB / 1 KB) = 20 WCUs
- 6 items per second with item size 4.5 KB = 6 * 5 = 30 WCUs (round 4.5 KB up to the next whole KB!)
- 120 items per minute w/ item size 2 KB = 120 / 60 * 2 = 4 WCUs
- 2 writes/s with item size 6 KB = 2 * 6 = 12 WCUs
- 3 transactional writes/s with item size 5 KB = 3 * 5 / 1 * 2 (transactional cost) = 30 WCUs
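The WCU examples above follow the same pattern and can be checked the same way. This is a minimal sketch with a hypothetical function name. It encodes only the rules on this card: 1 KB blocks, rounded up, and double cost for transactional writes.

```python
import math


def write_capacity_units(writes_per_second, item_size_kb, transactional=False):
    """Estimate provisioned WCUs for a steady write rate."""
    # Item size is rounded up to the next whole KB.
    blocks = math.ceil(item_size_kb)
    # Transactional writes cost twice as much as standard writes.
    multiplier = 2 if transactional else 1
    return math.ceil(writes_per_second * blocks * multiplier)


print(write_capacity_units(6, 4.5))        # 6 writes/s, 4.5 KB items
print(write_capacity_units(120 / 60, 2))   # 120 writes per minute, 2 KB items
```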
DynamoDB partitions
- Partitions - slices of a table's data stored on separate servers; an item's partition is chosen by hashing its partition key
- WCUs and RCUs are spread evenly across partitions
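Why hot keys cause throttling can be sketched as follows. This is an illustration under stated assumptions: DynamoDB's internal hash function is not public, so MD5 is only a stand-in, and both function names are hypothetical.

```python
import hashlib


def partition_for(partition_key, num_partitions):
    """Map a partition key to a partition index (MD5 is an assumed stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def per_partition_capacity(provisioned_units, num_partitions):
    """Capacity is spread evenly, so each partition gets an equal share."""
    return provisioned_units / num_partitions
```

Because every request for the same key hashes to the same partition, a single hot key can use at most that one partition's share of the table's capacity, however much is provisioned overall.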
Local secondary index (LSI)
- same partition key as base table
- alternative sort key - one scalar attribute (string, number, or binary)
- up to 5 LSIs per table, DEFINED AT TABLE CREATION
- uses WCU and RCU of main table - no special throttling considerations
Global secondary index (GSI)
- alternative primary key (hash or hash+range) built from attributes of the base table
- can be created AFTER table creation
- speeds up queries on non-key attributes
- must provision RCUs and WCUs for the new index (like a new table!)
- if writes are throttled on the GSI, then the main table will be throttled
DynamoDB optimistic locking
- implemented with ‘conditional writes’
- ensures an item hasn’t changed before you update/delete it (e.g. each item carries a version attribute that the write checks)
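The version check can be illustrated with an in-memory stand-in for a table. This is a minimal sketch rather than DynamoDB client code: the dict, the exception class, and the function name are all hypothetical. Against a real table, the same check would be written as a ConditionExpression on the write request, and a failed check surfaces as a ConditionalCheckFailedException.

```python
class ConditionalCheckFailed(Exception):
    """Hypothetical stand-in for DynamoDB's ConditionalCheckFailedException."""


table = {}  # in-memory stand-in for a table: key -> item dict


def put_if_version_matches(key, new_attrs, expected_version):
    """Write only if the stored version equals the version the caller last read."""
    current = table.get(key)
    current_version = current["version"] if current else 0
    if current_version != expected_version:
        # Someone else wrote in between; the caller must re-read and retry.
        raise ConditionalCheckFailed(
            f"expected version {expected_version}, found {current_version}"
        )
    table[key] = {**new_attrs, "version": expected_version + 1}
    return table[key]
```

Two writers that both read version 1 cannot both succeed: the first write bumps the version to 2, so the second fails its check instead of silently overwriting.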
DynamoDB Accelerator (DAX)
- in-memory cache for DynamoDB
- solves hot key problem
- cluster of nodes w/ 5 minute default TTL for cached items
- DAX vs ElastiCache - can be used in tandem
- DAX - cache for individual objects, queries, scans
- ElastiCache - store aggregation result of what your app queried from DynamoDB
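The read-through-with-TTL idea behind the cache can be sketched in a few lines. This is not the DAX client or its API. ReadThroughCache and load_item are hypothetical names, and the 300-second default only mirrors the 5 minute TTL mentioned above.

```python
import time


class ReadThroughCache:
    """Toy read-through cache: serve from memory until an entry's TTL expires."""

    def __init__(self, load_item, ttl_seconds=300, clock=time.monotonic):
        self._load_item = load_item  # hypothetical function that reads the table
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: no read against the table
        value = self._load_item(key)  # cache miss or expired: read through
        self._entries[key] = (value, now + self._ttl)
        return value
```

Repeated reads of a hot key inside the TTL window hit the cache rather than the table, which is how a cache relieves a hot partition.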
DynamoDB Streams
- ordered stream of item-level modifications in a table
- use case - analytics, welcome emails, derivative tables
- can choose what info is written to the stream (keys only, new image, old image, or both images)
- made of shards just like Kinesis Data Streams - automatically provisioned by AWS
- records are not retroactively populated in a stream after enabling it
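The welcome-email use case can be sketched as a function that scans a batch of stream records for newly inserted users. This is a minimal sketch: the function name and the userID attribute are assumptions, and it assumes the stream is configured to include the new image. The record layout (Records, eventName, dynamodb.NewImage with typed values such as {"S": ...}) follows the stream event format.

```python
def new_user_ids(event):
    """Collect userID values from INSERT records in a stream event batch."""
    ids = []
    for record in event.get("Records", []):
        # eventName is INSERT, MODIFY, or REMOVE; only new items matter here.
        if record.get("eventName") != "INSERT":
            continue
        # NewImage is present only if the stream view type includes new images.
        new_image = record.get("dynamodb", {}).get("NewImage", {})
        if "userID" in new_image:  # hypothetical attribute name
            ids.append(new_image["userID"]["S"])
    return ids
```

Each returned ID could then be handed to whatever sends the welcome email. Records arrive in order per item, so an INSERT is seen before later MODIFY events for the same key.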
DynamoDB global tables
- reduce latency for globally distributed users
- you specify the AWS Regions where you want the table to be available
- reducing the distance between the client and the DynamoDB endpoint is an important performance fix