DynamoDB Flashcards
DynamoDB Structure
Consists of Tables
Table: Primary Key (Partition Key, plus an optional Sort Key) with Values
A table can hold an unlimited number of items (rows)
Each value added to an item is called an Attribute
Max size of an item (all of its attributes combined) is 400 KB
Data Types
Scalar Types: String, Number, Binary, Boolean, Null
Document Types: List, Map
Set Types: String Set, Number Set, Binary Set
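To make the type names concrete, here is a minimal sketch that maps plain Python values to the type-tagged attribute-value format of the low-level DynamoDB API (`S`, `N`, `B`, `BOOL`, `NULL`, `L`, `M`, `SS`, `NS`, `BS`). The helper name `to_attribute_value` is hypothetical; real SDKs ship their own serializers, so treat this only as an illustration.

```python
from decimal import Decimal


def to_attribute_value(value):
    """Map a Python value to a DynamoDB-style type-tagged attribute value."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float, Decimal)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        # Sets must hold a single scalar type; sorted only for stable output.
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
        if all(isinstance(v, (int, float, Decimal)) and not isinstance(v, bool)
               for v in value):
            return {"NS": sorted(str(v) for v in value)}
    # Empty sets and any other type are rejected in this sketch.
    raise TypeError(f"unsupported value: {value!r}")
```

Example: `to_attribute_value({"tags": {"x", "y"}})` gives a Map holding a String Set.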
DynamoDB: Strongly Consistent / Eventually Consistent
Eventually Consistent:
A read right after a write may not return the newly written data yet. Replication takes time
Strongly Consistent: A read right after a write returns the latest data
DEFAULT:
Reads are always eventually consistent by default
GetItem, Query, and Scan provide:
ConsistentRead: parameter that can be set to true to request a strongly consistent read. Uses twice the RCU.
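DynamoDB: RCU
One Read Capacity Unit (RCU):
One strongly consistent read per second
Two eventually consistent reads per second
Up to 4 KB per read; larger items use more RCU
RCU needed: Ceiling(item size in KB / 4) * (reads per second), halved for eventually consistent reads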
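The RCU arithmetic can be sketched as a small function. This is a study aid built from the rules above (4 KB per unit, eventually consistent reads cost half), not an official calculator; the function and parameter names are made up for illustration.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=False):
    """Estimate the RCU needed for a steady read workload."""
    units_per_read = math.ceil(item_size_kb / 4)  # each unit covers up to 4 KB
    rcu = units_per_read * reads_per_second
    if not strongly_consistent:
        rcu = rcu / 2  # eventually consistent reads cost half
    return math.ceil(rcu)
```

Example: 10 strongly consistent reads per second of 6 KB items round up to 2 units each, so `read_capacity_units(6, 10, True)` is 20; the same workload with eventually consistent reads needs 10.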
DynamoDB Partitions
Amazon DynamoDB stores data in partitions. A partition is an allocation of storage for a table, backed by solid state drives (SSDs) and automatically replicated across multiple Availability Zones within an AWS Region. Partition management is handled entirely by DynamoDB—you never have to manage partitions yourself.
The Partition Key goes through a hashing algorithm that determines which partition each item goes to
CALCULATE TOTAL:
Capacity: (Total RCU / 3000) + (Total WCU / 1000)
Size: Total Size / 10GB
Total Partitions: Ceiling(Max(Capacity, Size))
EXAM: WCU AND RCU ARE SPREAD EVENLY BETWEEN PARTITIONS
Example: 100 WCU and 100 RCU across 10 partitions, each partition gets 10 WCU and 10 RCU.
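The partition estimate above can be worked through in code. This sketch only reproduces the flashcard formula; actual partitioning is internal to DynamoDB and may differ. The function names and the floor of one partition are assumptions added for illustration.

```python
import math


def estimate_partitions(total_rcu, total_wcu, total_size_gb):
    """Apply the flashcard formula: Ceiling(Max(Capacity, Size))."""
    by_capacity = total_rcu / 3000 + total_wcu / 1000
    by_size = total_size_gb / 10
    # Assumption: a table always has at least one partition.
    return max(1, math.ceil(max(by_capacity, by_size)))


def per_partition_throughput(total_rcu, total_wcu, partitions):
    """Provisioned throughput is split evenly across partitions."""
    return total_rcu / partitions, total_wcu / partitions
```

Example: 6000 RCU, 2000 WCU and 25 GB give Capacity = 4 and Size = 2.5, so the estimate is 4 partitions. With 100 RCU and 100 WCU over 10 partitions, each partition gets 10 of each.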
DynamoDB: Throttling
ProvisionedThroughputExceededException
Reasons:
Hot Keys: one partition key is being read too many times (popular items)
Hot Partitions: popular items concentrated in one partition
Large Items: RCU and WCU consumed depend on item size too
Solutions:
Exponential backoff (built into the SDK)
Distribute partition keys as much as possible to avoid hot partitions
RCU issues can be resolved with DAX
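One common way to distribute a hot partition key is write sharding: append a calculated suffix so writes for the same logical key land on several partition key values. The sketch below assumes a string key and a `#` separator; the function names, the separator and the shard count are all hypothetical choices, not part of any DynamoDB API.

```python
import zlib


def sharded_partition_key(base_key, discriminator, shard_count=10):
    """Derive a stable shard suffix from another attribute of the item."""
    shard = zlib.crc32(discriminator.encode("utf-8")) % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key, shard_count=10):
    """Every key value a reader must query to see the whole logical key."""
    return [f"{base_key}#{n}" for n in range(shard_count)]
```

The trade-off: writes spread out, but reading everything for one logical key means querying every shard and merging the results.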
DynamoDB Writing Data:
PutItem
UpdateItem
Conditional Writes
PutItem: Write to DB, create or full replace, WCU consumed
UpdateItem: Partial update of attributes. Possible to use Atomic Counters and increase them
Conditional Writes:
Accept a write / update only if conditions are respected
Helps with concurrent access to items
No performance impact
Use them when there are concurrency issues: multiple writes happening to the same item at the same time
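A conditional write can be sketched as an optimistic-locking update: the write only succeeds if the item still has the version the writer last read. The code below just builds the request parameters in the shape the low-level `UpdateItem` API expects and does not call AWS. The attribute names `price` and `version`, the table name and the helper name are assumptions for illustration.

```python
def build_versioned_update(table_name, key, new_price, expected_version):
    """Build UpdateItem parameters guarded by a version check."""
    return {
        "TableName": table_name,
        "Key": key,
        # SET changes one attribute; ADD bumps the counter atomically.
        "UpdateExpression": "SET #price = :price ADD #version :one",
        # Rejected if another writer already changed the version.
        "ConditionExpression": "#version = :expected",
        "ExpressionAttributeNames": {"#price": "price", "#version": "version"},
        "ExpressionAttributeValues": {
            ":price": {"N": str(new_price)},
            ":one": {"N": "1"},
            ":expected": {"N": str(expected_version)},
        },
    }
```

With an SDK client these parameters would typically be passed to its `update_item` call. When the condition is not met, the service rejects the write with `ConditionalCheckFailedException`, and the writer can re-read the item and try again.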
DynamoDB Delete Data:
DeleteItem
DeleteTable
DeleteItem:
Deletes an individual row
Conditional delete is also possible
DeleteTable:
Deletes the table and all its items
Made for speed: don't use DeleteItem on every row to empty a whole table
DynamoDB Batch Writes
BatchWriteItem
BatchWriteItem
Up to 25 PutItem and/or DeleteItem requests in one call
Up to 16 MB of data written
Up to 400 KB of data per item
Batch allows:
Lower latency by reducing API calls
Operations are done in parallel for efficiency
Part of a batch can fail and be retried (exponential backoff)
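Because one call takes at most 25 requests, a larger list of items has to be split into batches first. A minimal sketch that builds `BatchWriteItem`-shaped request bodies without calling AWS; the helper name is hypothetical and the items are assumed to be in attribute-value format already.

```python
def chunk_write_requests(items, table_name, batch_size=25):
    """Split items into BatchWriteItem request bodies of at most 25 puts."""
    batches = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        batches.append({
            "RequestItems": {
                table_name: [{"PutRequest": {"Item": item}} for item in chunk]
            }
        })
    return batches
```

Example: 60 items become three batches of 25, 25 and 10. Any entries the service returns under `UnprocessedItems` are the part of the batch that failed and should be resent with backoff.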
DynamoDB READ Data
GetItem
BatchGetItem
GetItem
Read based on Primary Key
Primary Key: Hash or Hash + Range
Eventually consistent by default
Option for strongly consistent reads, which use more RCU
ProjectionExpression: can be specified to include only specific attributes
BatchGetItem
Up to 100 items
Up to 16 MB of data
Items are retrieved in parallel to minimize latency, with fewer API calls.
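The GetItem options can be put together as request parameters. This sketch only builds the dictionary in the shape the low-level `GetItem` API expects; nothing is sent to AWS. The helper name and the `#a0`, `#a1` placeholder scheme are assumptions for illustration.

```python
def build_get_item(table_name, key, attributes=None, strongly_consistent=False):
    """Build GetItem parameters with optional projection and consistency."""
    params = {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,  # False = eventually consistent
    }
    if attributes:
        # Placeholders avoid clashes with DynamoDB reserved words.
        names = {f"#a{i}": name for i, name in enumerate(attributes)}
        params["ProjectionExpression"] = ", ".join(names)
        params["ExpressionAttributeNames"] = names
    return params
```

Example: asking for `["name", "email"]` produces the ProjectionExpression `#a0, #a1`, with the real attribute names carried in ExpressionAttributeNames.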
DynamoDB QUERY
Query:
Returns items based on
Partition Key Value (must be exact, =)
Sort Key Value (optional: =, >=, <=, >, <, BETWEEN, begins_with)
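A Query's key rules can be sketched as request parameters: the partition key is always matched with `=`, and a sort key comparison is optional. The code below builds a `Query`-shaped dictionary without calling AWS. The helper name is hypothetical, string-typed keys are assumed, and only the `>=` sort key comparison is shown.

```python
def build_query(table_name, pk_name, pk_value, sk_name=None, sk_min=None):
    """Build Query parameters: exact partition key, optional sort key >= bound."""
    params = {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",  # partition key must be exact
        "ExpressionAttributeNames": {"#pk": pk_name},
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
    }
    if sk_name is not None and sk_min is not None:
        params["KeyConditionExpression"] += " AND #sk >= :sk_min"
        params["ExpressionAttributeNames"]["#sk"] = sk_name
        params["ExpressionAttributeValues"][":sk_min"] = {"S": sk_min}
    return params
```

Example: `build_query("Orders", "customer_id", "c1", "order_date", "2024-01-01")` asks for one customer's orders from that date onward.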
DynamoDB SCAN / Parallel Scan
1. Scan
Reads the entire table, then filters out data (inefficient)
Returns up to 1 MB of data per call; paginate to keep reading
Consumes A LOT OF RCU
Limit the impact:
- Use the Limit parameter to reduce the number of items returned per call
- Reduce the size of the scan
Faster method:
2. Parallel Scan
Multiple workers scan multiple segments of the table at the same time
Increases throughput and RCU consumed
Limit the impact of parallel scans with:
- The Limit parameter
- A reduced scan size
ProjectionExpression + FilterExpression can be used to return specific attributes and items. NO CHANGE IN RCU
- ProjectionExpression: specifies which attributes to include
- FilterExpression: further filtering, applied after the items are read and before results are returned (RCU is still consumed for everything read)
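A parallel scan splits the table into segments and gives each worker its own segment number. The sketch below only builds one `Scan`-shaped parameter set per worker and does not call AWS. The helper name and the optional page limit argument are assumptions for illustration.

```python
def build_parallel_scan_requests(table_name, total_segments, page_limit=None):
    """Build one Scan parameter set per segment for parallel workers."""
    requests = []
    for segment in range(total_segments):
        params = {
            "TableName": table_name,
            "Segment": segment,               # which slice this worker reads
            "TotalSegments": total_segments,  # same value for every worker
        }
        if page_limit is not None:
            params["Limit"] = page_limit      # caps items evaluated per call
        requests.append(params)
    return requests
```

Each worker would then paginate its own segment, passing the `LastEvaluatedKey` from one response as the `ExclusiveStartKey` of the next call until no key is returned.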
LSI Local Secondary Index
Must be specified at creation of the table
Alternate Range Key / Sort Key, local to the hash key
Sort Key type: String / Number / Binary
Consists of exactly one scalar attribute
GSI Global Secondary Index
Works like a new table, with the keys of the original table projected into it
Speeds up queries on non-key attributes
GSI = Partition Key + optional Sort Key
Index is a new table:
- Partition key and sort key of the original table are always projected (KEYS_ONLY)
- Specify extra attributes to project (INCLUDE)
- Use all attributes from the main table (ALL)
Must define RCU / WCU for this index table
POSSIBLE TO ADD AND MODIFY GSI, UNLIKE LSI
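The pieces of a GSI (its own key schema, a projection, and its own RCU / WCU) map onto an index definition roughly like this. The sketch builds the dictionary shape used for a global secondary index in table definitions and does not call AWS. The helper name and the default capacity of 5 are assumptions for illustration.

```python
def build_gsi_definition(index_name, partition_key, sort_key=None,
                         include_attributes=None, rcu=5, wcu=5):
    """Build a global secondary index definition with its own throughput."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    if sort_key:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
    if include_attributes:
        projection = {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(include_attributes),
        }
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}
    return {
        "IndexName": index_name,
        "KeySchema": key_schema,
        "Projection": projection,
        # A GSI has its own capacity, separate from the base table.
        "ProvisionedThroughput": {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        },
    }
```

Projecting everything would use `{"ProjectionType": "ALL"}` instead, at the cost of more storage and write capacity on the index.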
GSI LSI Throttle
GSI: WRITE ISSUES
- When a GSI has insufficient read capacity, the base table isn’t affected.
- When a GSI has insufficient write capacity, write operations won’t succeed on the base table or any of its GSIs.
Be sure that the provisioned write capacity for each GSI is equal to or greater than the provisioned write capacity of the base table. To modify the provisioned throughput of a GSI, use the UpdateTable operation. If automatic scaling is enabled on the base table, it’s a best practice to apply the same settings to the GSI. You can do this by choosing Apply same settings to global secondary indexes in the DynamoDB console. For more information, see Enabling DynamoDB Auto Scaling on Existing Tables.
Be sure that the GSI’s partition key distributes read and write operations as evenly as possible across partitions. This helps prevent hot partitions, which can lead to throttling. For more information, see Designing Partition Keys to Distribute Your Workload Evenly.
If writes on a GSI are throttled, the main table is throttled too
Even if the WCU on the main table are fine
The GSI partition key needs to be chosen well
Assign WCU capacity carefully
GSI affects main table
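The rule of thumb that each GSI should have at least as much write capacity as the base table can be checked with a few lines. This is only a sketch over plain numbers; the function name and the index names in the example are hypothetical.

```python
def under_provisioned_gsis(table_wcu, gsi_wcu_by_name):
    """Return the GSIs whose WCU is below the base table's WCU."""
    return sorted(
        name for name, wcu in gsi_wcu_by_name.items() if wcu < table_wcu
    )
```

Example: with a base table at 100 WCU and indexes at 100, 40 and 150 WCU, only the index at 40 WCU is flagged as a throttling risk for writes to the main table.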
LSI:
Uses WCU and RCU of main table
No special throttle considerations
WCU RCU LIMIT PER PARTITION
Each partition on a DynamoDB table is subject to a hard limit of 1,000 write capacity units and 3,000 read capacity units. If your workload is unevenly distributed across partitions, or if the workload relies on short periods of time with high usage (a burst of read or write activity), the table might be throttled.
DynamoDB adaptive capacity automatically boosts throughput capacity to high-traffic partitions. However, each partition is still subject to the hard limit. This means that adaptive capacity can’t solve larger issues with your table or partition design. To avoid hot partitions and throttling, optimize your table and partition structure.
Resolution
Before implementing one of the following solutions, use Amazon CloudWatch Contributor Insights to find the most accessed and throttled items in your table. Then, use the solutions that best fit your use case to resolve throttling.
Distribute read and write operations as evenly as possible across your table. A hot partition can degrade the overall performance of your table. For more information, see Designing Partition Keys to Distribute Your Workload Evenly.
Implement a caching solution. If your workload is mostly read access to static data, then query results can be delivered much faster if the data is in a well‑designed cache rather than in a database. DynamoDB Accelerator (DAX) is a caching service that offers fast in‑memory performance for your application. You can also use Amazon ElastiCache.
Implement error retries and exponential backoff. Exponential backoff can improve an application’s reliability by using progressively longer waits between retries. If you’re using an AWS SDK, this logic is built‑in. If you’re not using an AWS SDK, consider manually implementing exponential backoff. For more information, see Error Retries and Exponential Backoff in AWS.
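For code that does not go through an AWS SDK, exponential backoff can be sketched as below. `ThrottledError` is a hypothetical stand-in for a throttling error such as ProvisionedThroughputExceededException, and the delay values and the random jitter are assumptions added for illustration, not settings taken from AWS.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error from the service."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Retry a throttled operation with progressively longer waits."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait ceiling doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(random.uniform(0, delay))
```

The `sleep` argument exists so the waits can be swapped out in tests; in normal use the default `time.sleep` applies.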