Database Specialty - DynamoDB Flashcards
Tool for Backup and restore
PITR
Terminology DynamoDB
Tables, Items, Attributes, Primary Keys, Local Secondary Indexes, Global Secondary Indexes
Data Types in DynamoDB
Scalar, Set, Document
Important points Read Consistency
Strong, Eventual, and Transactional
Points Write Consistency
Standard and Transactional
Modes Pricing Model
Provisioned and On-Demand Capacity
Types of caches in DAX
Item Cache and Query Cache
Scaling Options
Automatic, Provisioned, Global Replication, Burst Capacity, On-Demand Capacity
Amazon DynamoDB – Overview Points
- Non-relational Key-Value store
- Fully Managed, Serverless, NoSQL database in the cloud
- Fast, Flexible, Cost-effective, Fault Tolerant, Secure
- Multi-region, multi-master database (Global Tables)
- Backup and restore with PITR (Point-in-time Recovery)
- Single-digit millisecond performance at any scale
- In-memory caching with DAX (DynamoDB Accelerator, microsecond latency)
- Supports CRUD (Create/Read/Update/Delete) operations through APIs
- Supports transactions across multiple tables (ACID support)
- No direct analytical queries (No joins)
- Access patterns must be known ahead of time for efficient design and performance
DynamoDB Tables
- Tables are top-level entities
- No strict inter-table relationships (Independent Entities)
- You control performance at the table level
- Table items stored as JSON (DynamoDB-specific JSON)
- Primary keys are mandatory, rest of the schema is flexible
- Primary Key can be simple or composite
- Simple Key has a single attribute (=partition key or hash key)
- Composite Key has two attributes (=partition/hash key + sort/range key)
- Non-key attributes (including secondary key attributes) are optional
Data Types in DynamoDB
- Scalar Types
- Exactly one value
- e.g. string, number, binary, boolean, and null
- Keys or index attributes only support string, number and binary scalar types
- Set Types
- Multiple scalar values
- e.g. string set, number set and binary set
- Document Types
- Complex structure with nested attributes
- e.g. list and map
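The three type families map to DynamoDB's typed JSON format. A minimal sketch of one item in that format (attribute names like `user_id` and `address` are invented for illustration):

```python
# One item in DynamoDB's typed JSON, covering all three type families.
item = {
    # Scalar types: S (string), N (number, always sent as a string),
    # BOOL, and NULL
    "user_id": {"S": "u-1001"},
    "age": {"N": "34"},
    "active": {"BOOL": True},
    "nickname": {"NULL": True},
    # Set type: NS (number set) holds multiple unique scalar values
    "scores": {"NS": ["87", "92", "78"]},
    # Document types: M (map) and L (list) allow nested attributes
    "address": {"M": {
        "city": {"S": "Seattle"},
        "zips": {"L": [{"S": "98101"}, {"S": "98109"}]},
    }},
}

# Note: key and index attributes may only use the S, N, or B scalar types,
# so "active" or "scores" above could not serve as a key attribute.
```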
AWS Global Infrastructure
- Has multiple AWS Regions across the globe
- Each region has one or more AZs (Availability Zones)
- Each AZ has one or more facilities (= Data Centers)
- DynamoDB automatically replicates data between multiple facilities within the AWS region
- Near Real-time Replication
- AZs act as independent failure domains
DynamoDB Consistency
- Read Consistency: strong consistency, eventual consistency, and transactional
- Write Consistency: standard and transactional
- Strong Consistency
- The most up-to-date data
- Must be requested explicitly
- Eventual Consistency
- May or may not reflect the latest copy of data
- Default consistency for all operations
- 50% cheaper than strong consistency
- Transactional Reads and Writes
- For ACID support across one or more tables within a single AWS account and region
- 2x the cost of strongly consistent reads
- 2x the cost of standard writes
Strongly Consistent Read vs Eventually Consistent Read
- Eventually Consistent Read: If we read just after a write, it's possible we'll get an unexpected response because of replication
- Strongly Consistent Read: If we read just after a write, we will get the correct data
- By default, DynamoDB uses Eventually Consistent Reads, but GetItem, Query & Scan provide a "ConsistentRead" parameter you can set to True
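The "ConsistentRead" flag looks like this in practice. A sketch of the request parameters for a strongly consistent GetItem (the table and key names are invented; with boto3 you would pass this dict to `client.get_item(**request)`, which requires AWS credentials):

```python
# Request parameters for a strongly consistent GetItem.
request = {
    "TableName": "GameStates",           # hypothetical table
    "Key": {
        "user_id": {"S": "u-1001"},      # partition key
        "game_id": {"S": "g-42"},        # sort key
    },
    # Defaults to False (eventually consistent); set True to force a
    # strongly consistent read. Query and Scan accept the same flag.
    "ConsistentRead": True,
}
```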
DynamoDB Pricing Model - Provisioned Capacity
- You pay for the capacity you provision (= number of reads and writes per second)
- You can use auto-scaling to adjust the provisioned capacity
- Uses Capacity Units: Read Capacity Units (RCUs) and Write Capacity Units (WCUs)
- Consumption beyond provisioned capacity may result in throttling
- Use Reserved Capacity for discounts over 1 or 3-year term contracts (you're charged a one-time fee + an hourly fee per 100 RCUs and WCUs)
DynamoDB Pricing Model - On-Demand Capacity
- You pay per request (= number of read and write requests your application makes)
- No need to provision capacity units
- DynamoDB instantly accommodates your workloads as they ramp up or down
- Uses Request Units: Read Request Units and Write Request Units
- Cannot use reserved capacity with On-Demand mode
DynamoDB Throughput - Provisioned Capacity mode
- Uses Capacity Units
- 1 capacity unit = 1 request/sec
- RCUs (Read Capacity Units)
- In blocks of 4KB, last block always rounded up
- 1 strongly consistent table read/sec = 1 RCU
- 2 eventually consistent table reads/sec = 1 RCU
- 1 transactional read/sec = 2 RCUs
- WCUs (Write Capacity Units)
- In blocks of 1KB, last block always rounded up
- 1 table write/sec = 1 WCU
- 1 transactional write/sec = 2 WCUs
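The block sizes and multipliers above can be captured in a small helper for working exam-style sizing questions (a sketch; the rounding of eventually consistent reads follows the worked examples later in these notes):

```python
import math

def read_capacity_units(item_kb: float, mode: str = "strong") -> int:
    """RCUs consumed for one read/sec of an item of the given size.
    Reads are billed in 4KB blocks, last block rounded up."""
    blocks = math.ceil(item_kb / 4)
    if mode == "strong":
        return blocks                  # 1 strongly consistent read/sec = 1 RCU per block
    if mode == "eventual":
        return math.ceil(blocks / 2)   # 2 eventually consistent reads/sec = 1 RCU
    if mode == "transactional":
        return blocks * 2              # 1 transactional read/sec = 2 RCUs per block
    raise ValueError(f"unknown mode: {mode}")

def write_capacity_units(item_kb: float, transactional: bool = False) -> int:
    """WCUs consumed for one write/sec. Writes are billed in 1KB blocks,
    last block rounded up; transactional writes cost double."""
    blocks = math.ceil(item_kb)
    return blocks * 2 if transactional else blocks

print(read_capacity_units(15))                       # 4 (15KB/4KB = 3.75, rounded up)
print(write_capacity_units(15, transactional=True))  # 30
```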
DynamoDB Throughput - On-Demand Capacity mode
- Uses Request Units
- Same as Capacity Units for calculation purposes
- Read Request Units
- In blocks of 4KB, last block always rounded up
- 1 strongly consistent table read request = 1 RRU
- 2 eventually consistent table read requests = 1 RRU
- 1 transactional read request = 2 RRUs
- Write Request Units
- In blocks of 1KB, last block always rounded up
- 1 table write request = 1 WRU
- 1 transactional write request = 2 WRUs
Provisioned Capacity - Points
- Typically used in production environment
- Use this when you have predictable traffic
- Consider using reserved capacity if you have steady and predictable traffic for cost savings
- Can result in throttling when consumption shoots up (use auto-scaling)
- Tends to be cost-effective as compared to the on-demand capacity mode
On-Demand Capacity Mode
- Typically used in dev/test environments or for small applications
- Use this when you have variable, unpredictable traffic
- Instantly accommodates up to 2x the previous peak traffic on a table
- Throttling can occur if you exceed 2x the previous peak within 30 minutes
- Recommended to space traffic growth over at least 30 mins before driving more than 2x
Example 1: Calculating Capacity Units
Calculate capacity units to read and write a 15KB item
- RCUs with strong consistency:
- 15KB/4KB = 3.75 => rounded up => 4 RCUs
- RCUs with eventual consistency:
- (1/2) x 4 RCUs = 2 RCUs
- RCUs for transactional read:
- 2 x 4 RCUs = 8 RCUs
- WCUs:
- 15KB/1KB = 15 WCUs
- WCUs for transactional write:
- 2 x 15 WCUs = 30 WCUs
Example 2: Calculating Capacity Units
Calculate capacity units to read and write a 1.5KB item
- RCUs with strong consistency:
- 1.5KB/4KB = 0.375 => rounded up => 1 RCU
- RCUs with eventual consistency:
- (1/2) x 1 RCU = 0.5 => rounded up => 1 RCU
- RCUs for transactional read:
- 2 x 1 RCU = 2 RCUs
- WCUs:
- 1.5KB/1KB = 1.5 => rounded up => 2 WCUs
- WCUs for transactional write:
- 2 x 2 WCUs = 4 WCUs
Example 3: Calculating Throughput
A DynamoDB table has provisioned capacity of 10
RCUs and 10 WCUs. Calculate the throughput that
your application can support:
- Read throughput with strong consistency = 4KB x 10 = 40KB/sec
- Read throughput (eventual) = 2 x (40KB/sec) = 80KB/sec
- Transactional read throughput = (1/2) x (40KB/sec) = 20KB/sec
- Write throughput = 1KB x 10 = 10KB/sec
- Transactional write throughput = (1/2) x (10KB/sec) = 5KB/sec
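Example 3's arithmetic generalizes to a small helper (a sketch mirroring the rules above, not an AWS API):

```python
def read_throughput_kb_per_sec(rcus: int, mode: str = "strong") -> float:
    """Maximum read throughput a table with the given RCUs supports."""
    strong = 4 * rcus          # 1 RCU = one 4KB strongly consistent read/sec
    if mode == "strong":
        return strong
    if mode == "eventual":
        return 2 * strong      # eventually consistent reads cost half an RCU
    if mode == "transactional":
        return strong / 2      # transactional reads cost 2 RCUs
    raise ValueError(f"unknown mode: {mode}")

def write_throughput_kb_per_sec(wcus: int, transactional: bool = False) -> float:
    """Maximum write throughput; 1 WCU = one 1KB write/sec."""
    return wcus / 2 if transactional else float(wcus)
```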
DynamoDB Burst Capacity
- Provides for occasional bursts or spikes
- 5 minutes (300 seconds) of unused read and write capacity
- Can get consumed quickly
- Must not be relied upon
DynamoDB Adaptive Capacity
- For non-uniform workloads: lets a hot partition consume capacity left unused by other partitions
- Example: total provisioned capacity = 600 WCUs/sec; provisioned capacity per partition = 200 WCUs/sec; unused capacity = 200 WCUs/sec
- The hot partition can consume these unused 200 WCUs/sec above its allocated capacity
- Consumption beyond this results in throttling
- Works automatically and is applied in real time
- No guarantees
DynamoDB LSI (Local Secondary Index)
- Can define up to 5 LSIs
- Has the same partition/hash key attribute as the primary index of the table
- Has a different sort/range key than the primary index of the table
- Must have a sort/range key (=composite key)
- Indexed items must be ≤ 10 GB
- Can only be created at the time of creating the table and cannot be deleted later
DynamoDB GSI (Global Secondary Index)
- Can define up to 20 GSIs (soft limit)
- Can have the same or a different partition/hash key than the table's primary index
- Can have the same or a different sort/range key than the table's primary index
- Can omit the sort/range key (=simple or composite)
- No size restrictions for indexed items
- Can be created or deleted at any time; can delete only one GSI at a time
- Can query across partitions (over the entire table)
- Supports only eventual consistency
- Has its own provisioned throughput
- Can only query projected attributes (attributes included in the index)
When to choose which index? Local Secondary Indexes
- When the application needs the same partition key as the table
- When you need to avoid additional costs
- When the application needs strongly consistent index reads
When to choose which index? Global Secondary Indexes
- When the application needs a different (or the same) partition key as the table
- When the application needs finer throughput control
- When the application only needs eventually consistent index reads
DynamoDB Indexes and Throttling - Local Secondary Indexes
- Uses the WCUs and RCUs of the main table
- No special throttling considerations
DynamoDB Indexes and Throttling - Global Secondary Indexes
- If writes are throttled on the GSI, then the main table will be throttled! (even if the WCUs on the main table are fine)
- Choose your GSI partition key carefully!
- Assign your WCU capacity carefully!
Simple design patterns with DynamoDB
- You can model different entity relationships like 1:1, 1:N, N:M
- Store players’ game states – 1:1 modeling, 1:N modeling
- user_id as PK, game_id as SK (1:N modeling)
- Players’ gaming history – 1:N modeling
- user_id as PK, game_ts as SK (1:N modeling)
- Gaming leaderboard – N:M modeling
- GSI with game_id as PK and score as SK
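The leaderboard pattern above can be sketched as a Query against that GSI. The table name, index name, and game id below are invented for illustration; with boto3 this dict would be passed to `client.query(**request)` (which requires AWS credentials), so only the request shape is shown:

```python
# Top-10 leaderboard query against a hypothetical GSI with game_id as
# the partition key and score as the sort key.
request = {
    "TableName": "GameStates",              # hypothetical table name
    "IndexName": "game_id-score-index",     # hypothetical GSI name
    "KeyConditionExpression": "game_id = :g",
    "ExpressionAttributeValues": {":g": {"S": "g-42"}},
    "ScanIndexForward": False,  # descending by sort key => highest scores first
    "Limit": 10,                # top-10 leaderboard
}
```

Because items in a GSI are sorted by the index's sort key, `ScanIndexForward: False` returns the highest scores first without any client-side sorting.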
DynamoDB Write Sharding
- Imagine we have a voting application with two candidates, candidate A and candidate B
- If we use a partition key of candidate_id, we will run into partition issues, as we only have two partition key values
- Solution: add a suffix to the partition key (usually a random suffix, sometimes a calculated suffix)
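A minimal sketch of both suffix strategies (the shard count, key names, and `candidate_a` value are assumptions for illustration):

```python
import random
import zlib

NUM_SHARDS = 10  # shard count chosen for illustration

def random_sharded_key(candidate_id: str) -> str:
    """Random suffix: spreads writes for one hot key across NUM_SHARDS
    partition-key values, e.g. 'candidate_a#7'."""
    return f"{candidate_id}#{random.randrange(NUM_SHARDS)}"

def calculated_sharded_key(candidate_id: str, voter_id: str) -> str:
    """Calculated suffix: deterministic per voter, so the item can be
    read back later without querying every shard."""
    suffix = zlib.crc32(voter_id.encode()) % NUM_SHARDS
    return f"{candidate_id}#{suffix}"

# Totalling a candidate's votes means querying all NUM_SHARDS key values
# (candidate_a#0 .. candidate_a#9) and summing the results.
```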
Error and Exceptions in DynamoDB
- Common Exceptions
- Access Denied Exception
- Conditional Check Failed Exception
- Item Collection Size Limit Exceeded Exception
- Limit Exceeded Exception
- Resource In Use Exception
- Validation Exception
- Provisioned Throughput Exceeded Exception
- Error Retries
- Exponential Backoff
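The retry pattern the card refers to, exponential backoff with jitter, is applied automatically by the AWS SDKs; a minimal hand-rolled sketch of the idea (the function and parameter names are invented):

```python
import random
import time

def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      retryable=(Exception,)):
    """Retry `operation` on retryable errors (e.g. a throttling error such
    as ProvisionedThroughputExceededException), doubling the backoff window
    each attempt and sleeping a random ("full jitter") fraction of it."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts - 1:
                raise                      # out of attempts: surface the error
            delay = base_delay * (2 ** attempt)   # base, 2x, 4x, 8x, ...
            time.sleep(random.uniform(0, delay))  # full jitter
```

Jitter matters because many throttled clients retrying on the same schedule would all collide again; randomizing the sleep spreads the retries out.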
DynamoDB Partitions
- Store DynamoDB table data (physically)
- Each (physical) partition = 10GB SSD volume
- Not to be confused with the table's partition/hash key (which is a logical partition)
- One partition can store items with multiple partition keys
- A table can have multiple partitions
- The number of table partitions depends on its size and provisioned capacity
- Managed internally by DynamoDB
- Provisioned capacity is evenly distributed across table partitions
- Partitions, once allocated, cannot be deallocated (important!)
Calculating DynamoDB Partitions
- 1 partition = 1000 WCUs or 3000 RCUs (maximum supported throughput per partition)
- 1 partition = 10GB of data
- No. of partitions = the number of partitions based on throughput or the number based on size, whichever is higher
Partition Behavior Example (Scaling up Capacity)
- Provisioned Capacity: 500 RCUs and 500 WCUs
- Storage requirement < 10 GB
- Number of Partitions:
- PT = (500 RCUs/3000 + 500 WCUs/1000) = 0.67 => rounded up => 1 partition
- Say we scale up the provisioned capacity
- New Capacity: 1000 RCUs and 1000 WCUs
- PT = (1000 RCUs/3000 + 1000 WCUs/1000) = 1.33 => rounded up => 2 partitions
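The two-step rule can be written as a small helper (a sketch of the sizing arithmetic, not an AWS API):

```python
import math

def num_partitions(rcus: int, wcus: int, size_gb: float) -> int:
    """Initial partition count: the higher of the throughput-based and
    size-based counts, each rounded up."""
    by_throughput = math.ceil(rcus / 3000 + wcus / 1000)  # 3000 RCUs / 1000 WCUs per partition
    by_size = math.ceil(size_gb / 10)                     # 10GB per partition
    return max(by_throughput, by_size)

print(num_partitions(500, 500, 5))    # 1
print(num_partitions(1000, 1000, 5))  # 2
```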
DynamoDB Scaling
- You can manually scale up provisioned capacity as and when needed
- Scale-downs are limited: up to 4 per day, plus one additional scale-down whenever there has been none in the last 4 hours (effectively up to 27 scale-downs per day)
- Scaling affects partition behavior
- Any increase in partitions on scale up will not result in decrease on scale down (Important!)
- Partitions once allocated will not get deallocated later
DynamoDB Accelerator (DAX)
- In-Memory Caching, microsecond latency
- Sits between DynamoDB and the client application (acts as a proxy)
- Saves costs due to reduced read load on DynamoDB
- Helps prevent hot partitions
- Minimal code changes required to add DAX to your existing DynamoDB app
- Supports only eventual consistency (strongly consistent requests pass through to DynamoDB)
- Not for write-heavy applications
- Runs inside the VPC
- Multi AZ (3 nodes minimum recommended for production)
- Secure (Encryption at rest with KMS, VPC, IAM, CloudTrail…)
DAX architecture
- DAX has two types of caches (internally)
- Item Cache
- Query Cache
- Item cache stores results of item-level reads (=GetItem and BatchGetItem)
- Default TTL of 5 min (specified while creating DAX cluster)
- When cache becomes full, older and less popular items get removed
- Query cache stores results of Query and Scan operations
- Default TTL of 5 min
- Updates to the Item cache or to the underlying DynamoDB table do not invalidate the Query cache, so the TTL value of the Query cache should be chosen accordingly
DAX Operations
- Only for item-level operations
- Table-level operations must be sent directly to DynamoDB
- Write operations use the write-through approach
- Data is first written to DynamoDB and then to DAX; the write is considered successful only if both writes succeed
- You can use the write-around approach to bypass DAX, e.g. when writing a large amount of data you can write directly to DynamoDB (the Item cache goes out of sync)
DAX Operations 2
- For reads, if DAX has the data (=cache hit), it's simply returned without going through DynamoDB
- On a cache miss, DAX fetches the item from DynamoDB, caches it, and returns it
Implementing DAX
- To implement DAX, we create a DAX Cluster
- A DAX Cluster consists of one or more nodes (up to 10 nodes per cluster)
- Each node is an instance of DAX
- One node is the master node or primary node
- Remaining nodes act as read replicas
- DAX internally handles load balancing between these nodes
- 3 nodes minimum recommended for production
Backup and Restore in DynamoDB
- Automatically encrypted, cataloged and easily discoverable
- Highly Scalable - create or retain as many backups for tables of any size
- Backup operations complete in seconds
- Backups are consistent within seconds across thousands of partitions
- No provisioned capacity consumption
- Does not affect table performance or availability
- Backups are preserved regardless of table deletion
Backup and Restore in DynamoDB v2
- Backups are created in the same AWS region as the table
- Restores can be within same region or cross region
- Integrated with AWS Backup service (can create periodic backup plans)
- Periodic backups can be scheduled using Lambda and CloudWatch triggers
- Cannot overwrite an existing table during restore, restores can be done only to a new table (=new name)
- To retain the original table name, delete the existing table before running restore
- You can use IAM policies for access control
Backup and Restore in DynamoDB v3
- The restored table gets the same provisioned RCUs/WCUs as the source table, as recorded at the time of backup
- PITR RPO = approx. 5 minutes
- PITR RTO can be longer, as the restore operation creates a new table
Backup and Restore in DynamoDB v4
- What gets restored:
- Table data
- GSIs and LSIs (optional, you can choose)
- Encryption settings (you can change)
- Provisioned RCUs/WCUs (with values at the time the backup was created)
- Billing mode (with value at the time the backup was created)
- What you must manually set up on the restored table:
- Auto scaling policies, IAM policies
- CloudWatch metrics and alarms
- Stream and TTL settings
- Tags
Continuous Backups with PITR
- Restore table data to any second in the last 35 days!
- Priced per GB based on the table size
- If you disable PITR and re-enable it, the 35-day clock gets reset
- Works with unencrypted and encrypted tables, as well as global tables
- Can be enabled on each local replica of a global table
- If you restore a table which is part of a global table, the restored table will be an independent table (won't be a global table anymore!)
- Always restores data to a new table
- What cannot be restored
- Stream settings
- TTL options
- Autoscaling config
- PITR settings
- Alarms and tags
- All PITR API calls get logged in CloudTrail
DynamoDB Encryption
- Server-side Encryption at Rest
- Enabled by default
- Uses KMS
- 256-bit AES encryption
- Can use an AWS owned CMK, AWS managed CMK, or customer managed CMK
- Encrypts primary keys, secondary indexes, streams, global tables, backups and DAX clusters
- Encryption in transit
- Use VPC endpoints for applications running in a VPC
- Use TLS endpoints for encrypting data in transit
DynamoDB Encryption Client
- For client-side encryption
- Added protection with encryption in-transit
- Results in end-to-end encryption
- Doesn’t encrypt the entire table
- Encrypts the attribute values, but not the attribute names
- Doesn’t encrypt values of the primary key attributes
- You can selectively encrypt other attribute values
- You can encrypt selected items in a table, or selected attribute values in some or all items
DynamoDB Streams
- 24-hour time-ordered log of all table write activity
- React to changes to DynamoDB tables in real time
- Can be read by AWS Lambda, EC2, ES, Kinesis…
- DynamoDB Streams are organized into shards
- Records are not retroactively populated in a stream after enabling it
- Simply enable streams from the DynamoDB console
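A stream consumer, such as a Lambda function, receives batches of records in the documented stream-record shape. A minimal sketch of a handler for a stream with the NEW_AND_OLD_IMAGES view (the attribute names and sample values are invented):

```python
def handler(event, context=None):
    """Extract the change details from each DynamoDB Streams record."""
    changes = []
    for record in event["Records"]:
        ddb = record["dynamodb"]
        changes.append({
            "event": record["eventName"],   # INSERT / MODIFY / REMOVE
            "keys": ddb["Keys"],
            "old": ddb.get("OldImage"),     # present for MODIFY / REMOVE
            "new": ddb.get("NewImage"),     # present for INSERT / MODIFY
        })
    return changes

# A hypothetical single-record event, as delivered for a MODIFY.
sample_event = {"Records": [{
    "eventName": "MODIFY",
    "dynamodb": {
        "Keys": {"user_id": {"S": "u-1001"}},
        "OldImage": {"user_id": {"S": "u-1001"}, "score": {"N": "10"}},
        "NewImage": {"user_id": {"S": "u-1001"}, "score": {"N": "25"}},
    },
}]}

print(handler(sample_event)[0]["event"])  # MODIFY
```

Which of `OldImage`/`NewImage` is present depends on the stream view type configured on the table, per the four views listed below.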
DynamoDB Streams - supported views - Keys only
Captures only the key attributes of the changed item
DynamoDB Streams - supported views - New image
Captures the entire item after changes
DynamoDB Streams - supported views - Old image
Captures the entire item before changes
DynamoDB Streams - supported views - New and old images
Captures the entire item before and after changes