Database Specialty - DynamoDB Flashcards

1
Q

Tool for Backup and restore

A

PITR

2
Q

DynamoDB Terminology

A

Tables, Items, Attributes, Primary Keys, Local Secondary Indexes, Global Secondary Indexes

3
Q

Data Types in DynamoDB

A

Scalar, Set, Document

4
Q

Read Consistency - Important Points

A

Strong, Eventual and Transactional

5
Q

Write Consistency - Points

A

Standard and Transactional

6
Q

Pricing Model - Modes

A

Provisioned and On-Demand Capacity

7
Q

Types of caches in DAX

A

Item Cache and Query Cache

8
Q

Scaling Options

A

Automatic, Provisioned, Global Replication, Burst Capacity, On-Demand Capacity

9
Q

Amazon DynamoDB – Overview Points

A
  • Non-relational Key-Value store
  • Fully Managed, Serverless, NoSQL database in the cloud
  • Fast, Flexible, Cost-effective, Fault Tolerant, Secure
  • Multi-region, multi-master database (Global Tables)
  • Backup and restore with PITR (Point-in-time Recovery)
  • Single-digit millisecond performance at any scale
  • In-memory caching with DAX (DynamoDB Accelerator, microsecond latency)
  • Supports CRUD (Create/Read/Update/Delete) operations through APIs
  • Supports transactions across multiple tables (ACID support)
  • No direct analytical queries (No joins)
  • Access patterns must be known ahead of time for efficient design and performance
10
Q

DynamoDB Tables

A
  • Tables are top-level entities
  • No strict inter-table relationships (Independent Entities)
  • You control performance at the table level
  • Table items stored as JSON (DynamoDB-specific JSON)
  • Primary keys are mandatory, rest of the schema is flexible
  • Primary Key can be simple or composite
  • Simple Key has a single attribute (=partition key or hash key)
  • Composite Key has two attributes
    (=partition/hash key + sort/range key)
  • Non-key attributes (including secondary key attributes) are
    optional
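The key rules above can be sketched as the arguments for a table definition. This is a minimal illustration only: the helper `table_definition` and the table and attribute names are hypothetical, and the dictionary follows the shape that boto3's `create_table` call accepts (the call itself is left commented out because it needs AWS credentials).

```python
def table_definition(name, partition_key, sort_key=None):
    """Build keyword arguments in the shape boto3's create_table accepts.

    partition_key and sort_key are (attribute_name, type) tuples, where the
    type is "S" (string), "N" (number) or "B" (binary).
    """
    keys = [(partition_key, "HASH")]          # simple key: partition key only
    if sort_key is not None:
        keys.append((sort_key, "RANGE"))      # composite key: add a sort key
    return {
        "TableName": name,
        # Only key attributes are declared; all other attributes are optional.
        "AttributeDefinitions": [
            {"AttributeName": attr, "AttributeType": attr_type}
            for (attr, attr_type), _ in keys
        ],
        "KeySchema": [
            {"AttributeName": attr, "KeyType": key_type}
            for (attr, _), key_type in keys
        ],
        "BillingMode": "PAY_PER_REQUEST",     # on-demand, so no RCUs/WCUs here
    }


params = table_definition("GameStates", ("user_id", "S"), ("game_id", "S"))
# boto3.client("dynamodb").create_table(**params)  # requires AWS credentials
```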
11
Q

Data Types in DynamoDB

A
  • Scalar Types
    • Exactly one value
    • e.g. string, number, binary, boolean, and null
    • Keys or index attributes only support string, number and binary scalar types
  • Set Types
    • Multiple scalar values
    • e.g. string set, number set and binary set
  • Document Types
    • Complex structure with nested attributes
    • e.g. list and map
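To make the type families concrete, here is a rough sketch that maps plain Python values to DynamoDB's typed JSON descriptors (`S`, `N`, `B`, `BOOL`, `NULL`, `SS`, `NS`, `BS`, `L`, `M`). The function name `to_dynamodb_json` is made up for this example; in practice the AWS SDKs ship their own serializers.

```python
def to_dynamodb_json(value):
    """Convert a plain Python value to DynamoDB's typed JSON representation."""
    # Scalar types
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):              # check before int: bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}             # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}                  # raw JSON would carry this base64-encoded
    # Set types: multiple scalar values of one kind
    if isinstance(value, (set, frozenset)):
        if not value:
            raise ValueError("DynamoDB sets cannot be empty")
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": [str(v) for v in sorted(value)]}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
        raise TypeError("set members must all be strings, numbers, or bytes")
    # Document types: nested structures
    if isinstance(value, list):
        return {"L": [to_dynamodb_json(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_dynamodb_json(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


item = to_dynamodb_json({"user_id": "u1", "score": 42, "tags": {"a", "b"}})
# item == {"M": {"user_id": {"S": "u1"}, "score": {"N": "42"}, "tags": {"SS": ["a", "b"]}}}
```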
12
Q

AWS Global Infrastructure

A
  • Has multiple AWS Regions across
    the globe
  • Each region has one or more AZs
    (Availability Zones)
  • Each AZ has one or more
    facilities (= Data Centers)
  • DynamoDB automatically
    replicates data between multiple
    facilities within the AWS region
  • Near Real-time Replication
  • AZs act as independent failure
    domains
13
Q

DynamoDB Consistency

A
  • Read Consistency: strong consistency, eventual consistency, and transactional
  • Write Consistency: standard and transactional
  • Strong Consistency
    • The most up-to-date data
    • Must be requested explicitly
  • Eventual Consistency
    • May or may not reflect the latest copy of
      data
    • Default consistency for read operations
    • 50% cheaper than strong consistency
  • Transactional Reads and Writes
    • For ACID support across one or more
      tables within a single AWS account and
      region
    • 2x the cost of strongly consistent reads
    • 2x the cost of standard writes
14
Q

Strongly Consistent Read vs Eventually Consistent Read

A
  • Eventually Consistent Read: If we read just
    after a write, we may get stale data because
    replication has not completed yet
  • Strongly Consistent Read: If we read just
    after a write, we will get the correct data
  • By default: DynamoDB uses Eventually
    Consistent Reads, but GetItem, Query &
    Scan provide a “ConsistentRead” parameter
    you can set to True
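As a small illustration of the `ConsistentRead` parameter, the sketch below builds the arguments for a `GetItem` request. The helper name, table name, and key are hypothetical; the dictionary matches what boto3's `get_item` accepts, and the actual call is commented out because it needs AWS credentials.

```python
def get_item_params(table_name, key, strongly_consistent=False):
    """Keyword arguments for a boto3 DynamoDB client get_item call."""
    return {
        "TableName": table_name,
        "Key": key,                              # DynamoDB-typed JSON key
        "ConsistentRead": strongly_consistent,   # False = eventually consistent (default)
    }


params = get_item_params(
    "GameStates", {"user_id": {"S": "u1"}}, strongly_consistent=True
)
# item = boto3.client("dynamodb").get_item(**params).get("Item")
```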
15
Q

DynamoDB Pricing Model - Provisioned Capacity

A
  • You pay for the capacity you provision
    (= number of reads and writes per second)
  • You can use auto-scaling to adjust the
    provisioned capacity
  • Uses Capacity Units: Read Capacity Units
    (RCUs) and Write Capacity Units (WCUs)
  • Consumption beyond provisioned capacity may
    result in throttling
  • Use Reserved Capacity for discounts over 1 or
    3-year term contracts (you’re charged a one-time fee + an hourly fee per 100 RCUs and
    WCUs)
16
Q

DynamoDB Pricing Model - On-Demand Capacity

A

On-Demand Capacity
  • You pay per request (= number of read and
    write requests your application makes)
  • No need to provision capacity units
  • DynamoDB instantly accommodates your
    workloads as they ramp up or down
  • Uses Request Units: Read Request Units and
    Write Request Units
  • Cannot use reserved capacity with On-Demand mode
17
Q

DynamoDB Throughput - Provisioned Capacity mode

A
  • Uses Capacity Units
    • 1 capacity unit = 1 request/sec
  • RCUs (Read Capacity Units)
    • In blocks of 4KB, last block always rounded up
    • 1 strongly consistent table read/sec = 1 RCU
    • 2 eventually consistent table reads/sec = 1 RCU
    • 1 transactional read/sec = 2 RCUs
  • WCUs (Write Capacity Units)
    • In blocks of 1KB, last block always rounded up
    • 1 table write/sec = 1 WCU
    • 1 transactional write/sec = 2 WCUs
18
Q

DynamoDB Throughput - On-Demand Capacity mode

A
  • Uses Request Units
    • Same as Capacity Units for calculation purposes
  • Read Request Units
    • In blocks of 4KB, last block always
      rounded up
    • 1 strongly consistent table read request = 1 RRU
    • 2 eventually consistent table read requests = 1 RRU
    • 1 transactional read request = 2 RRUs
  • Write Request Units
    • In blocks of 1KB, last block always rounded up
    • 1 table write request = 1 WRU
    • 1 transactional write request = 2 WRUs
19
Q

Provisioned Capacity - Points

A
  • Typically used in production environments
  • Use this when you have predictable traffic
  • Consider using reserved capacity if you
    have steady and predictable traffic for
    cost savings
  • Can result in throttling when
    consumption shoots up (use auto-scaling)
  • Tends to be cost-effective as compared
    to the on-demand capacity mode
20
Q

On-Demand Capacity Mode

A
  • Typically used in dev/test environments
    or for small applications
  • Use this when you have variable,
    unpredictable traffic
  • Instantly accommodates up to 2x the
    previous peak traffic on a table
  • Throttling can occur if you exceed 2x
    the previous peak within 30 minutes
  • Recommended to space traffic growth
    over at least 30 mins before driving
    more than 2x
21
Q

Example 1: Calculating Capacity Units

A

Calculate capacity units to read and write a 15KB item

  • RCUs with strong consistency:
    • 15KB/4KB = 3.75 => rounded up => 4 RCUs
  • RCUs with eventual consistency:
    • (1/2) x 4 RCUs = 2 RCUs
  • RCUs for transactional read:
    • 2 x 4 RCUs = 8 RCUs
  • WCUs:
    • 15KB/1KB = 15 WCUs
  • WCUs for transactional write:
    • 2 x 15 WCUs = 30 WCUs
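The arithmetic above can be sketched in a few lines of Python. The function names are illustrative, and the rounding follows this deck's convention of rounding every result up to a whole capacity unit.

```python
import math


def read_capacity_units(item_kb, mode="strong"):
    """RCUs needed to read one item of item_kb kilobytes per second."""
    blocks = math.ceil(item_kb / 4)       # reads are metered in 4 KB blocks
    if mode == "strong":
        return blocks
    if mode == "eventual":
        return math.ceil(blocks / 2)      # half the cost, rounded up to a whole unit
    if mode == "transactional":
        return 2 * blocks
    raise ValueError(f"unknown read mode: {mode}")


def write_capacity_units(item_kb, transactional=False):
    """WCUs needed to write one item of item_kb kilobytes per second."""
    blocks = math.ceil(item_kb / 1)       # writes are metered in 1 KB blocks
    return 2 * blocks if transactional else blocks


print(read_capacity_units(15, "strong"))              # 4
print(read_capacity_units(15, "eventual"))            # 2
print(read_capacity_units(15, "transactional"))       # 8
print(write_capacity_units(15))                       # 15
print(write_capacity_units(15, transactional=True))   # 30
```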
22
Q

Example 2: Calculating Capacity Units

A

Calculate capacity units to read and write a 1.5KB item

  • RCUs with strong consistency:
    • 1.5KB/4KB = 0.375 => rounded up => 1 RCU
  • RCUs with eventual consistency:
    • (1/2) x 1 RCU = 0.5 RCU => rounded up => 1 RCU
  • RCUs for transactional read:
    • 2 x 1 RCU = 2 RCUs
  • WCUs:
    • 1.5KB/1KB = 1.5 => rounded up => 2 WCUs
  • WCUs for transactional write:
    • 2 x 2 WCUs = 4 WCUs
23
Q

Example 3: Calculating Throughput
A DynamoDB table has provisioned capacity of 10
RCUs and 10 WCUs. Calculate the throughput that
your application can support:

A
  • Read throughput with strong consistency = 4KB x 10 = 40KB/sec
  • Read throughput (eventual) = 2 x (40KB/sec) = 80KB/sec
  • Transactional read throughput = (1/2) x (40KB/sec) = 20KB/sec
  • Write throughput = 1KB x 10 = 10KB/sec
  • Transactional write throughput = (1/2) x (10KB/sec) = 5KB/sec
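The same throughput figures can be reproduced with a short sketch. The function names are made up for illustration; the multipliers are the ones used in the answer above.

```python
def read_throughput_kb(rcus, mode="strong"):
    """Maximum read throughput in KB/sec for a given number of RCUs."""
    # Each RCU covers 4 KB/sec strongly consistent; eventual reads go twice
    # as far, transactional reads half as far.
    factor = {"strong": 1, "eventual": 2, "transactional": 0.5}[mode]
    return rcus * 4 * factor


def write_throughput_kb(wcus, transactional=False):
    """Maximum write throughput in KB/sec for a given number of WCUs."""
    # Each WCU covers 1 KB/sec; transactional writes cost double.
    return wcus * 1 * (0.5 if transactional else 1)


print(read_throughput_kb(10))                    # 40
print(read_throughput_kb(10, "eventual"))        # 80
print(read_throughput_kb(10, "transactional"))   # 20.0
print(write_throughput_kb(10))                   # 10
print(write_throughput_kb(10, True))             # 5.0
```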
24
Q

DynamoDB Burst Capacity

A
  • To provide for occasional bursts
    or spikes
  • 5 minutes or 300 seconds of
    unused read and write capacity
  • Can get consumed quickly
  • Must not be relied upon
25
Q

DynamoDB Adaptive Capacity

A
  • For non-uniform workloads
  • Works automatically and is applied in real time
  • No guarantees
  • Example (table with 3 partitions):
    • Total provisioned capacity = 600 WCUs per sec
    • Provisioned capacity per partition = 200 WCUs per sec
    • Unused capacity across the table = 200 WCUs per sec
    • So the hot partition can consume these unused
      200 WCUs per sec above its allocated capacity
    • Consumption beyond this results in throttling
26
Q

DynamoDB LSI (Local Secondary Index)

A
  • Can define up to 5 LSIs
  • Has same partition/hash key attribute as the
    primary index of the table
  • Has different sort/range key than the primary index of the table
  • Must have a sort/range key (=composite key)
  • Item collection (all items sharing one partition key value, across the table and its LSIs) must be ≤ 10 GB
  • Can only be created at the time of creating the table and cannot be deleted later
27
Q

DynamoDB GSI (Global Secondary Index)

A
  • Can define up to 20 GSIs (soft limit)
  • Can have the same or different partition/hash key than the table’s primary index
  • Can have the same or different sort/range key than the table’s primary index
  • Can omit sort/range key (= index key can be simple or
    composite)
  • No size restrictions for indexed items
  • Can be created or deleted at any time. Can
    delete only one GSI at a time
  • Can query across partitions (over the entire table)
  • Supports only eventual consistency
  • Has its own provisioned throughput
  • Can only query projected attributes (attributes included in the index)
28
Q

When to choose which index? Local Secondary Indexes

A
  • When application needs same partition key
    as the table
  • When you need to avoid additional costs
  • When application needs strongly consistent index reads
29
Q

When to choose which index? Global Secondary Indexes

A
  • When application needs different or same
    partition key as the table
  • When application needs finer throughput control
  • When application only needs eventually
    consistent index reads
30
Q

DynamoDB Indexes and Throttling - Local Secondary Indexes

A
  • Uses the WCU and RCU of the main
    table
  • No special throttling considerations
31
Q

DynamoDB Indexes and Throttling - Global Secondary Indexes

A
  • If writes are throttled on the GSI, then the main table will be throttled too! (even if the main table's WCUs are sufficient)
  • Choose your GSI partition key carefully!
  • Assign your WCU capacity carefully!
32
Q

Simple design patterns with DynamoDB

A
  • You can model different entity relationships like 1:1, 1:N, N:M
  • Store players’ game states – 1:1 modeling, 1:N modeling
    • user_id as PK, game_id as SK (1:N modeling)
  • Players’ gaming history – 1:N modeling
    • user_id as PK, game_ts as SK (1:N modeling)
  • Gaming leaderboard – N:M modeling
    • GSI with game_id as PK and score as SK
33
Q

DynamoDB Write Sharding

A
  • Imagine we have a voting application with two candidates, candidate A and candidate B.
  • If we use a partition key of candidate_id, we will run into hot partition issues, as we
    only have two partition key values
  • Solution: add a suffix (usually random suffix, sometimes calculated suffix)
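A minimal sketch of both suffix strategies follows. The function names, the `#` separator, and the shard count of 10 are assumptions chosen for illustration; the calculated variant hashes a second attribute (here a voter ID) so the same item always maps to the same shard.

```python
import hashlib
import random


def random_shard_key(base_key, shard_count=10):
    """Spread writes by appending a random suffix to the partition key."""
    return f"{base_key}#{random.randrange(shard_count)}"


def calculated_shard_key(base_key, discriminator, shard_count=10):
    """Derive the suffix from another attribute so the item can be located again."""
    digest = hashlib.sha256(discriminator.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key, shard_count=10):
    """Every partition key to read when aggregating, e.g. totalling votes."""
    return [f"{base_key}#{i}" for i in range(shard_count)]


write_key = random_shard_key("candidate_A")
lookup_key = calculated_shard_key("candidate_A", "voter-123")
```

The trade-off is on the read side: counting all votes for a candidate means querying every key returned by `all_shard_keys` and summing the results.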
34
Q

Error and Exceptions in DynamoDB

A
  • Common Exceptions
    • Access Denied Exception
    • Conditional Check Failed Exception
    • Item Collection Size Limit Exceeded Exception
    • Limit Exceeded Exception
    • Resource In Use Exception
    • Validation Exception
    • Provisioned Throughput Exceeded Exception
  • Error Retries
    • Exponential Backoff
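Exponential backoff can be sketched as below. The AWS SDKs already retry throttled requests with backoff on their own, so this is only an illustration of the idea; `ThrottledError` is a hypothetical stand-in for a retryable error, and the delay values are arbitrary. The variant shown adds random jitter so that many clients do not retry in lockstep.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a retryable error such as throttling."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying on ThrottledError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise                                    # out of attempts: give up
            cap = min(max_delay, base_delay * 2 ** attempt)   # 0.05, 0.1, 0.2, ...
            sleep(random.uniform(0, cap))                # wait a random time up to the cap
```

The `sleep` parameter exists only so the waiting can be replaced in tests; callers would normally leave it at its default.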
35
Q

DynamoDB Partitions

A
  • Store DynamoDB table data (physically)
  • Each (physical) partition = 10GB SSD volume
  • Not to be confused with table’s partition/hash key (which is a logical
    partition)
  • One partition can store items with
    multiple partition keys
  • A table can have multiple partitions
  • Number of table partitions depends on
    its size and provisioned capacity
  • Managed internally by DynamoDB
  • Provisioned capacity is evenly distributed across table partitions
  • Partitions once allocated, cannot be
    deallocated (important!)
36
Q

Calculating DynamoDB Partitions

A

  • 1 partition = 1000 WCUs or 3000 RCUs (maximum supported throughput per partition)
  • 1 partition = 10GB of data
  • No. of partitions = either the number of partitions based on throughput or the number of partitions based on size, whichever is higher

37
Q

Partition Behavior Example (Scaling up Capacity)

A
  • Provisioned Capacity: 500 RCUs and 500 WCUs
  • Storage requirement < 10 GB
  • Number of Partitions:
    PT = ( 500 RCUs/3000 + 500 WCUs/1000)
    = 0.67 => rounded up => 1 partition
  • Say, we scale up the provisioned capacity
  • New Capacity: 1000 RCUs and 1000 WCUs
    PT = ( 1000 RCUs/3000 + 1000 WCUs/1000)
    = 1.33 => rounded up => 2 partitions
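The partition estimate used in these cards can be written as a short sketch. Partitioning is managed internally by DynamoDB, so treat this as the deck's rule of thumb rather than a guaranteed figure; the function names are illustrative.

```python
import math


def partitions_for_throughput(rcus, wcus):
    """Partitions needed for throughput: 3000 RCUs or 1000 WCUs per partition."""
    return math.ceil(rcus / 3000 + wcus / 1000)


def partitions_for_size(size_gb):
    """Partitions needed for storage: 10 GB per partition."""
    return math.ceil(size_gb / 10)


def estimated_partitions(rcus, wcus, size_gb):
    """Whichever of the two estimates is higher, with a minimum of one."""
    return max(partitions_for_throughput(rcus, wcus), partitions_for_size(size_gb), 1)


print(estimated_partitions(500, 500, 8))      # 1
print(estimated_partitions(1000, 1000, 8))    # 2
print(estimated_partitions(500, 500, 25))     # 3 (size dominates)
```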
38
Q

DynamoDB Scaling

A
  • You can manually scale up provisioned capacity as and when needed
  • You can only scale down up to 4 times in a day
  • Additional one scale down if no scale downs in last 4 hours
  • Effectively 9 scale downs per day
  • Scaling affects partition behavior
  • Any increase in partitions on scale up will not result in decrease on scale down (Important!)
  • Partitions once allocated will not get deallocated later
39
Q

DynamoDB Accelerator (DAX)

A
  • In-Memory Caching, microsecond latency
  • Sits between DynamoDB and Client Application (acts as a proxy)
  • Saves costs due to reduced read load on DynamoDB
  • Helps prevent hot partitions
  • Minimal code changes required to add DAX to your existing DynamoDB app
  • Supports only eventual consistency (strong consistency requests pass through
    to DynamoDB)
  • Not for write-heavy applications
  • Runs inside the VPC
  • Multi AZ (3 nodes minimum recommended for production)
  • Secure (Encryption at rest with KMS, VPC, IAM, CloudTrail…)
40
Q

DAX architecture

A
  • DAX has two types of caches (internally)
    • Item Cache
    • Query Cache
  • Item cache stores results of index reads (=GetItem and BatchGetItem)
    • Default TTL of 5 min (specified while creating DAX cluster)
    • When cache becomes full, older and less popular items get removed
  • Query cache stores results of Query and Scan operations
    • Default TTL of 5 min
    • Updates to the Item cache or to the underlying DynamoDB
      table do not invalidate the query cache. So, TTL value of the query cache should be chosen accordingly.
41
Q

DAX Operations

A
  • Only for item level operations
  • Table level operations must be sent directly to DynamoDB
  • Write Operations use write-through approach
  • Data is first written to DynamoDB and then to DAX, and write operation is considered as successful only if both writes are successful
  • You can use write-around approach to bypass DAX, e.g. for writing large amount of data, you can write directly to DynamoDB (Item cache goes out of sync)
42
Q

DAX Operations 2

A
  • Only for item level operations
  • Table level operations must be sent directly to DynamoDB
  • Write Operations use write-through approach
  • Data is first written to DynamoDB and then to DAX, and write operation is considered as successful only if both writes are successful
  • You can use write-around approach to bypass DAX, e.g. for writing large amounts of data, you can write directly to DynamoDB (Item cache goes out of sync)
  • For reads, if DAX has the data (=Cache hit), it’s simply returned without going through DynamoDB
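The write-through, write-around, and cache-hit behavior described above can be modeled with a toy in-memory cache. This is not the DAX client: the class `WriteThroughCache` is hypothetical, and a plain dictionary stands in for the DynamoDB table. The 300-second TTL mirrors the 5-minute default mentioned for the item cache.

```python
import time


class WriteThroughCache:
    """Toy model of a read-through / write-through item cache."""

    def __init__(self, table, ttl_seconds=300, clock=time.monotonic):
        self.table = table          # dict standing in for the DynamoDB table
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}            # key -> (value, expiry time)

    def get(self, key):
        entry = self._items.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]                         # cache hit: table is not read
        value = self.table.get(key)                 # cache miss: read the table
        if value is not None:
            self._items[key] = (value, self.clock() + self.ttl)
        return value

    def put(self, key, value):
        self.table[key] = value                     # write to the table first
        self._items[key] = (value, self.clock() + self.ttl)   # then to the cache


table = {}
cache = WriteThroughCache(table)
cache.put("u1", {"score": 1})       # write-through: table and cache agree
table["u1"] = {"score": 2}          # write-around: bypasses the cache
stale = cache.get("u1")             # still {"score": 1} until the TTL expires
```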
43
Q

Implementing DAX

A
  • To implement DAX, we create a DAX Cluster
  • DAX Cluster consists of one or more nodes (up to 10 nodes per cluster)
  • Each node is an instance of DAX
  • One node is the master node or primary node
  • Remaining nodes act as read replicas
  • DAX internally handles load balancing between these nodes
  • 3 nodes minimum recommended for production
44
Q

Backup and Restore in DynamoDB

A
  • Automatically encrypted, cataloged and easily discoverable
  • Highly Scalable - create or retain as many backups for tables of any size
  • Backup operations complete in seconds
  • Backups are consistent within seconds across thousands of partitions
  • No provisioned capacity consumption
  • Does not affect table performance or availability
  • Backups are preserved regardless of table deletion
45
Q

Backup and Restore in DynamoDB v2

A
  • Can backup within the same AWS region as the table
  • Restores can be within same region or cross region
  • Integrated with AWS Backup service (can create periodic backup plans)
  • Periodic backups can be scheduled using Lambda and CloudWatch triggers
  • Cannot overwrite an existing table during restore, restores can be done only to a new table (=new name)
  • To retain the original table name, delete the existing table before running restore
  • You can use IAM policies for access control
46
Q

Backup and Restore in DynamoDB v3

A
  • Restored table gets the same
    provisioned RCUs/WCUs as the source table, as recorded at the time of backup
  • PITR RPO = 5 minutes approx.
  • PITR RTO can be longer as restore operation creates a new table
47
Q

Backup and Restore in DynamoDB v4

A
  • What gets restored:
    • Table data
    • GSIs and LSIs (optional, you can choose)
    • Encryption settings (you can change)
    • Provisioned RCUs / WCUs (with values at
      the time when backup was created)
    • Billing mode (with value at the time when backup was created)
  • What you must manually set up on the restored table:
    • Auto scaling policies, IAM policies
    • CloudWatch metrics and alarms
    • Stream and TTL settings
    • Tags
48
Q

Continuous Backups with PITR

A
  • Restore table data to any second in
    the last 35 days!
  • Priced per GB based on the table size
  • If you disable PITR and re-enable it,
    the 35 days clock gets reset
  • Works with unencrypted, encrypted
    tables as well as global tables
  • Can be enabled on each local replica
    of a global table
  • If you restore a table which is part of
    global tables, the restored table will be an
    independent table (won’t be a global table
    anymore!)
  • Always restores data to a new table
  • What cannot be restored
    • Stream settings
    • TTL options
    • Autoscaling config
    • PITR settings
    • Alarms and tags
  • All PITR API calls get logged in CloudTrail
49
Q

DynamoDB Encryption

A

  • Server-side Encryption at Rest
    • Enabled by default
    • Uses KMS
    • 256-bit AES Encryption
    • Can use AWS owned CMK, AWS managed
      CMK, or customer managed CMK
    • Encrypts primary key, secondary indexes,
      streams, global tables, backups and DAX clusters
  • Encryption in transit
    • Use VPC endpoints for applications
      running in a VPC
    • Use TLS endpoints for encrypting data in
      transit
50
Q

DynamoDB Encryption Client

A
  • For client-side encryption
  • Added protection with encryption in-transit
  • Results in end-to-end encryption
  • Doesn’t encrypt the entire table
  • Encrypts the attribute values, but not the attribute names
  • Doesn’t encrypt values of the primary key attributes
  • You can selectively encrypt other attribute values
  • You can encrypt selected items in a table, or selected attribute values in some or all items
51
Q

DynamoDB Streams

A
  • Time-ordered log of all table-write activity, retained for 24 hours
  • React to changes to DynamoDB tables in real time
  • Can be read by AWS Lambda, EC2, ES, Kinesis…
  • DynamoDB Streams are organized into
    shards
  • Records are not retroactively populated
    in a stream after enabling it
  • Simply enable streams from DynamoDB
    console
52
Q

DynamoDB Streams - supported views - Keys only

A

captures only the key attributes
of the changed item

53
Q

DynamoDB Streams - supported views - New image

A

captures the entire item
after changes

54
Q

DynamoDB Streams - supported views - Old image

A

captures the entire item
before changes

55
Q

DynamoDB Streams - supported views - New and old images

A

captures the entire item
before and after changes
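To tie the four view types together, here is a rough sketch of reading stream records as they arrive in an AWS Lambda event. The function name `summarize_stream_records` and the sample event are made up for illustration; the field names (`eventName`, `Keys`, `OldImage`, `NewImage`) follow the stream record format, and which images are present depends on the view type chosen for the stream.

```python
def summarize_stream_records(event):
    """Summarize the records in a DynamoDB Streams event delivered to Lambda."""
    summaries = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summaries.append({
            "event": record["eventName"],      # INSERT, MODIFY or REMOVE
            "keys": change["Keys"],            # present in every view type
            "old": change.get("OldImage"),     # old image / new and old images views
            "new": change.get("NewImage"),     # new image / new and old images views
        })
    return summaries


sample_event = {
    "Records": [{
        "eventName": "MODIFY",
        "dynamodb": {
            "Keys": {"user_id": {"S": "u1"}},
            "OldImage": {"user_id": {"S": "u1"}, "score": {"N": "1"}},
            "NewImage": {"user_id": {"S": "u1"}, "score": {"N": "2"}},
        },
    }]
}
result = summarize_stream_records(sample_event)
```

With a keys-only stream, the same code would return `None` for both `old` and `new`.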