NoSQL Databases & DynamoDB Flashcards

1
Q

DynamoDB

A

NoSQL public Database-as-a-Service (DBaaS): key/value and document

Manual or automatic provisioned performance (scaling in/out), or on-demand

Really fast: single-digit millisecond latency (SSD based)

2
Q

What is the resiliency of DynamoDB?

A

Highly resilient across AZs and, optionally, globally

3
Q

What is the capacity of a DynamoDB table, and what are its units?

A

Capacity is speed

(Writes) 1 WCU = one write of up to 1 KB per second

(Reads) 1 RCU = one strongly consistent read of up to 4 KB per second

4
Q

DynamoDB Backups

A

On-demand backups

Point-in-time recovery (PITR)

5
Q

DynamoDB billing

A

Billed based on RCU, WCU, storage, and features

6
Q

What is the one requirement for data entering DynamoDB?

A

Each item has to have a unique simple (partition) or composite (partition and sort) primary key.

The PK value, or the PK and SK combination, must be unique for each item. Beyond that, items can have none, all, a mixture, or different attributes (DDB has no rigid attribute schema).

7
Q

DynamoDB Query

A

Query accepts a single PK value and, optionally, an SK value or range. Capacity consumed is the size of all returned items. Further filtering discards data, but capacity is still consumed. You can only query on PK, or on PK and SK.

It is more efficient to return several items in one query than to read them one at a time, because every read operation consumes at least 1 RCU: the amount read is always rounded up.
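
As a rough illustration of that rounding, the sketch below compares one query returning several small items with the same items read one at a time. The item sizes and function names are made up for illustration, and the figures assume strongly consistent reads at 4 KB per RCU.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one RCU covers up to 4 KB (strongly consistent read)


def query_rcu(item_sizes_bytes):
    """RCU for a single Query: total size of the returned items, rounded up."""
    total = sum(item_sizes_bytes)
    return max(1, math.ceil(total / RCU_BLOCK_BYTES))


def individual_reads_rcu(item_sizes_bytes):
    """RCU when each item is read separately: each read is rounded up on its own."""
    return sum(max(1, math.ceil(size / RCU_BLOCK_BYTES)) for size in item_sizes_bytes)


sizes = [500, 800, 1200, 700]  # hypothetical item sizes in bytes (3,200 bytes total)
print(query_rcu(sizes))             # 1
print(individual_reads_rcu(sizes))  # 4
```

Four small items fit inside a single 4 KB block when returned together, but cost one RCU each when fetched individually.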

8
Q

DynamoDB capacity modes

A

On-demand: you just pay for the operations on the table. Suits unknown or unpredictable workloads with low admin overhead. Priced per million read or write units, which can be more expensive than provisioned.

Provisioned: you set the RCU and WCU capacity values on a per-table basis.

9
Q

DynamoDB Scan

A

Scan moves through a table, consuming the capacity of every item. You have complete control over what data is selected: any attribute can be used and any filter applied, but Scan consumes capacity for every item scanned, not just the items returned.

Most flexible, but most expensive when it comes to capacity.

10
Q

DynamoDB Consistency Model

A

Eventually consistent reads = may be served by any storage node, so you could be unlucky and get stale data if a node is checked before replication completes. 50% of the cost of strongly consistent reads.

Strongly consistent reads = connect to the leader node to get the most up-to-date copy of the data.

11
Q

How would you calculate the WCU on your table if you need to store 10 items per second, with an average size of 2.5 KB per item?

A

Calculate WCU per item: round up (item size / 1 KB) = 3

Multiply by the average number of writes per second (10)

= WCU required (30)
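
The same arithmetic as a minimal sketch. The function name and parameters are illustrative only, not part of any AWS SDK.

```python
import math

WCU_SIZE_KB = 1  # 1 WCU = one write of up to 1 KB per second


def required_wcu(item_size_kb, writes_per_second):
    """Round the item size up to whole WCUs, then multiply by the write rate."""
    wcu_per_item = math.ceil(item_size_kb / WCU_SIZE_KB)
    return wcu_per_item * writes_per_second


print(required_wcu(2.5, 10))  # 30
```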

12
Q

How would you calculate the RCU on your table if you need to read 10 items per second, with an average size of 2.5 KB per item? What if the reads were eventually consistent?

A

Calculate RCU per item: round up (item size / 4 KB) = 1

Multiply by the average number of read operations per second (10)

= strongly consistent RCU required (10)

= eventually consistent RCU required: 50% of strongly consistent (5)
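
A minimal sketch of the same calculation, assuming eventually consistent reads cost half as much as strongly consistent ones. The function name and parameters are illustrative only.

```python
import math

RCU_SIZE_KB = 4  # 1 RCU = one strongly consistent read of up to 4 KB per second


def required_rcu(item_size_kb, reads_per_second, strongly_consistent=True):
    """Round the item size up to whole RCUs, multiply by the read rate,
    and halve the result for eventually consistent reads."""
    rcu_per_item = math.ceil(item_size_kb / RCU_SIZE_KB)
    strong = rcu_per_item * reads_per_second
    return strong if strongly_consistent else math.ceil(strong / 2)


print(required_rcu(2.5, 10))                             # 10
print(required_rcu(2.5, 10, strongly_consistent=False))  # 5
```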

13
Q

DynamoDB Indexes

A

Indexes are alternative views on table data

Different SK (LSI), or different PK and SK (GSI)

Contain some or all of the attributes (projections)

14
Q

DynamoDB Local Secondary Indexes (LSI)

A

An LSI is an alternative view of a table (same PK, different SK)

Must be created with the table

5 LSIs per base table

Shares the RCU and WCU with the table

Attribute projections: ALL, KEYS_ONLY, and INCLUDE

15
Q

DynamoDB Global Secondary Indexes(GSI)

A

Can be created at any time

Default limit of 20 per base table

Alternative PK and SK

GSIs have their own RCU and WCU allocations

16
Q

Drawback of Global Secondary Indexes (GSI)

A

GSIs are always eventually consistent; replication between the base table and the GSI is asynchronous

17
Q

When would you use a GSI vs. an LSI on a DynamoDB table?

A

Use GSIs by default; use an LSI only when strong consistency is required

18
Q

DynamoDB Streams

A

Time-ordered list of item changes in a table

24-hour rolling window

Enabled on a per-table basis

Records inserts, updates, and deletes

Different view types influence what is in the stream

19
Q

DynamoDB Triggers

A

Item changes generate an event

That event contains the data which changed

An action is taken using that data

In AWS = Streams + Lambda

Uses: reporting and analytics

Aggregation, messaging, or notifications
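
A minimal sketch of what the Lambda side of a trigger might look like. The handler name, the summary it returns, and the sample event are hypothetical; the record fields used (`Records`, `eventName`, `dynamodb`, `Keys`, `NewImage`, `OldImage`) follow the shape of DynamoDB stream records, where the event names are `INSERT`, `MODIFY`, and `REMOVE`.

```python
def handler(event, context=None):
    """Summarise the item changes delivered in one batch of stream records."""
    summary = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summary.append({
            "event": record["eventName"],   # INSERT, MODIFY or REMOVE
            "keys": change["Keys"],
            "new": change.get("NewImage"),  # present only for some view types
            "old": change.get("OldImage"),  # present only for some view types
        })
    return summary


# Hypothetical event for a newly inserted item.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"pk": {"S": "user#1"}},
                "NewImage": {"pk": {"S": "user#1"}, "name": {"S": "Ana"}},
            },
        }
    ]
}

print(handler(sample_event))
```

In a real trigger the action (aggregation, a notification, and so on) would replace the summary list.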

20
Q

DynamoDB Stream view types

A

KEYS_ONLY → PK and SK

NEW_IMAGE → entire item after change

OLD_IMAGE → entire item before change

NEW_AND_OLD_IMAGES → old item and new item
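
To make the four view types concrete, here is a small sketch that models what a stream record would carry for each one. The function and the plain-dict items are illustrative only; real stream records use DynamoDB's typed attribute format.

```python
def stream_payload(view_type, keys, old_item, new_item):
    """Return the parts of a change that a given stream view type exposes."""
    payload = {"Keys": keys}  # the key attributes are always included
    if view_type == "KEYS_ONLY":
        return payload
    if view_type == "NEW_IMAGE":
        payload["NewImage"] = new_item
    elif view_type == "OLD_IMAGE":
        payload["OldImage"] = old_item
    elif view_type == "NEW_AND_OLD_IMAGES":
        payload["OldImage"] = old_item
        payload["NewImage"] = new_item
    else:
        raise ValueError(f"unknown view type: {view_type}")
    return payload


keys = {"pk": "user#1"}
old = {"pk": "user#1", "plan": "free"}
new = {"pk": "user#1", "plan": "pro"}

for view in ("KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"):
    print(view, stream_payload(view, keys, old, new))
```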

21
Q

DynamoDB Global Tables

A

Global tables provide multi-master cross-region replication

Tables are created in multiple regions and added to the same global table (becoming replica tables)

22
Q

DynamoDB Global Tables : Last writer Wins

A

A method of conflict resolution: if there are two competing writes to a table, the most recent write wins
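
The idea can be sketched in a few lines. This only illustrates the rule; it is not how DynamoDB implements it internally, and the `timestamp` field and region names are hypothetical.

```python
def last_writer_wins(write_a, write_b):
    """Keep whichever competing write carries the later timestamp."""
    return write_a if write_a["timestamp"] >= write_b["timestamp"] else write_b


# Two regions update the same item at almost the same time.
from_us = {"region": "us-east-1", "value": "blue", "timestamp": 1700000000.120}
from_eu = {"region": "eu-west-1", "value": "green", "timestamp": 1700000000.450}

print(last_writer_wins(from_us, from_eu)["value"])  # green
```

The earlier write is simply overwritten, so an application that cannot tolerate lost updates needs to avoid concurrent writes to the same item from different regions.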

23
Q

DynamoDB Global Tables resiliency

A

Reads and writes can occur in any region

Generally sub-second replication between regions

24
Q

DynamoDB Global Tables consistency

A

Strongly consistent reads are only possible in the same region as the write; everything else is eventually consistent

25
Q

DynamoDB Accelerator (DAX)

A

An in-memory cache designed specifically for DynamoDB

Primary node (writes) and replicas (reads)

In-memory cache: much faster reads at scale, reduced cost

Scales up and scales out (bigger or more nodes)

26
Q

Amazon Athena

A

Serverless interactive querying service

Ad-hoc queries on data; pay only for the data consumed

Schema-on-read: table-like translation

27
Q

ElastiCache

A

In-memory database with high performance

Managed Redis or Memcached as a service

Can be used to cache data for heavy workloads with low latency requirements

Reduces database workloads (which are expensive)

Can be used to store session data (stateless servers)

Requires application code changes
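
The code changes are typically of the cache-aside kind: the application checks the cache first and only falls back to the database on a miss. The sketch below uses a plain dict as a stand-in for Redis or Memcached and a hypothetical `SlowDatabase` class, so every name in it is illustrative.

```python
class SlowDatabase:
    """Stand-in for an expensive database; counts how often it is read."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)


def get_user(user_id, cache, db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    if user_id in cache:
        return cache[user_id]   # cache hit: no database work
    value = db.read(user_id)    # cache miss: go to the database
    cache[user_id] = value      # store for subsequent reads
    return value


db = SlowDatabase({"u1": {"name": "Ana"}})
cache = {}  # a real deployment would use an ElastiCache client here

get_user("u1", cache, db)
get_user("u1", cache, db)
print(db.reads)  # 1
```

The second call is served from the cache, which is where the reduction in database workload comes from.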

28
Q

ElastiCache Memcached engine

A

Simple data structures

No replication

Multiple nodes (sharding)

No backups

Multi-threaded

29
Q

ElastiCache Redis engine

A

Advanced structures

Multi-AZ

Replication (scale reads)

Backups & Restores

Transactions

30
Q

Redshift Architecture

A

Petabyte-scale data warehouse

Online Analytical Processing (OLAP, column based), not Online Transaction Processing (OLTP, row/transaction based)

Pay as you use; pricing structure similar to RDS

31
Q

Redshift Benefits

A

Directly query S3 using Redshift Spectrum

Directly query other databases using Federated Query

Integrates with AWS tooling such as QuickSight

SQL-like interface with JDBC/ODBC connections

32
Q

Redshift resiliency

A

Runs in a single AZ within a VPC

33
Q

How does Redshift work?

A

Leader node: query input, planning, and aggregation

Compute nodes: perform queries on the data

34
Q

Redshift Integrations

A

VPC security, IAM permissions, KMS at-rest encryption, CloudWatch (CW) monitoring

35
Q

Redshift Enhanced VPC Routing

A

By default, Redshift uses public routes for traffic when communicating with external services or AWS services such as S3.

If you enable enhanced VPC routing, traffic is routed based on your VPC networking configuration. This means it can be controlled by security groups and NACLs, and it can use custom DNS. It also requires the same VPC gateways that any other traffic would need.

36
Q

What is a burst pool, how large is it, and can you count on it for normal workloads?

A

Every table has an RCU and a WCU burst pool (300 seconds' worth of capacity)

If you ever deplete the pool, you will get a ProvisionedThroughputExceededException error and be throttled.
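
A simplified model of how long a burst pool lasts, assuming the pool starts full, holds 300 seconds of the table's provisioned capacity, and that demand is constant. The function names are illustrative, and real throttling behaviour is more nuanced than this.

```python
BURST_WINDOW_SECONDS = 300  # the pool holds up to 300 seconds of unused capacity


def burst_pool_size(provisioned_units):
    """Maximum number of capacity units the burst pool can hold."""
    return provisioned_units * BURST_WINDOW_SECONDS


def seconds_until_throttled(provisioned_units, demand_units_per_second):
    """How long a full pool can absorb demand above the provisioned rate.

    Returns None when demand does not exceed the provisioned capacity.
    """
    excess = demand_units_per_second - provisioned_units
    if excess <= 0:
        return None
    return burst_pool_size(provisioned_units) / excess


print(burst_pool_size(10))              # 3000
print(seconds_until_throttled(10, 40))  # 100.0
```

Under these assumptions, a table provisioned with 10 RCU that is read at 40 RCU per second would drain its pool in under two minutes, which is why the pool is a buffer for spikes rather than something to size normal workloads around.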

37
Q

What is the max item size in a DynamoDB table ?

A

400 KB

38
Q

DynamoDB On-Demand Backups

A

A full backup of the table; backups remain until you delete them

39
Q

DynamoDB Backups: Point-in-time Recovery (PITR)

A

Not enabled by default; it is set per table. When enabled, it keeps a continuous record of changes with 1-second granularity and allows replay to any point in the window.

40
Q

DynamoDB Restores

A

can be same or cross region

with or without indexes

with adjusted Encryption settings.

41
Q

Where is DAX Deployed ?

A

Deployed within a VPC

42
Q

Does DAX support write-through? If so, what does it mean for the application?

A
43
Q

What makes Athena different from Redshift or DynamoDB?

A

Original data is never changed; it remains on S3

The schema translates the data into a relational-like form when it is read

Output can be sent to other services