NoSQL Flashcards
What is DynamoDB?
o DynamoDB is a fully managed NoSQL database service. It is available globally but partitioned by Region: tables are created in a specific Region
o It is reachable through public endpoints but private by default: access is granted through IAM identity policies
o It is resilient on a regional basis: table data is replicated across multiple Availability Zones (AZs) in the Region; a status code of 200 on a write means the data has been durably stored
What are Tables, Items and Attributes in DynamoDB?
o A table is a collection of items that share the same primary key structure, either a partition (hash) key (PK) alone or a partition key plus a sort (range) key (SK), together with other configuration and performance settings
o Table names need to be unique per account and Region
o An item is a collection of attributes (up to 400 KB in size combined for the single item) inside a table that shares the same key structure as every other item in the table
o An attribute is a name and value pair, expressed in JSON format
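As a rough illustration of tables, items and attributes, the sketch below builds the parameters one might pass to boto3's `create_table` and `put_item` client calls. The table name `Orders`, the attribute names and the helper `has_required_keys` are hypothetical, and nothing is sent to AWS.

```python
# Hypothetical table: partition key "customer_id", sort key "order_date".
create_table_params = {
    "TableName": "Orders",
    # Only key attributes are declared up front; other attributes are schemaless.
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "order_date", "KeyType": "RANGE"},   # sort key
    ],
    "BillingMode": "PAY_PER_REQUEST",
}

# One item in DynamoDB's typed JSON format: each attribute is a name plus a typed value.
item = {
    "customer_id": {"S": "C-1001"},
    "order_date": {"S": "2024-01-15"},
    "total": {"N": "42.50"},  # numbers are transmitted as strings
}


def has_required_keys(candidate, key_schema):
    """Return True if the item carries every attribute named in the key schema."""
    return all(key["AttributeName"] in candidate for key in key_schema)


put_item_params = {"TableName": create_table_params["TableName"], "Item": item}
```

Every item must carry the key attributes; `total` is optional and could differ from item to item.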
What is the difference between Scan and Query in DynamoDB?
o Scan operation: if run on a table with no parameters, it retrieves every item in that table. It supports filters, but even with filters it reads the full table and only then discards non-matching items, so a filter does not reduce the capacity consumed
o Query operation: performs lookups on a table without reading the whole table (hence much more efficient and less resource-consuming). It requires an exact partition key value and can optionally add a condition on the sort key
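The difference can be sketched with a toy in-memory model. `ToyTable` is a hypothetical stand-in, not the DynamoDB API, and it counts items read rather than bytes (real DynamoDB charges by data size), but it shows why a filtered Scan still touches everything while a Query touches one partition.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        self._partitions = defaultdict(list)  # partition key -> list of items

    def put(self, pk, sk, **attrs):
        self._partitions[pk].append({"pk": pk, "sk": sk, **attrs})

    def scan(self, predicate=lambda item: True):
        """Read every item, then filter. Returns (matches, items_read)."""
        read = [item for items in self._partitions.values() for item in items]
        return [item for item in read if predicate(item)], len(read)

    def query(self, pk, sk_prefix=""):
        """Read only one partition, narrowed by a sort key prefix."""
        read = [
            item
            for item in self._partitions.get(pk, [])
            if item["sk"].startswith(sk_prefix)
        ]
        return read, len(read)


table = ToyTable()
for customer in range(100):
    for day in range(1, 11):
        table.put(f"customer-{customer}", f"2024-01-{day:02d}", total=customer)

scan_hits, scan_read = table.scan(lambda item: item["pk"] == "customer-7")
query_hits, query_read = table.query("customer-7")
```

Both calls return the same 10 items, but the scan reads all 1000 items to find them while the query reads only the 10 in that partition.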
How do Backup/Restore and Encryption work in DynamoDB?
o Backup and restore
Supports point-in-time recovery (PITR), which must be explicitly enabled on a per-table basis and can restore to any point within the last 35 days
Supports manual, on-demand table backups (a restore always goes to a new table)
o Supports encryption at rest: either Default (server-side encryption using an AWS owned key) or KMS (server-side encryption using an AWS managed key or a customer managed key)
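A minimal sketch of the point-in-time recovery rules, assuming the parameter shapes of boto3's `update_continuous_backups` and `restore_table_to_point_in_time` client calls. The helper names are hypothetical and nothing is sent to AWS; the real earliest restorable time also depends on when PITR was enabled.

```python
from datetime import datetime, timedelta, timezone

PITR_WINDOW = timedelta(days=35)  # maximum look-back stated above


def enable_pitr_params(table_name):
    """Parameters for enabling PITR on one table (it is off by default)."""
    return {
        "TableName": table_name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


def restore_params(source, target, restore_time, now=None):
    """Parameters for a restore, which always creates a new table."""
    now = now or datetime.now(timezone.utc)
    if source == target:
        raise ValueError("a restore must target a new table name")
    if not (now - PITR_WINDOW <= restore_time <= now):
        raise ValueError("restore time must fall within the last 35 days")
    return {
        "SourceTableName": source,
        "TargetTableName": target,
        "RestoreDateTime": restore_time,
    }


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
params = restore_params("Orders", "Orders-restored", now - timedelta(days=2), now=now)
```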
What are Global Tables in DynamoDB?
o Global Tables are multi-Region tables: they require streams to be enabled and one or more additional Regions to host replica tables; in the original version of Global Tables, this could only be set up on empty tables
What are the 2 read/write capacity modes in DynamoDB, and how does billing work?
o DynamoDB has two read/write capacity modes: provisioned throughput (default) and on-demand; you can switch between the two once every 24 hours
o On-demand mode: DynamoDB automatically scales to handle performance demands and bills a per-request charge
o Provisioned throughput: each table is configured with read capacity units (RCUs) and write capacity units (WCUs); item sizes are rounded up to the next size increment (4 KB for reads, 1 KB for writes), so there is no partial consumption of an increment
o Read Capacity Units: 1 RCU is one strongly consistent read per second of up to 4 KB, or two eventually consistent reads per second. Reading 2 KB consumes 1 RCU; reading 4.5 KB consumes 2 RCUs. In eventually consistent mode, 1 RCU allows 2 x 4 KB of reads per second. Transactional reads require 2x the RCUs
o Write Capacity Units: 1 WCU is one write per second of up to 1 KB. An operation that writes 200 bytes consumes 1 WCU; an operation that writes 2 KB consumes 2 WCUs. Transactional writes require 2x the WCUs
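The capacity arithmetic can be sketched as two small functions. The function names are hypothetical, and the sketch assumes 1 KB = 1024 bytes and a single-item operation.

```python
import math

KB = 1024  # assumption: 1 KB = 1024 bytes


def read_capacity_units(item_bytes, mode="strong"):
    """RCUs for one read per second: size rounds up to 4 KB blocks."""
    blocks = max(1, math.ceil(item_bytes / (4 * KB)))
    factor = {"eventual": 0.5, "strong": 1, "transactional": 2}[mode]
    return blocks * factor


def write_capacity_units(item_bytes, transactional=False):
    """WCUs for one write per second: size rounds up to 1 KB blocks."""
    blocks = max(1, math.ceil(item_bytes / KB))
    return blocks * (2 if transactional else 1)


print(read_capacity_units(2 * KB))               # 1
print(read_capacity_units(4.5 * KB))             # 2
print(read_capacity_units(4 * KB, "eventual"))   # 0.5
print(write_capacity_units(200))                 # 1
print(write_capacity_units(2 * KB))              # 2
print(write_capacity_units(2 * KB, True))        # 4
```

The 0.5 result for an eventually consistent read is the same fact as "1 RCU allows two eventually consistent 4 KB reads per second".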
What are DynamoDB Streams and Triggers?
o When enabled, streams provide an ordered list of changes that occur to items within a DynamoDB table. A stream is a rolling 24-hour window of changes. Streams are enabled per table and only contain data from the point of being enabled
o Every stream has an ARN that identifies it globally across all tables, accounts and regions
o Streams can be configured with one of four view types:
KEYS_ONLY: whenever an item is added, updated or deleted, the key(s) of that item are added to the stream
NEW_IMAGE: The entire item is added to the stream, post change
OLD_IMAGE: The entire item is added to the stream, pre change
NEW_AND_OLD_IMAGES: Both new and old versions of the item are added to the stream
o Triggers: Streams can be integrated with AWS Lambda, invoking a function whenever items are changed in a DynamoDB table (DB triggers)
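A trigger can be sketched as a Lambda handler that walks the stream records it receives. The handler below is a hypothetical example; it assumes the usual stream event shape (`Records`, `eventName`, and a `dynamodb` section holding `Keys`, `OldImage` and `NewImage`), and the sample event is hand-written for illustration.

```python
def handler(event, context=None):
    """Hypothetical Lambda handler: summarise each change in the batch."""
    summaries = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        summaries.append({
            "event": record["eventName"],    # INSERT, MODIFY or REMOVE
            "keys": change["Keys"],          # present for every view type
            "old": change.get("OldImage"),   # only with OLD_IMAGE or NEW_AND_OLD_IMAGES
            "new": change.get("NewImage"),   # only with NEW_IMAGE or NEW_AND_OLD_IMAGES
        })
    return summaries


# Hand-written sample for a stream configured as NEW_AND_OLD_IMAGES.
sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"customer_id": {"S": "C-1001"}},
                "OldImage": {"customer_id": {"S": "C-1001"}, "total": {"N": "10"}},
                "NewImage": {"customer_id": {"S": "C-1001"}, "total": {"N": "25"}},
            },
        }
    ]
}

result = handler(sample_event)
```

With a KEYS_ONLY stream the same handler would still work, but `old` and `new` would both be `None`.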
What 2 types of Indexes does DynamoDB support?
o Indexes provide an alternative representation of data in a table, which is useful for applications with varying query demands. They come in two forms: local secondary indexes (LSI) and global secondary indexes (GSI)
o Indexes are interacted with as though they were tables, but they are just an alternate representation of data in an existing table
o Local secondary indexes (max 5 per table) must be created at the same time as creating a table. They use the same partition key, but an alternative sort key. They share the RCU and WCU values for the main table
o Global secondary indexes (max 20 per table by default) can be created together with the table or at any point afterwards. They can use different partition and sort keys. They have their own RCU and WCU values.
o Note: Strongly consistent reads can’t be performed on a GSI. Only eventually consistent reads can be performed because the data that’s stored in the table is replicated asynchronously to the GSI.
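The two index types can be compared through their definitions. The sketch below uses the index parameter shapes accepted by boto3's `create_table` (`LocalSecondaryIndexes` and `GlobalSecondaryIndexes`); the index names, attribute names and helper functions are hypothetical, and nothing is sent to AWS.

```python
# Key schema of the hypothetical base table.
table_key_schema = [
    {"AttributeName": "customer_id", "KeyType": "HASH"},
    {"AttributeName": "order_date", "KeyType": "RANGE"},
]

# LSI: same partition key as the table, alternative sort key, shares table capacity.
local_index = {
    "IndexName": "by_total",
    "KeySchema": [
        {"AttributeName": "customer_id", "KeyType": "HASH"},
        {"AttributeName": "total", "KeyType": "RANGE"},
    ],
    "Projection": {"ProjectionType": "ALL"},
}

# GSI: may use different partition and sort keys, and has its own capacity.
global_index = {
    "IndexName": "by_order_status",
    "KeySchema": [
        {"AttributeName": "order_status", "KeyType": "HASH"},
        {"AttributeName": "order_date", "KeyType": "RANGE"},
    ],
    "Projection": {"ProjectionType": "KEYS_ONLY"},
    "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
}


def partition_key(key_schema):
    """Name of the HASH attribute in a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")


def is_valid_lsi(index, base_key_schema):
    """An LSI must reuse the base table's partition key."""
    return partition_key(index["KeySchema"]) == partition_key(base_key_schema)
```

Here `is_valid_lsi(local_index, table_key_schema)` holds, while `global_index` would fail the same check, which is why it has to be a GSI.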