3-Databases Flashcards
How is data on DynamoDB stored?
On SSDs, replicated across 3 geographically distinct data centres (Availability Zones) within a Region
What are the conceptual elements of DynamoDB?
Tables are broad collections of data. They consist of items, each of which has attributes
It supports both the key-value and document data models
How can access to DynamoDB be restricted?
IAM provides very fine-grained control.
The dynamodb:LeadingKeys and dynamodb:Attributes condition keys restrict access to items and attributes respectively
How are items in DynamoDB referenced?
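As a rough illustration of the two condition keys, the sketch below builds a policy document as a Python dictionary. The table ARN, account ID, attribute names and the `${www.amazon.com:user_id}` identity variable are assumptions chosen for the example, not part of these notes.

```python
import json

# Hypothetical table ARN; the account ID and table name are placeholders.
TABLE_ARN = "arn:aws:dynamodb:eu-west-1:123456789012:table/GameScores"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": TABLE_ARN,
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Item-level: only items whose partition key is the caller's ID.
                    "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"],
                    # Attribute-level: only these attributes may be requested.
                    "dynamodb:Attributes": ["UserId", "GameTitle", "TopScore"],
                },
                # Requires the caller to name attributes instead of fetching all.
                "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
            },
        }
    ],
}

# Serialised form, as it would be attached to an IAM role or user.
policy_json = json.dumps(policy, indent=2)
```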
With a primary key
How do primary keys work in DynamoDB?
There are two kinds. A simple primary key is a partition key alone, based on a unique attribute, e.g. user ID
If the partition key is not unique, a composite key is used instead. It consists of a partition key and a sort key
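The difference between the two kinds of key can be sketched with a toy in-memory model. This is not the DynamoDB API, and the table contents and attribute names are made up for illustration.

```python
# Toy in-memory model (not the DynamoDB API): items addressed by primary key.

# Simple primary key: the partition key alone identifies the item.
users = {
    "user-1": {"UserId": "user-1", "Name": "Ada"},
    "user-2": {"UserId": "user-2", "Name": "Grace"},
}

# Composite primary key: (partition key, sort key) together identify the item,
# so the same partition key can appear many times.
posts = {
    ("user-1", "2024-02-11T09:30:00Z"): {"Title": "Second post"},
    ("user-1", "2024-01-05T10:00:00Z"): {"Title": "First post"},
    ("user-2", "2024-01-20T14:15:00Z"): {"Title": "Hello"},
}


def items_for_partition(table, partition_key):
    """Return the items sharing one partition key, ordered by sort key."""
    matches = [(sk, item) for (pk, sk), item in table.items() if pk == partition_key]
    return [item for _, item in sorted(matches, key=lambda pair: pair[0])]


user1_posts = items_for_partition(posts, "user-1")
```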
What is the purpose of database indexes?
They allow fast queries to be performed on specific columns/attributes
What indexes does DynamoDB support?
Local Secondary Index - same partition key as the original table but a different sort key. Can only be created when the table is created
Global Secondary Index - can be added later. Has a different partition key and a different sort key from the original table
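To make the two index types concrete, here is a sketch of table-creation parameters in the shape that boto3's `create_table` call accepts. The table, attribute and index names are hypothetical, and on-demand billing is assumed so that no throughput settings are needed.

```python
# Parameters for a hypothetical "Posts" table with one LSI and one GSI.
create_table_params = {
    "TableName": "Posts",
    "AttributeDefinitions": [
        {"AttributeName": "UserId", "AttributeType": "S"},
        {"AttributeName": "CreatedAt", "AttributeType": "S"},
        {"AttributeName": "Category", "AttributeType": "S"},
        {"AttributeName": "Likes", "AttributeType": "N"},
    ],
    "KeySchema": [
        {"AttributeName": "UserId", "KeyType": "HASH"},      # partition key
        {"AttributeName": "CreatedAt", "KeyType": "RANGE"},  # sort key
    ],
    # LSI: same partition key as the table, different sort key.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "ByLikes",
            "KeySchema": [
                {"AttributeName": "UserId", "KeyType": "HASH"},
                {"AttributeName": "Likes", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    # GSI: its own partition key (and here a sort key as well).
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "ByCategory",
            "KeySchema": [
                {"AttributeName": "Category", "KeyType": "HASH"},
                {"AttributeName": "CreatedAt", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "KEYS_ONLY"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",
}


def partition_key(key_schema):
    """Return the name of the HASH (partition) key in a key schema."""
    return next(k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH")
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("dynamodb").create_table(**create_table_params)`.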
What API calls are used to get data out of DynamoDB?
Query, Scan, GetItem and BatchGetItem
How does Query work in DynamoDB?
It finds items by the partition key value of a table or index, optionally narrowed by a condition on the sort key
It is eventually consistent by default but can be set to strongly consistent
How can Query be refined?
By default it returns all attributes, but the ProjectionExpression parameter can limit this
By default results are in ascending order by sort key, but setting ScanIndexForward to false reverses this
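The Query options above can be sketched as a small helper that builds request parameters in the shape used by boto3's low-level `query` call. The `Posts` table, its attribute names and the `build_query` helper are assumptions made for this example.

```python
def build_query(user_id, since, newest_first=False, strongly_consistent=False):
    """Build Query parameters for a hypothetical Posts table."""
    return {
        "TableName": "Posts",
        # Partition key equality plus an optional sort key condition.
        "KeyConditionExpression": "UserId = :uid AND CreatedAt >= :since",
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":since": {"S": since},
        },
        # Return only these attributes instead of the whole item.
        "ProjectionExpression": "PostTitle, CreatedAt",
        # False means descending order by sort key.
        "ScanIndexForward": not newest_first,
        # False (the default) means an eventually consistent read.
        "ConsistentRead": strongly_consistent,
    }


params = build_query("user-1", "2024-01-01T00:00:00Z", newest_first=True)
```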
How does Scan work in DynamoDB?
It reads every item in a table, then optionally filters out those that are not of interest
The attributes it returns can be controlled with the ProjectionExpression parameter
How can the performance impact of scans be minimised?
Perform parallel scans - by default data is processed sequentially in 1 MB blocks
Set a smaller page size (the Limit parameter) so that fewer items are read per API call
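Both techniques can be expressed through Scan request parameters. The sketch below builds one request per segment, assuming a hypothetical table name and a helper, `parallel_scan_requests`, invented for this example.

```python
def parallel_scan_requests(table_name, total_segments, page_size):
    """Build one Scan request per segment for a parallel scan."""
    return [
        {
            "TableName": table_name,
            "Segment": segment,               # which slice this worker reads
            "TotalSegments": total_segments,  # how many slices in total
            "Limit": page_size,               # max items evaluated per call
        }
        for segment in range(total_segments)
    ]


requests = parallel_scan_requests("Posts", total_segments=4, page_size=100)
```

Each request would be run by its own worker, which keeps calling Scan with the returned LastEvaluatedKey as ExclusiveStartKey until its segment is exhausted.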
What are the consistency models of Scan and Query?
Scan is eventually consistent by default but can be made strongly consistent by setting ConsistentRead to true
Query is eventually consistent by default but can be made strongly consistent
How is DynamoDB capacity managed?
Manually with provisioned read and write capacity units, or automatically with the On-Demand Capacity option
How is DynamoDB throughput calculated?
Each write capacity unit allows for 1 x 1 KB write per second
Each read capacity unit allows for either 1 x 4 KB strongly consistent read per second or 2 x 4 KB eventually consistent reads per second
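The arithmetic can be worked through in a few lines. This is a minimal sketch with function names invented for the example. It rounds each item up to the next 1 KB (writes) or 4 KB (reads) before multiplying by the request rate.

```python
import math


def write_capacity_units(item_size_kb, writes_per_second):
    """WCUs needed: each write consumes 1 unit per 1 KB, rounded up."""
    return math.ceil(item_size_kb / 1) * writes_per_second


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """RCUs needed: each read consumes 1 unit per 4 KB, rounded up.

    Eventually consistent reads cost half as much.
    """
    total = math.ceil(item_size_kb / 4) * reads_per_second
    if not strongly_consistent:
        total = math.ceil(total / 2)
    return total


# 10 reads per second of 6 KB items: 6 KB rounds up to 2 x 4 KB blocks.
strong = read_capacity_units(6, 10, strongly_consistent=True)
eventual = read_capacity_units(6, 10, strongly_consistent=False)
# 5 writes per second of 2.5 KB items: 2.5 KB rounds up to 3 x 1 KB blocks.
writes = write_capacity_units(2.5, 5)
```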
How can DynamoDB be made even more performant?
With DAX, an in-memory caching service
What caching strategy does DAX use?
Write-through
What operations does DAX accelerate?
Only read operations that can be eventually-consistent
What happens if the request rate for a table is too high? How can this be managed?
The request is rejected with a ProvisionedThroughputExceededException
This is managed by retrying with exponential backoff
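Retrying with exponential backoff can be sketched as below. `ThrottledError` is a stand-in for the real exception, and the helper name, delays and attempt limit are assumptions for the example. Random jitter is added so that many clients do not all retry at the same moment.

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying on throttling with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            # The ceiling doubles each attempt; wait a random time below it.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

The AWS SDKs already retry throttled requests in a similar way, so hand-written logic like this is mainly useful for understanding the idea or for custom clients.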
What is the consistency model of DAX?
It is eventually consistent
What database model do DynamoDB Transactions enable?
ACID (Atomic, Consistent, Isolated, Durable)
What is ACID?
A requirement that database operations are:
Atomic - all or nothing
Consistent - always leave the database in a valid state
Isolated - no cross-transaction dependency
Durable - successfully committed changes remain if the system goes down
How can old items in DynamoDB be managed?
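With Time To Live (TTL): a designated attribute, e.g. ExpirationTime, holds the expiry time as a Unix epoch timestamp in seconds, and the item expires when that time is reached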
Expired items are deleted in the background, typically within 48 hours of expiring rather than immediately
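A minimal sketch of preparing an item with an expiry time follows. The attribute and item names are hypothetical. The one firm requirement is that the expiry value is stored as a Number holding Unix epoch seconds.

```python
from datetime import datetime, timedelta, timezone


def expiry_timestamp(now, days_to_live):
    """Return the expiry time as Unix epoch seconds."""
    return int((now + timedelta(days=days_to_live)).timestamp())


created = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Item in DynamoDB's low-level format; numbers are sent as strings under "N".
item = {
    "SessionId": {"S": "abc123"},
    "ExpirationTime": {"N": str(expiry_timestamp(created, 7))},
}
```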
What are DynamoDB streams?
Time-ordered logs of the item-level events (inserts, updates and deletes) on a DynamoDB table
They are encrypted and retained for 24 hours
By default, only the primary key is recorded. If more is needed, use Before and After Images
What is ElastiCache?
An in-memory cache that optimises read-heavy applications or those that have repeated compute-intensive queries
What cache engines does ElastiCache support? How do they differ?
Memcached - a simple key-value cache with no replication, so no Multi-AZ failover
Redis - supports complex data structures such as lists, sets and sorted sets. It supports master / slave replication and Multi-AZ
What cache strategies does ElastiCache support?
Both engines support lazy loading and write-through
How does ElastiCache prevent caches from becoming stale?
With TTL
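The two caching strategies and TTL expiry can be sketched with a toy in-memory cache. This is not the ElastiCache, Redis or Memcached API. The `Cache` class, the helper functions and the dictionary standing in for a database are all assumptions for illustration.

```python
import time


class Cache:
    """Toy cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def read_lazy(cache, database, key):
    """Lazy loading: fill the cache only on a cache miss."""
    value = cache.get(key)
    if value is None:
        value = database[key]
        cache.set(key, value)
    return value


def write_through(cache, database, key, value):
    """Write-through: update the cache whenever the database is written."""
    database[key] = value
    cache.set(key, value)
```

Lazy loading only caches what is actually read but can serve stale data until the TTL expires. Write-through keeps the cache current at the cost of writing every item to it.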