Amazon DynamoDB | Amazon DynamoDB Accelerator (DAX) Flashcards
If I query for an expired item, does it use up my read capacity?
Yes. This behavior is the same as when you query for an item that does not exist in the table.
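For illustration, here is a minimal boto3 sketch that hides expired-but-not-yet-deleted items from query results with a filter expression; the table, key, and TTL attribute names are assumptions. Note that the filter is applied after the read, so the read capacity is still consumed.

```python
import time

import boto3
from boto3.dynamodb.conditions import Attr, Key

# Hypothetical table "Orders" with partition key "pk" and TTL attribute
# "expires_at" (epoch seconds); all names here are illustrative assumptions.
table = boto3.resource("dynamodb").Table("Orders")

now = int(time.time())
response = table.query(
    KeyConditionExpression=Key("pk").eq("customer#123"),
    # Hide items whose TTL has passed but that have not yet been deleted.
    # The filter runs after the read, so read capacity is still consumed.
    FilterExpression=Attr("expires_at").gt(now),
)
items = response["Items"]
```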
What is DynamoDB Accelerator (DAX)?
Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache for DynamoDB that enables you to benefit from fast in-memory performance for demanding applications. DAX improves the performance of read-intensive DynamoDB workloads so repeat reads of cached data can be served immediately with extremely low latency, without needing to be re-queried from DynamoDB. DAX will automatically retrieve data from DynamoDB tables upon a cache miss. Writes are designated as write-through (data is written to DynamoDB first and then updated in the DAX cache).
Just like DynamoDB, DAX is fault-tolerant and scalable. A DAX cluster has a primary node and zero or more read-replica nodes. Upon a failure of the primary node, DAX will automatically fail over and elect a new primary. For scaling, you may add or remove read replicas.
To get started, create a DAX cluster, download the DAX SDK for Java or Node.js (compatible with the DynamoDB APIs), rebuild your application to use the DAX client instead of the DynamoDB client, and finally point the DAX client at the DAX cluster endpoint. You do not need to implement any additional caching logic in your application, because the DAX client implements the same API calls as DynamoDB.
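As an illustration of the client swap, here is a minimal sketch assuming the Python DAX client (the amazon-dax-client package); the cluster endpoint, region, table, and key names are assumptions, and the constructor arguments should be checked against the SDK you use. The same pattern applies to the Java and Node.js SDKs mentioned above.

```python
import boto3
from amazon_dax_client import AmazonDaxClient

# Standard DynamoDB client (also still needed for table-management calls,
# which DAX does not support).
ddb = boto3.client("dynamodb", region_name="us-east-1")

# DAX client exposing the same data-plane calls; the endpoint below is a
# placeholder assumption for your cluster's endpoint.
dax = AmazonDaxClient(
    endpoints=["my-dax-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111"],
    region_name="us-east-1",
)

# The request shape is identical against either client.
key = {"pk": {"S": "customer#123"}}
item_from_dynamodb = ddb.get_item(TableName="Orders", Key=key)
item_from_dax = dax.get_item(TableName="Orders", Key=key)
```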
What does “DynamoDB-compatible” mean?
It means that most of the code, applications, and tools you already use today with DynamoDB can be used with DAX with little or no change. The DAX engine is designed to support the DynamoDB APIs for reading and modifying data in DynamoDB. Operations for table management such as CreateTable/DescribeTable/UpdateTable/DeleteTable are not supported.
What is in-memory caching, and how does it help my application?
Caching improves application performance by storing critical pieces of data in memory for low-latency, high-throughput access. In the case of DAX, the results of DynamoDB operations are cached. When an application requests data that is stored in the cache, DAX can serve that data immediately without needing to run a query against the regular DynamoDB tables. Data is aged out of DAX based on a Time-to-Live (TTL) value specified for the data; once all available memory is exhausted, items are also evicted using a Least Recently Used (LRU) algorithm.
What is the consistency model of DAX?
When reading data from DAX, users can specify whether they want the read to be eventually consistent or strongly consistent:
Eventually Consistent Reads (Default) – the eventual consistency option maximizes your read throughput and minimizes latency. On a cache hit, the DAX client returns the result directly from the cache. On a cache miss, DAX queries DynamoDB, updates the cache, and returns the result set. Note that an eventually consistent read might not reflect the results of a recently completed write. If your application requires full consistency, then we suggest using strongly consistent reads.
Strongly Consistent Reads – in addition to eventual consistency, DAX also gives you the flexibility and control to request a strongly consistent read if your application, or an element of your application, requires it. A strongly consistent read is passed through by DAX, is not cached in DAX, and returns a result that reflects all writes that received a successful response in DynamoDB prior to the read.
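For illustration, a minimal boto3 sketch of both read types; the table and key names are assumptions, and the same ConsistentRead flag applies when the call is made through a DAX client.

```python
import boto3

ddb = boto3.client("dynamodb")
key = {"pk": {"S": "customer#123"}}  # table and key names are illustrative

# Eventually consistent read (the default): through DAX this can be served
# from the cache.
eventual = ddb.get_item(TableName="Orders", Key=key)

# Strongly consistent read: through DAX this is passed through to DynamoDB
# and the result is not cached.
strong = ddb.get_item(TableName="Orders", Key=key, ConsistentRead=True)
```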
What are the common use cases for DAX?
DAX has a number of use cases that are not mutually exclusive:
Applications that require the fastest possible response times for reads. Some examples include real-time bidding, social gaming, and trading applications. DAX delivers fast, in-memory read performance for these use cases.
Applications that read a small number of items more frequently than others. For example, consider an e-commerce system that has a one-day sale on a popular product. During the sale, demand for that product (and its data in DynamoDB) would sharply increase, compared to all of the other products. To mitigate the impacts of a “hot” key and a non-uniform data distribution, you could offload the read activity to a DAX cache until the one-day sale is over.
Applications that are read-intensive, but are also cost-sensitive. With DynamoDB, you provision the number of reads per second that your application requires. If read activity increases, you can increase your table’s provisioned read throughput (at an additional cost). Alternatively, you can offload the activity from your application to a DAX cluster, and reduce the amount of read capacity units you’d need to purchase otherwise.
Applications that require repeated reads against a large set of data. Such an application could potentially divert database resources from other applications. For example, a long-running analysis of regional weather data could temporarily consume all of the read capacity in a DynamoDB table, which would negatively impact other applications that need to access the same data. With DAX, the weather analysis could be performed against cached data instead.
How It Works
What does DAX manage on my behalf?
DAX is a fully-managed cache for DynamoDB. It manages the work involved in setting up dedicated caching nodes, from provisioning the server resources to installing the DAX software. Once your DAX cache cluster is set up and running, the service automates common administrative tasks such as failure detection and recovery, and software patching. DAX provides detailed CloudWatch monitoring metrics associated with your cluster, enabling you to diagnose and react to issues quickly. Using these metrics, you can set up thresholds to receive CloudWatch alarms. DAX handles all of the data caching, retrieval, and eviction so your application does not have to. You can simply use the DynamoDB API to write and retrieve data, and DAX handles all of the caching logic behind the scenes to deliver improved performance.
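As a sketch of acting on those metrics, the boto3 call below creates a CloudWatch alarm on cache misses; the cluster name is an assumption, and the metric name and dimension should be verified against the DAX metrics published for your cluster.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when query-cache misses stay high; "my-dax-cluster", the metric name,
# and the threshold are illustrative assumptions to verify for your cluster.
cloudwatch.put_metric_alarm(
    AlarmName="dax-query-cache-misses-high",
    Namespace="AWS/DAX",
    MetricName="QueryCacheMisses",
    Dimensions=[{"Name": "ClusterId", "Value": "my-dax-cluster"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=3,
    Threshold=1000,
    ComparisonOperator="GreaterThanThreshold",
)
```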
What kinds of data does DAX cache?
Read API calls go through DAX: strongly consistent requests are read directly from DynamoDB, while eventually consistent reads are served from the DAX cache if the item is available. Write API calls are write-through (a synchronous write to DynamoDB, with the cache updated upon a successful write).
The following API calls result in examining the cache. Upon a hit, the item is returned; upon a miss, the request passes through to DynamoDB, and upon successful retrieval the item is cached and returned. A short sketch of both the read and write-through paths follows the two lists below.
- GetItem
- BatchGetItem
- Query
- Scan
The following API calls are write-through operations.
- BatchWriteItem
- UpdateItem
- DeleteItem
- PutItem
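The sketch below shows the two paths with a plain boto3 DynamoDB client as a stand-in, since a DAX client accepts the same calls; the table, key, and attribute names are assumptions.

```python
import boto3

# Stand-in client: a DAX client exposes the same data-plane calls, so a plain
# DynamoDB client keeps this sketch self-contained. All names are assumptions.
client = boto3.client("dynamodb")

# Write-through path: via DAX, the item is written to DynamoDB first and the
# item cache is updated on a successful write.
client.put_item(
    TableName="Orders",
    Item={"pk": {"S": "product#42"}, "price": {"N": "19"}},
)

# Read path: via DAX, an eventually consistent GetItem checks the item cache
# first and only falls through to DynamoDB on a miss.
client.get_item(TableName="Orders", Key={"pk": {"S": "product#42"}})
```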
How does DAX handle data eviction?
DAX handles cache eviction in three different ways. First, it uses a Time-to-Live (TTL) value that denotes the absolute period of time that an item is available in the cache. Second, when the cache is full, a DAX cluster uses a Least Recently Used (LRU) algorithm to decide which items to evict. Third, with the write-through functionality, DAX evicts older values as new values are written through DAX. This helps keep the DAX item cache consistent with the underlying data store using a single API call.
Does DAX work with DynamoDB GSIs and LSIs?
Yes. Just as it does for base tables, DAX caches the result sets from Query and Scan operations run against DynamoDB GSIs and LSIs.
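For illustration, a minimal boto3 sketch of a GSI query whose result set would land in the DAX query cache; the table, index, and attribute names are assumptions, and the same request works unchanged through a DAX resource or client.

```python
import boto3
from boto3.dynamodb.conditions import Key

# Hypothetical table "Orders" with a GSI named "status-index"; the names are
# illustrative assumptions.
table = boto3.resource("dynamodb").Table("Orders")

response = table.query(
    IndexName="status-index",
    KeyConditionExpression=Key("status").eq("SHIPPED"),
)
shipped_orders = response["Items"]
```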
How does DAX handle Query and Scan result sets?
Within a DAX cluster, there are two different caches: 1) item cache and 2) query cache. The item cache manages GetItem, PutItem, and DeleteItem requests for individual key-value pairs. The query cache manages the result sets from Scan and Query requests. In this regard, the Scan/Query text is the “key” and the result set is the “value”. While both the item cache and the query cache are managed in the same cluster (and you can specify different TTL values for each cache), they do not overlap. For example, a scan of a table does not populate the item cache, but instead records an entry in the query cache that stores the result set of the scan.
Does an update to the item cache either update or invalidate result sets in my query cache?
No. The best way to mitigate inconsistencies between result sets in the item cache and the query cache is to set the query cache TTL to a period of time for which your application can tolerate such inconsistencies.
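As a sketch of tuning that TTL, the boto3 call below updates a DAX parameter group; the group name is an assumption, and the parameter names shown ("query-ttl-millis" for the query cache and "record-ttl-millis" for the item cache) should be verified against your cluster's parameter group.

```python
import boto3

dax_ctl = boto3.client("dax")

# Shorten the query cache TTL relative to the item cache TTL. The parameter
# group name and parameter names are assumptions to verify in your account.
dax_ctl.update_parameter_group(
    ParameterGroupName="my-dax-params",
    ParameterNameValues=[
        {"ParameterName": "query-ttl-millis", "ParameterValue": "60000"},    # 1 minute
        {"ParameterName": "record-ttl-millis", "ParameterValue": "300000"},  # 5 minutes
    ],
)
```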
Can I connect to my DAX cluster from outside of my VPC?
The only way to connect to your DAX cluster from outside of your VPC is through a VPN connection.
When using DAX, what happens if my underlying DynamoDB tables are throttled?
If DAX is either reading or writing to a DynamoDB table and receives a throttling exception, DAX will return the exception back to the DAX client. Further, the DAX service does not attempt server-side retries.
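Since the throttling exception surfaces in the application, a common pattern is to back off and retry in the caller. Here is a minimal boto3 sketch, with the table and key names as assumptions; the same handling applies when the call goes through a DAX client.

```python
import time

import boto3
import botocore

client = boto3.client("dynamodb")  # a DAX client exposes the same call shape


def get_with_backoff(key, attempts=5):
    """Retry a throttled read with exponential backoff; names are illustrative."""
    for attempt in range(attempts):
        try:
            return client.get_item(TableName="Orders", Key=key)
        except botocore.exceptions.ClientError as err:
            if err.response["Error"]["Code"] != "ProvisionedThroughputExceededException":
                raise
            time.sleep(0.1 * (2 ** attempt))  # back off before retrying
    raise RuntimeError("read still throttled after retries")
```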
Does DAX support pre-warming of the cache?
DAX utilizes lazy loading to populate the cache: on the first read of an item, DAX fetches the item from DynamoDB and then populates the cache. While DAX does not support cache pre-warming as a feature, the DAX cache can be pre-warmed for an application by running an external script or application that reads the desired data.
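For illustration, a minimal sketch of such a pre-warming script; the table and key names are assumptions, and the calls would be issued through your DAX client so that each cache miss populates the cache (a plain boto3 client is shown as a stand-in).

```python
import boto3

# Stand-in client: point the same BatchGetItem calls at your DAX client so
# every cache miss is fetched from DynamoDB and cached. Names are assumptions.
client = boto3.client("dynamodb")

# Keys you expect to be read-hot once traffic arrives (illustrative).
hot_keys = [{"pk": {"S": f"product#{i}"}} for i in range(1, 501)]

# BatchGetItem accepts at most 100 keys per request.
for start in range(0, len(hot_keys), 100):
    batch = hot_keys[start:start + 100]
    client.batch_get_item(RequestItems={"Orders": {"Keys": batch}})
    # A production script would also retry any UnprocessedKeys in the response.
```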