Caching Flashcards
What is caching?
A high-speed data storage layer that holds a subset of data (typically transient) so that future requests for that data are served faster than they would be from the primary storage layer. It effectively reuses previously retrieved or computed data.
Why use caching?
- Reduce the latency of an expensive computation or call
- In cloud computing, it can reduce costs (retrieving from a cache is often less expensive than retrieving from primary storage)
What are the two popular caching patterns?
1. Cache-aside pattern - First try to fetch the data from the cache. If it's not there, fetch it from the primary store and cache the result.
2. Write-Through and Write-Back patterns - Write directly to the cache, and the cache will either synchronously write through to the primary store (Write-Through) or asynchronously write back to it (Write-Back).
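A minimal sketch of the cache-aside read path, assuming an in-memory dict as the cache. The class name `CacheAside`, the `load_from_primary` callback, and the `primary` dict standing in for a database are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict


class CacheAside:
    """Cache-aside: check the cache first, fill it from the primary store on a miss."""

    def __init__(self, load_from_primary: Callable[[str], str]) -> None:
        self._load = load_from_primary      # hypothetical loader for the primary store
        self._cache: Dict[str, str] = {}
        self.misses = 0                     # counts trips to the primary store

    def get(self, key: str) -> str:
        if key in self._cache:              # cache hit: no trip to the primary store
            return self._cache[key]
        self.misses += 1                    # cache miss: fetch from the primary store
        value = self._load(key)
        self._cache[key] = value            # cache the result for future requests
        return value


primary = {"user:1": "Ada", "user:2": "Grace"}   # stand-in for a database
users = CacheAside(lambda key: primary[key])

first = users.get("user:1")    # miss: loaded from primary and cached
second = users.get("user:1")   # hit: served from the cache
```

Only keys that are actually requested end up in the cache, and the second read of the same key never touches `primary`.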
What are the pros/cons of the Cache-aside pattern?
- Pro - Only cache the data we need
- Con - Data may become stale for datasets that update frequently (this can be mitigated with a TTL)
- Con - When there are a lot of cache misses, it's more work overall and the cache has a negative impact. (So it works better when the results/data are likely to be needed multiple times)
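The staleness con and its TTL mitigation can be sketched as follows. This is an illustration under assumptions, not a production cache: `TTLCache`, the `clock` parameter, and the fake clock used in the demo are hypothetical, and the clock is injectable only so that expiry can be shown without sleeping.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Cache-aside with a time to live: expired entries are reloaded from the primary store."""

    def __init__(self, load: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}   # key -> (expires_at, value)

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:   # present and not yet expired
            return entry[1]
        value = self._load(key)                    # missing or expired: reload
        self._entries[key] = (now + self._ttl, value)
        return value


primary = {"price": "10"}      # stand-in for the primary store
now = [0.0]                    # fake clock so the demo does not need to sleep
cache = TTLCache(lambda key: primary[key], ttl_seconds=30, clock=lambda: now[0])

first = cache.get("price")     # miss: loads "10"
primary["price"] = "12"        # the primary store is updated
stale = cache.get("price")     # entry has not expired, so the old value is returned
now[0] = 31.0                  # move the fake clock past the TTL
fresh = cache.get("price")     # entry expired, so the new value is loaded
```

The TTL bounds how long a stale value can be served; a shorter TTL means fresher data but more cache misses.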
What are the pros/cons of Write-Through and Write-Back caching patterns?
- Pro - Write-Back adds no extra write latency, because the primary store is updated asynchronously after the write to the cache returns.
- Con - For either pattern, the data written to the cache may never be read again.
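A rough sketch contrasting the two write paths, assuming plain dicts for both the cache and the primary store. The class names and the explicit `flush()` method are hypothetical; in a real Write-Back cache the flush would typically run asynchronously (for example on a timer or background worker) rather than being called by hand.

```python
from typing import Dict, Set


class WriteThroughCache:
    """Write-Through: every write updates the cache and the primary store together."""

    def __init__(self, primary: Dict[str, str]) -> None:
        self._primary = primary
        self._cache: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self._cache[key] = value
        self._primary[key] = value      # synchronous: done before write() returns


class WriteBackCache:
    """Write-Back: writes update the cache now and the primary store later."""

    def __init__(self, primary: Dict[str, str]) -> None:
        self._primary = primary
        self._cache: Dict[str, str] = {}
        self._dirty: Set[str] = set()   # keys not yet persisted to the primary store

    def write(self, key: str, value: str) -> None:
        self._cache[key] = value
        self._dirty.add(key)            # primary store is not touched here

    def flush(self) -> None:
        # Stand-in for the asynchronous step that persists pending writes.
        for key in self._dirty:
            self._primary[key] = self._cache[key]
        self._dirty.clear()


db_a: Dict[str, str] = {}
WriteThroughCache(db_a).write("k", "v1")   # db_a is updated immediately

db_b: Dict[str, str] = {}
wb = WriteBackCache(db_b)
wb.write("k", "v1")
pending = "k" not in db_b                  # write has not reached the primary store yet
wb.flush()                                 # now it has
```

The window between `write()` and `flush()` is where Write-Back saves latency, and it is also the window in which the primary store lags behind the cache.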
What is cache invalidation and how can you address it?
Cache invalidation is removing or refreshing cached data that has become stale (because the primary storage has updates not reflected in the cache).
Stale data can be addressed by:
1. Least Recently Used (LRU) - Evict the records that were used least recently first.
2. FIFO (First In, First Out) - Evict the records that were added earliest first.
3. TTL (Time To Live) - Give each item an expiry time and evict it once that time has passed.
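The LRU policy from the list above could be sketched with `collections.OrderedDict` from the Python standard library. The `LRUCache` class, its `capacity` parameter, and its method names are hypothetical and chosen for illustration.

```python
from collections import OrderedDict
from typing import List, Optional


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()   # oldest use first

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)          # a write also counts as a use
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # evict the least recently used entry

    def keys(self) -> List[str]:
        return list(self._items)              # least to most recently used


cache = LRUCache(capacity=2)
cache.put("a", "1")
cache.put("b", "2")
cache.get("a")          # "a" is now more recently used than "b"
cache.put("c", "3")     # cache is full, so "b" is evicted
remaining = cache.keys()
```

Removing the two `move_to_end` calls would turn this into FIFO, since entries would then be evicted purely in insertion order.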