System Design Week 2 - Distributed Cache Flashcards
Problem 1) Distributed Cache
Question 1) List the functional requirements
- put(key, value)
- get(key)
- Support an LRU eviction policy (or make the policy configurable)
- Support persistence.
- Ignore security
Problem 1) Distributed Cache
Question 2) List the non-functional requirements
- Scalable
- Available
- High Performance
Problem 1) Distributed Cache
Question 3) List the microservices covering all functional requirements
- Cache client - Not really a service, but a library
- Cache Server - A collection of server components that manage the cache.
Problem 1) Distributed Cache
Question 4) Logical Diagram & flow of data
Diagram: https://drive.google.com/file/d/1H_oF4bNQDzG57k0cl2MPgun4gLNbv_U-/view?usp=sharing
Problem 1) Distributed Cache
Question 5) Schema Design
Concrete schema is not required here.
Data persistence:
Main memory is volatile, so the data must also be stored on secondary storage (disk).
● Two common techniques
- Append-only file/log
■ Data is flushed to disk on every write, or
■ Data is flushed to disk at a fixed time interval (e.g., 1 sec)
- Snapshot
■ A snapshot of the entire dataset is created at a configurable interval
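The append-only technique above can be sketched as follows. This is a minimal illustration, not a production design: the class and method names (AppendOnlyLog, append_put, replay) and the JSON-lines record format are assumptions for the example, and the "every 1 sec" flush policy is only indicated by a comment.

```python
import json, os

class AppendOnlyLog:
    """Toy append-only log: every put is appended as one JSON line."""

    def __init__(self, path, fsync_policy="always"):
        self.fsync_policy = fsync_policy
        self.f = open(path, "a", encoding="utf-8")

    def append_put(self, key, value):
        self.f.write(json.dumps({"op": "put", "k": key, "v": value}) + "\n")
        self.f.flush()
        if self.fsync_policy == "always":
            # Durable on every write; with an "everysec" policy a background
            # timer would call fsync roughly once per second instead.
            os.fsync(self.f.fileno())

    @staticmethod
    def replay(path):
        """Rebuild the in-memory state after a restart by replaying the log."""
        state = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                rec = json.loads(line)
                if rec["op"] == "put":
                    state[rec["k"]] = rec["v"]
        return state
```

A snapshot, by contrast, would serialize the whole `state` dict to disk at a configurable interval; replay then starts from the latest snapshot and applies only the log records written after it.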
Problem 1) Distributed Cache
Question 6) API Design
- put(key, value)
- get(key)
Problem 1) Distributed Cache
Question 7) Business logic for the problem statement
What is LRU?
→ A combination of a doubly linked list (DLL) and a hash table, or a LinkedHashMap in Java
● DLL nodes hold the key; the hash table maps each key to its node
● On access
- Move the key to the head of the DLL
● To evict
- Remove the key from the tail
- Delete from the Hash Table
● Update existing value
- Update the Hash Table
- Move the key to the head of the DLL
● Adding a new key
- Add to Hash Table
- Add the key to the head of the DLL
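The access/evict/update/add steps above can be sketched as a hash table mapping keys to DLL nodes, with the most recently used key at the head. Names here (LRUCache, _Node) are illustrative, not from the source.

```python
class _Node:
    """Doubly-linked-list node holding a key/value pair."""
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}                        # key -> _Node
        self.head = _Node()                  # sentinel: most recently used end
        self.tail = _Node()                  # sentinel: least recently used end
        self.head.next, self.tail.prev = self.tail, self.head

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev

    def _push_front(self, node):
        node.prev, node.next = self.head, self.head.next
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        node = self.map.get(key)
        if node is None:
            return None
        self._unlink(node)                   # on access: move to the head
        self._push_front(node)
        return node.value

    def put(self, key, value):
        node = self.map.get(key)
        if node is not None:                 # update: new value, move to head
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self.map) >= self.capacity:   # evict: remove from the tail
            lru = self.tail.prev
            self._unlink(lru)
            del self.map[lru.key]            # and delete from the hash table
        node = _Node(key, value)             # add: hash table + head of DLL
        self.map[key] = node
        self._push_front(node)
```

Both `get` and `put` are O(1): the hash table gives direct access to the node, and the sentinel head/tail make unlink and push-front constant-time pointer updates.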
Problem 1) Distributed Cache
Question 8) Design Consideration for the problem statement
https://drive.google.com/file/d/1fN2-Lui7LEgOg89PfgGhHYbNsS1hpwQs/view?usp=sharing
and
https://drive.google.com/file/d/1BJTqYpEEweBbf4gl_2KuE3LBHalZhlHd/view?usp=sharing
The same checklist of scaling questions can be posed in any interview:
○ Need to scale for storage (storage and cache tiers) - Yes
○ Need to scale for throughput (CPU/IO) - Yes
○ Need to scale for API parallelization - No
○ Need to remove hotspots - Maybe
○ Availability and Geo-distribution - Yes
Problem 2) How to Scale
Question 1) Replication
- Copy the data to multiple read replicas
- All writes are handled by the primary; reads can be served by any replica
- Strong consistency or weak/eventual consistency?
■ When the primary sends the write acknowledgement back to the user depends
on this choice.
■ Strong consistency adds the complexity of two-/three-phase commit or Paxos.
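The ack-timing trade-off can be shown with a toy in-process primary/replica pair. This is a sketch under heavy assumptions (no network, no failures, names Primary/Replica invented for the example): in sync mode the write is applied to every replica before acking, in async mode the primary acks immediately and replicates later.

```python
class Replica:
    """In-process stand-in for a read replica."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

class Primary:
    def __init__(self, replicas, sync=True):
        self.data = {}
        self.replicas = replicas
        self.sync = sync
        self.pending = []                      # writes not yet replicated (async mode)

    def put(self, key, value):
        self.data[key] = value
        if self.sync:
            for r in self.replicas:            # strong path: replicate, then ack
                r.apply(key, value)
        else:
            self.pending.append((key, value))  # eventual path: ack now, replicate later
        return "ack"

    def flush(self):
        """Background replication step for the async (eventual) mode."""
        for key, value in self.pending:
            for r in self.replicas:
                r.apply(key, value)
        self.pending.clear()
```

In the async mode a replica can serve stale reads until `flush` runs; that staleness window is exactly what "eventual consistency" means here, and closing it (acking only after all replicas apply) is what 2PC/3PC/Paxos-style protocols coordinate under failures.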
Problem 2) How to Scale
Question 2) Sharding
Distribute data across multiple cache servers
- Consistent hashing or equivalent techniques to distribute keys to hosts.
- How do the clients know which server to get the data from?
■ Central router
● Client connects to central router
● Router knows which node has which keys, routes the request
■ Decentralized
● Client knows all servers (but not the mapping of keys)
● Client picks a server randomly or round robin etc
● Every server knows which keys map to which node and routes the
requests appropriately.
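The consistent-hashing step mentioned above can be sketched as a hash ring with virtual nodes. This is an illustrative implementation, not the source's: the names (HashRing, get_node), the use of MD5, and the virtual-node count are all assumptions.

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring: each server owns many points ("virtual
    nodes") on the ring; a key is served by the first point clockwise."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self.ring = []                     # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.vnodes):       # place vnodes points for this server
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self.ring, (h, ""))   # first point clockwise of h
        if idx == len(self.ring):                 # wrap around the ring
            idx = 0
        return self.ring[idx][1]
```

With this scheme, adding or removing one server only remaps the keys between its points and their clockwise neighbors, instead of reshuffling every key as naive `hash(key) % N` would; a central router and a decentralized design can both use the same ring to answer "which server owns this key".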