Caching and more Flashcards
Caching
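- It’s a piece of hardware or software that stores data, typically so that the data can be retrieved faster than from its original source
- Cache miss: requested data that could have been in the cache is not found there, so it has to be fetched from the slower source. Some misses are normal (e.g. the first request for an item); frequent misses can be the consequence of a system failure or a poor design choice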
- Cache eviction policies (defines which values get removed from a cache):
* LRU (least recently used): discards the least recently used items first
* FIFO (first in first out): discards the oldest items first, regardless of how often or how recently they were used
* LFU (least frequently used): counts how often each item is accessed. Those that are used least often are discarded first
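The LRU policy above can be sketched in a few lines of Python. This is a minimal illustration, not a production cache: the class name `LRUCache` and its `get`/`put` methods are hypothetical names chosen for this example, and it assumes a single-threaded caller.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key (illustrative)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # OrderedDict keeps insertion order; we treat the end as "most recent".
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default  # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Drop the entry at the front, i.e. the least recently used one.
            self._items.popitem(last=False)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used key
cache.put("c", 3)  # over capacity: "b" is evicted, not "a"
```

A FIFO cache would look the same without the two `move_to_end` calls, since reads and updates would then no longer affect the eviction order.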
CDN (Content Delivery Network)
- It’s a third-party service that acts as a cache for web servers
- CDN’s servers are referred to as POPs (Points of Presence)
- Tools: Cloudflare, Google Cloud CDN, AWS CloudFront
Proxies
Forward Proxy:
- It’s a server that sits between client and server and acts on behalf of the client
- Typically used to mask the client’s identity (IP address)
Reverse proxy:
- It’s a server that sits between client and server and acts on behalf of the server
- Typically used for logging, load balancing, or caching
- Tools: Nginx
Load Balancers - Basics
- A type of reverse proxy that distributes traffic across servers
- Hot spot:
* Occurs when a workload distributed across a set of servers ends up spread unevenly, so some servers receive far more traffic than others
* This can happen if the ‘sharding key’ or ‘hashing function’ is suboptimal, or if the workload is naturally skewed
- Tools: Nginx
Load Balancers - Selection strategy
- Round-robin: distributes traffic through all the servers in sequential turns
- Random selection
- Performance-based selection: chooses the server with the fastest response time or least amount of traffic
- IP-based selection: routes requests from the same client to the same server
- Path-based selection: routes requests according to the URL path to a specific server
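Two of the strategies above, round-robin and IP-based selection, can be sketched as follows. This is only an illustration under stated assumptions: the server names and the functions `pick_round_robin` and `pick_by_ip` are made up for this example, and hashing the client IP with SHA-256 is one possible way to get a stable mapping, not the method any particular load balancer uses.

```python
import hashlib
import itertools

# Hypothetical pool of backend servers.
SERVERS = ["server-a", "server-b", "server-c"]

# Endless iterator that yields the servers in sequential turns.
_round_robin = itertools.cycle(SERVERS)


def pick_round_robin():
    """Round-robin: each call returns the next server in the cycle."""
    return next(_round_robin)


def pick_by_ip(client_ip):
    """IP-based: the same client IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SERVERS)
    return SERVERS[index]
```

Note that the modulo in `pick_by_ip` remaps most clients whenever the number of servers changes, which is the problem consistent hashing and rendezvous hashing are meant to reduce.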
Hashing - Hashing Function and Consistent Hashing
Hashing function:
- A function that takes in a value of a specific data type (such as a string or an identifier) and outputs a number. The same input always produces the same output
- Different inputs may have the same output. A good hashing function attempts to minimize hashing collisions
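A deliberately weak hashing function makes collisions easy to see. The function `toy_hash` below is a made-up example for illustration only: it sums the bytes of a string, so any two strings with the same characters in a different order collide.

```python
def toy_hash(text, buckets=8):
    """Toy hash: sum of the UTF-8 bytes, reduced to a bucket number."""
    return sum(text.encode("utf-8")) % buckets


# Same bytes in a different order give the same sum, hence a collision.
collides = toy_hash("ab") == toy_hash("ba")
```

A good hashing function spreads inputs far more evenly, so that similar inputs rarely end up with the same output.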
Consistent hashing:
- Minimizes the number of keys that need to be remapped when a hash table gets resized
- Used by LBs to distribute traffic to servers. It minimizes the number of requests that get remapped to a different server when servers are added or removed
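A common way to implement consistent hashing is a hash ring: servers and keys are hashed onto the same circle, and each key belongs to the first server found clockwise from it. The sketch below assumes that approach; the class `HashRing`, its methods, and the `replicas` parameter (the number of virtual points per server) are hypothetical names for this example.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring with virtual points per server."""

    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas
        self._points = []  # sorted positions on the ring
        self._owner = {}   # position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            if point in self._owner:
                continue  # extremely unlikely collision: keep the existing owner
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            if self._owner.get(point) == server:
                del self._owner[point]
                self._points.remove(point)

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring has no servers")
        # First point clockwise from the key, wrapping around at the end.
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]


ring = HashRing(["server-a", "server-b", "server-c"])
owner_before = ring.lookup("user-42")
ring.add("server-d")
owner_after = ring.lookup("user-42")  # either unchanged or "server-d"
```

When a server is added, the only keys that move are the ones that now belong to the new server; every other key keeps its previous owner.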
Hashing - Rendezvous Hashing and SHA
Rendezvous Hashing:
- Also called highest random weight (HRW) hashing
- Allows for minimal redistribution of mappings when a server goes down: only the keys that were mapped to that server get reassigned
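Rendezvous hashing can be sketched very compactly: for each key, compute a weight for every server and pick the server with the highest weight. In the sketch below, the names `_weight` and `pick_server` are hypothetical, and using SHA-256 over the server name and key as the weight is an assumption made for this example.

```python
import hashlib
from typing import Sequence


def _weight(server: str, request_key: str) -> int:
    """Pseudo-random weight for a (server, key) pair."""
    data = f"{server}|{request_key}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def pick_server(request_key: str, servers: Sequence[str]) -> str:
    """Return the server with the highest weight for this key."""
    if not servers:
        raise ValueError("no servers available")
    return max(servers, key=lambda server: _weight(server, request_key))


all_servers = ["server-a", "server-b", "server-c"]
chosen = pick_server("user-42", all_servers)

# If a different server goes down, "user-42" keeps its current server.
survivors = [s for s in all_servers if s == chosen or s == all_servers[0]]
still_chosen = pick_server("user-42", survivors)
```

Because each server's weight for a key does not depend on the other servers, removing one server only affects the keys for which it had the highest weight.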
SHA (Secure Hash Algorithms):
- Collection of cryptographic hash functions used in the industry
- SHA-3 is the newest family in the collection and a popular choice; SHA-2 (e.g. SHA-256) also remains widely used
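Python's standard `hashlib` module exposes both SHA-2 and SHA-3 functions, which makes it easy to try them out. The message below is an arbitrary example value.

```python
import hashlib

message = b"hello world"  # arbitrary example input

# SHA-2 family, 256-bit digest.
sha2_hex = hashlib.sha256(message).hexdigest()

# SHA-3 family, 256-bit digest.
sha3_hex = hashlib.sha3_256(message).hexdigest()

# Both digests are 64 hex characters long, but the values differ
# because the two families use different algorithms.
print(len(sha2_hex), len(sha3_hex))
```

The same input always produces the same digest, while even a one-character change in the input produces a completely different one.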