System Design Concepts Flashcards
What is Horizontal Scaling?
Scaling by adding more servers to your pool of resources
What is the advantage of Horizontal Scaling?
Horizontal scaling makes it easier to scale dramatically: you add more commodity servers instead of hitting the hardware limits of a single machine
What is Vertical Scaling?
Scaling by adding more power (CPU, RAM, Storage) to existing server
What is a Load Balancer?
Helps spread traffic across a cluster of servers in order to reduce individual server load and prevent any one server from becoming a single point of failure
LBs are intended to improve overall application availability and responsiveness
What type of scaling does Load Balancing introduce?
Horizontal Scaling
Where can you add Load Balancers?
Lots of places! Between the user and the web servers, between web servers and the internal platform layer (ex: application servers, cache servers), and between the internal platform layer and the database
What are the benefits of using a Load Balancer?
- Users experience faster, uninterrupted service (not waiting on a single server)
- Less downtime and higher throughput (even full server failure won’t impact end user)
- Fewer failed or stressed components since no single device is performing a lot of work
How does the Load Balancer choose a backend server?
- Health Checks: LBs should only forward traffic to healthy hosts
- Predetermined Algorithm
What is Least Connected Method, and when is it useful?
A LB method that directs traffic to the server with the fewest active connections
This is useful when there is a large number of persistent client connections that are unevenly distributed between the servers
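A minimal sketch of least-connections selection in Python (the server names and connection counts are hypothetical; a real LB tracks live connection counts itself):

```python
def least_connections(servers):
    """Pick the server with the fewest active connections.

    `servers` maps server name -> current active connection count.
    """
    return min(servers, key=servers.get)

# Hypothetical pool: web-2 currently has the fewest active connections.
active = {"web-1": 12, "web-2": 3, "web-3": 7}
print(least_connections(active))  # -> web-2
```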
What is Least Response Time Method?
A LB method that directs traffic to the server with the fewest active connections and the lowest average response time
What is Least Bandwidth Method?
A LB method that selects the server that is currently serving the least amount of traffic, measured in megabits per second (Mbps)
What is Round Robin Method and when is it useful?
A LB method that cycles through a list of servers and sends each new request to the next server. When it reaches the end it starts from the beginning
This is most useful when the servers are of equal specification and there are not many persistent connections
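The cycling behavior can be sketched with Python's `itertools.cycle` (server names are illustrative):

```python
import itertools

servers = ["web-1", "web-2", "web-3"]  # hypothetical pool
rr = itertools.cycle(servers)          # wraps back to the start automatically

# Each new request goes to the next server in the list.
assignments = [next(rr) for _ in range(5)]
print(assignments)  # -> ['web-1', 'web-2', 'web-3', 'web-1', 'web-2']
```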
What is Weighted Round Robin Method and when is it useful?
A LB method in which each server is assigned a weight that indicates its processing capacity. Servers with higher weights receive new connections before those with lower weights (and get more connections overall)
This is useful in the case where you have servers with different processing capabilities, and there are not many persistent connections
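One simple way to sketch weighted round robin is to repeat each server in the rotation once per unit of weight (the weights here are hypothetical):

```python
import itertools

# Hypothetical weights: web-1 has twice the capacity of web-2.
weights = {"web-1": 2, "web-2": 1}

# Expand each server into the rotation once per unit of weight.
schedule = [server for server, w in weights.items() for _ in range(w)]
wrr = itertools.cycle(schedule)

sequence = [next(wrr) for _ in range(6)]
print(sequence)  # -> ['web-1', 'web-1', 'web-2', 'web-1', 'web-1', 'web-2']
```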
What is IP Hash Method?
A LB method that uses the hash of the IP address of the client to redirect the request to a server
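A minimal sketch of the idea (MD5 as the hash function and the server list are arbitrary choices, not part of any standard):

```python
import hashlib

servers = ["web-1", "web-2", "web-3"]  # hypothetical pool

def pick_server(client_ip):
    # Hash the client IP and map it onto the server list; the same IP
    # lands on the same server as long as the pool size is unchanged.
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The mapping is deterministic ("sticky") per client IP.
assert pick_server("203.0.113.7") == pick_server("203.0.113.7")
```

This stickiness is useful for session affinity, but note the same caveat as hash-based partitioning: changing the number of servers remaps most clients.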
What are Redundant Load Balancers?
The Load Balancer itself can be a single point of failure. To overcome this, a second load balancer can be connected to the first to form a cluster
Each LB monitors the health of the other, and since both are capable of serving traffic and failure detection, the second LB can take over if the main LB fails
What is Cache and what principle does it take advantage of?
Short term memory.
It has a limited amount of space, but is typically faster than the original data source.
Cache takes advantage of the locality of reference principle: recently requested data is likely to be requested again
What is Application Server Cache?
This involves placing a cache directly on a request layer node, to enable the local storage of response data
The cache on a request layer node can be located both in memory (very fast) and on the node’s local disk (still faster than going to network storage)
How do you handle cache with multiple server nodes?
It is possible to have each node host its own cache
However, if your LB randomly distributes requests across the nodes, the same request will go to different nodes, thus increasing cache misses
Two choices for overcoming:
- Global caches
- Distributed caches
What is CDN?
Content Distribution Network
CDNs are a kind of cache that comes into play for sites serving large amounts of static media
Typical set up:
- Request asks CDN for piece of static media (file, image)
- CDN serves if available
- If not available, CDN queries BE server for the file, caches and serves
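The set up above is essentially cache-aside logic, which can be sketched like this (`origin_fetch` is a hypothetical stand-in for querying the backend server):

```python
cdn_cache = {}

def serve(path, origin_fetch):
    if path in cdn_cache:
        return cdn_cache[path]      # CDN serves if available
    content = origin_fetch(path)    # otherwise query the backend server...
    cdn_cache[path] = content       # ...cache the file...
    return content                  # ...and serve it
```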
What is Cache Invalidation and why is it important?
Cache must be kept coherent with the source of truth (the database). If the data is modified in the database, it should be invalidated in the cache (removed or updated) otherwise this can cause inconsistent application behavior
What is Write Through Cache and what are its pros and cons?
Data is written to the cache and the corresponding database at the same time
Pros:
- Allows for fast retrieval and complete data consistency between cache and storage
- Minimizes risk of data loss
Cons:
- Higher latency for write operations (since done twice)
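A minimal write-through sketch, with plain dicts standing in for the cache and the database:

```python
class WriteThroughCache:
    def __init__(self, store):
        self.store = store  # stands in for the database
        self.cache = {}

    def write(self, key, value):
        # Both writes complete before the operation returns: full
        # consistency, but the write pays for two stores (higher latency).
        self.cache[key] = value
        self.store[key] = value

    def read(self, key):
        if key in self.cache:
            return self.cache[key]   # cache hit
        value = self.store[key]      # miss: fall back to storage
        self.cache[key] = value
        return value
```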
What is Write Around Cache and what are its pros and cons?
Data is written directly to permanent storage, bypassing the cache
Pros:
- Can reduce the cache being flooded with write operations that will not be re-read
Cons:
- Read requests for recently written data will create “cache misses” and must be read from storage – causing higher latency
What is Write Back Cache and what are its pros and cons?
Data is written to cache alone and completion is immediately confirmed to the client. The write to permanent storage is done after specified intervals or under certain conditions
Pros:
- Low latency and high throughput for write intensive apps
Cons:
- Comes with risk of data loss because the only copy of written data is in cache
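A minimal write-back sketch (again with dicts as stand-ins); note that anything still in `dirty` is lost if the cache fails before `flush()` runs:

```python
class WriteBackCache:
    def __init__(self, store):
        self.store = store  # stands in for permanent storage
        self.cache = {}
        self.dirty = set()  # written to cache but not yet to storage

    def write(self, key, value):
        # Completion is confirmed as soon as the cache is updated.
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Run at intervals or under certain conditions (e.g. cache pressure).
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()
```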
What is FIFO Cache Eviction?
First In, First Out
The cache evicts the first block accessed first without any regard to how often or how many times it was accessed before
What is LIFO Cache Eviction?
Last In, First Out
The cache evicts the block added most recently first, without any regard to how often or how many times it was accessed before
What is LRU Cache Eviction?
Least Recently Used
The cache discards the least recently used items first
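A compact LRU sketch using Python's `OrderedDict`, where the least recently used key sits at the front and is evicted first:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None                    # cache miss
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
```

With capacity 2, putting a and b, reading a, then putting c evicts b, since a was used more recently.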
What is MRU Cache Eviction?
Most Recently Used
The cache discards the most recently used items first
What is LFU Cache Eviction?
Least Frequently Used
Counts how often an item is needed. Those that are used least often are discarded first
What is Random Replacement Cache Eviction?
Randomly selects a candidate item and discards it to make space when necessary
What is Data Partitioning?
The process of splitting up a database or table across multiple machines to improve the manageability, performance, availability and load balancing of an application
The claim is that past a certain scale point, it is cheaper and more feasible to scale horizontally by adding more machines than to grow vertically by adding beefier servers
What is Horizontal Partitioning and what are its downsides?
Also called data sharding
Where you put different rows of the same table onto different machines
The key problem with this approach is that if the value whose range is used for partitioning isn’t chosen carefully, then the partitioning scheme will lead to unbalanced servers (ex: messages for IBM sharded by org ID)
What is Vertical Partitioning and what are its downsides?
Divide our data to store tables related to a specific feature on their own server – for example, storing user profile info on one DB, friends on another, and photos on a third
This is straightforward to implement and has a low impact on the application
The main problem is that if the application experiences additional growth, then it may be necessary to further partition a feature specific DB across various servers
What is Dictionary Based Partitioning?
A loosely coupled approach – create a lookup service which knows your current partitioning scheme and abstracts it away from the database access code
To find out where a particular data entry resides, we query the directory server that holds the mapping from tuple key to DB server
Means we can perform tasks like adding server or changing partitioning scheme without impacting the application
What is Key or Hash Based Partitioning?
Apply a hash function to a key attribute of the entity we are storing (ex: with 100 servers, id % 100 yields the server number)
Ensures a uniform distribution of data among servers
The fundamental problem is that it effectively fixes the total number of database servers, since adding new servers means changing the hash function and remapping existing data – this can be solved by using consistent hashing
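The id % 100 example from this card, sketched directly (the fleet size of 100 servers is the card's hypothetical):

```python
NUM_SERVERS = 100  # hypothetical fleet size from the example above

def server_for(entity_id):
    # Uniformly spreads ids across servers, but changing NUM_SERVERS
    # remaps almost every id -- the problem consistent hashing addresses.
    return entity_id % NUM_SERVERS

print(server_for(12345))  # -> 45
```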
What is List Partitioning?
Each partition is assigned a list of values, so whenever we want to insert a new record, we will see which partition contains our key and store it there
What is Round Robin Partitioning?
Very simple strategy that ensures uniform data distribution
With “n” partitions, the “i”-th tuple is assigned to partition (i % n)
What is Composite Partitioning?
Combining other partitioning schemes to devise a new scheme (ex: first applying a list partitioning scheme and then a hash based partition)
Consistent hashing could be considered a composite of hash + list partitioning where the hash reduces the key space to a size that can be listed
What are the common problems of data partitioning?
- Joins and Denormalization: not feasible to perform joins that span partitions (work around is to denormalize - but comes with risk of data inconsistency)
- Referential Integrity: normally cannot enforce data integrity constraints (such as foreign keys) across partitions. Must be enforced in the application code
- Rebalancing: many reasons we need to change our partitioning scheme which means moving data. Dictionary based partitioning helps but adds complexity to system and creates a single point of failure
What is a Proxy Server?
A proxy server is a piece of software or hardware that acts as an intermediary for clients seeking resources from our servers
Proxies are used to filter requests, log requests or sometimes transform requests (adding/removing headers, encrypting/decrypting, compressing)
What is an Open Proxy?
A proxy server that is accessible by any internet user
Types:
- Anonymous proxy: reveals its identity as a server but does not disclose the client’s IP address
- Transparent proxy: identifies itself as a proxy and, via HTTP headers, makes the client’s original IP address visible
What is a Reverse Proxy?
Retrieves resources on behalf of a client from one or more servers
These resources are then returned to the client, appearing as if they originated from the proxy server itself
What is Redundancy?
The duplication of critical components or functions of a system with the intention of increasing the reliability of a system, usually in the form of a back-up or fail safe
Plays a key role in removing the single points of failure in a system and provides backups if needed in a crisis
What is Replication?
Replication means sharing information to ensure consistency between redundant resources, to improve reliability, fault-tolerance, or accessibility
Replication is widely used in many DB management systems, generally with a primary/copy relationship, where the primary receives all the updates, which then ripple through to the copies; each copy outputs a message indicating a successful update
What is a SQL Database and When Should You Use It?
Relational database that stores data in rows and columns
Each row contains all the information about one entity, and each column represents a separate data point about it
You should use a SQL DB if you:
- Need to ensure ACID compliance – reducing anomalies and protecting the integrity of your database by prescribing exactly how transactions interact with the database
- Your data is structured and unchanging - if your business is not experiencing massive growth that would require more servers and if you’re only working with data that is consistent
What is a NoSQL Database and When Should You Use It?
Non-relational databases are unstructured, distributed and have a dynamic schema, like file folders
You should use a NoSQL DB if you:
- Are storing large volumes of data that has little to no structure
- Want to make the most of cloud computing and storage
- Are rapidly developing. NoSQL doesn’t need to be prepped ahead of time, and makes it easy if you are making quick iterations which require frequent updates to the data structure
NoSQL vs SQL: Storage
SQL: stores data in tables where each row represents an entity and each column represents a data point about that entry
NoSQL: has different storage models. Main ones are key-value, document, graph and columnar
NoSQL vs SQL: Querying
SQL: each record conforms to a fixed schema, meaning columns must be decided before data entry and each row must have data for each column. The schema can be altered, but that means modifying the whole database and going offline
NoSQL: schemas are dynamic. Columns can be added on the fly and each “row” equivalent doesn’t have to contain data for each “column”
NoSQL vs SQL: Scalability
SQL: In most situations SQL databases are vertically scalable in terms of hardware – which can get expensive. You can also horizontally scale, but that is also expensive both in terms of cost and complexity
NoSQL: Horizontally scalable, meaning we can add more servers easily. A lot of NoSQL technologies also distribute data across servers automatically
NoSQL vs SQL: Reliability
SQL: the vast majority of SQL databases are ACID compliant, so when it comes to data reliability and safe guarantee of performing transactions, SQL databases are a better bet
NoSQL: most NoSQL solutions sacrifice ACID compliance for performance and scalability
What is CAP Theorem?
States it is impossible for a distributed software system to simultaneously provide more than 2 of the following:
- Consistency: all nodes see the same data at the same time (achieved by updating several nodes before reads)
- Availability: every request gets a response indicating success or failure (achieved by replicating data across different servers)
- Partition Tolerance: system continues to work despite message loss or partial failure (achieved by data being sufficiently replicated across combos of nodes and networks)
What is Consistent Hashing?
A distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table, by assigning them a partition on an abstract circle, or hash ring
Allows adding or removing hosts while remapping only a small fraction of keys, instead of migrating the entire partitioning scheme
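A minimal hash-ring sketch (no virtual nodes, and MD5 is an arbitrary hash choice; server names are illustrative):

```python
import bisect
import hashlib

def _point(key):
    # Map any string onto a position on the abstract circle.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, servers):
        # Each server owns the arc of the circle ending at its hash position.
        self.ring = sorted((_point(s), s) for s in servers)

    def lookup(self, key):
        # First server clockwise from the key's position; wrap around.
        i = bisect.bisect(self.ring, (_point(key), ""))
        return self.ring[i % len(self.ring)][1]

ring = HashRing(["cache-1", "cache-2", "cache-3"])
server = ring.lookup("user:42")  # the same key always maps to the same server
```

Adding a host only reassigns the keys on one arc of the circle, rather than remapping the whole keyspace the way id % n does.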
What is the regular HTTP Request Sequence?
- Client opens a connection and requests data from server
- Server calculates the response
- Server sends the response back to the client
What is Ajax Polling HTTP Request Sequence?
- The client repeatedly polls (or requests) a server for data
- The client makes request and waits for server to respond – if no data is available, empty response is returned
- This repeats at regular intervals
The downside to this sequence:
- Client has to keep asking the server for new data
- As a result, a lot of responses are empty, creating HTTP overhead
What is HTTP Long Polling?
A variation of the typical polling technique that allows the server to push information to a client whenever the data is available
Same as normal polling, but expectation that the server may not respond immediately (“hanging GET”)
Once a response is delivered, the client immediately sends a new request, so the server always has a pending request available to answer
What are websockets?
Provide a persistent connection between a client and a server that both parties can use to start sending data at any time
Websocket protocol enables communication between a client and a server with lower overheads, facilitating real time data transfer from and to the server
What are Server Sent Events?
Under SSEs the client establishes a persistent and long term connection with the server
The server uses this connection to send data to a client – but if the client wants to send data to the server, it would require the use of another protocol/technique to do so
Best when we need real-time traffic from the server to the client
What is Latency?
Latency is commonly understood as the time taken by the “round trip” of a network request
Latency is the inverse of speed: you want higher speeds and lower latency
What is Throughput?
This can be understood as the maximum capacity of the machine or system
A system is only as fast as its slowest bottleneck
You can increase throughput by buying more hardware (horizontal scaling) or increasing the capacity and performance of your existing hardware (vertical scaling)
Think about ways to scale the throughput of a given system, including by splitting up the load and distributing it across other resources