System Design Flashcards
What are CRUD operations?
Create, Read, Update, Delete
Often the bedrock of a functioning system, and thus at the core of many APIs
What is pagination and why is it necessary?
When a network request would produce a very large response, the API can be designed to return a limited portion of that response, accompanied by an identifier (such as a page token) that the client uses to request the next page if desired
Often used when designing API list endpoints
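A minimal sketch of token-based pagination, assuming an in-memory data set. ITEMS, list_items, fetch_all and the next_page_token field are hypothetical names chosen for illustration, not part of any particular API.

```python
from typing import Optional

ITEMS = [f"item-{i}" for i in range(1, 8)]  # hypothetical data set of 7 items


def list_items(page_size: int = 3, page_token: Optional[str] = None) -> dict:
    """Server side: return one page of ITEMS plus a token for the next page."""
    start = int(page_token) if page_token is not None else 0
    end = start + page_size
    # No token is returned once the last item has been served.
    next_token = str(end) if end < len(ITEMS) else None
    return {"items": ITEMS[start:end], "next_page_token": next_token}


def fetch_all(page_size: int = 3) -> list:
    """Client side: follow next_page_token until the server stops returning one."""
    results, token = [], None
    while True:
        page = list_items(page_size, token)
        results.extend(page["items"])
        token = page["next_page_token"]
        if token is None:
            return results
```

Here the token is simply an offset; real APIs often use an opaque cursor so clients cannot depend on its meaning.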
What is an ACL?
Access Control List
Refers to a permission model that specifies which users in a system can perform which operations
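A minimal sketch of an ACL check, assuming the list is a plain mapping from user to allowed operations. The users, operations and the is_allowed helper are hypothetical.

```python
# Hypothetical ACL: each user maps to the set of operations they may perform.
ACL = {
    "alice": {"read", "write", "delete"},
    "bob": {"read"},
}


def is_allowed(user: str, operation: str) -> bool:
    """Return True only if the ACL explicitly grants the operation to the user."""
    # Unknown users get an empty set, so access is denied by default.
    return operation in ACL.get(user, set())
```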
What is a DDoS attack?
Distributed Denial of Service attack
A DoS attack where the traffic flooding the target system comes from many different sources, making it much harder to defend against
What is a DoS attack?
Denial of Service attack
Attack where a malicious user tries to bring down or damage a system in order to render it unavailable to users, often by flooding the target system with traffic
What is rate limiting?
The act of limiting the number of requests sent to/from a system
Often used to prevent DoS attacks
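One common way to limit requests is a token bucket. The sketch below is one possible approach among several (fixed windows and sliding windows are alternatives); the TokenBucket class and its parameters are hypothetical names.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock          # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self) -> bool:
        """Return True if the request may proceed, consuming one token."""
        now = self.clock()
        # Refill in proportion to the time elapsed, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would typically keep one bucket per client (for example per IP address or API key) and reject requests for which allow() returns False.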
What is streaming in terms of networking?
The act of continuously getting a feed of information from a server by keeping an open connection between the two machines
What is polling?
The act of fetching data regularly at an interval to make sure the data is not too stale
What is a socket?
A kind of file that acts like a stream
Processes can read/write to sockets and communicate in this manner
Often a front for a TCP connection
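A small sketch of two endpoints communicating through sockets. It uses socket.socketpair(), which returns two already-connected sockets inside one process (Unix domain sockets on most Unix systems rather than TCP), so it only illustrates the read/write stream behaviour.

```python
import socket

# Two connected endpoints; in practice they would usually belong to different processes.
left, right = socket.socketpair()
try:
    left.sendall(b"ping")       # write to one end...
    received = right.recv(1024)  # ...and read from the other
    right.sendall(b"pong")
    reply = left.recv(1024)
finally:
    left.close()
    right.close()
```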
What is the gossip protocol?
When a set of machines talk to each other in an uncoordinated manner in a cluster to spread information through a system without requiring a central source of data
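A toy simulation of gossip-style spreading, assuming every informed node tells a fixed number of random peers each round. gossip_rounds, fanout and seed are hypothetical names; real gossip protocols also handle failures, membership changes and message versioning.

```python
import random


def gossip_rounds(num_nodes: int, fanout: int = 2, seed: int = 0) -> int:
    """Return how many rounds a rumour starting at node 0 needs to reach every node."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    informed = {0}
    rounds = 0
    while len(informed) < num_nodes:
        newly_informed = set()
        for node in informed:
            # Each informed node picks its own random peers; no coordinator is involved.
            peers = [n for n in range(num_nodes) if n != node]
            newly_informed.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly_informed
        rounds += 1
    return rounds
```

Because the informed set can at most triple per round with a fanout of 2, the number of rounds grows roughly logarithmically with the cluster size.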
What is a peer-to-peer network?
A collection of machines referred to as peers that divide workload between themselves to complete the workload faster
Often used in file distribution systems
What is blob storage?
Widely used storage that only allows the user to store and retrieve data based on the name of the blob
eg. GCS and AWS S3
What is a key value store?
A flexible NoSQL db often used for caching and dynamic configuration
Examples:
- Etcd
- Zookeeper
- Redis
What is the difference between strong consistency vs eventual consistency?
Strong consistency
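- Usually refers to the consistency of ACID transactions, as opposed to eventual consistency
- Reads always reflect the most recent committed write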
Eventual consistency
- Database reads may return stale data
- An eventually consistent database gives guarantees that the state of the db will eventually reflect writes within a certain time period
What is an ACID transaction?
A type of db transaction that has the following properties
Atomicity
- The operations that constitute the transaction will either all succeed or all fail
Consistency
- Transaction cannot bring db to an invalid state
- After the transaction is committed/rolled back, the rules for each record will still apply
Isolation
- Execution of multiple transactions concurrently will have the same effect as if they had been executed sequentially
Durability
- Any committed transaction is written to non-volatile storage and will not be undone by a crash, power loss or network partition
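Atomicity and consistency can be seen in a small sketch using Python's built-in sqlite3 module. The accounts table, the names in it and the transfer helper are hypothetical; the CHECK constraint stands in for a rule that must always hold.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(amount: int) -> bool:
    """Move `amount` from alice to bob; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'", (amount,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance, so bob's credit is undone too.
        return False


def balances() -> dict:
    """Return the current balance of every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer that would overdraw alice fails as a whole: bob is not credited even though his update ran first.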
What is the difference between relational vs non-relational databases?
Relational db
- Structured db where data is stored following a tabular format
- Often referred to as SQL dbs
- Often supports powerful querying using SQL
Non-relational db
- Database that is free from imposed, tabular-like structure
- Often referred to as NoSQL dbs
What are the different ways that data can be stored? Explain the difference
Disk
- Data will persist through power failures and machine crashes
- Can refer to either HDD or SSD
- AKA persistent storage
Memory
- Refers to RAM (Random Access Memory)
- Data stored in memory will be lost through power failures and machine crashes
What are databases?
Programs that use either disk or memory to store and query data
In general, they are themselves servers that are long-lived and interact with the rest of your application through network calls, with protocols on top of TCP or HTTP
What is the single responsibility principle?
A single component having one responsibility and executing it perfectly
This approach provides flexibility and makes management easier
What is separation of concerns?
Keeping components separate/loosely-coupled makes them reusable
This approach makes scaling the service easier
What is client-server architecture?
Uses the request-response model
The client sends a request to the server for information and the server responds with it
What is a client?
A machine/process that requests data from a server
A single machine can be both client and server at the same time
ie. act as a server for users and a client for a db
eg. web app, mobile app, web-based console running commands to interact with the backend server
What is a server?
A machine/process that provides data for a client, usually by listening to incoming network calls
A single machine can be both client and server at the same time
ie. act as a server for users and a client for a db
eg. app server, proxy server, mail server, file server
What is an IP address?
An address given to each machine connected to the public internet
Special values
- Localhost: 127.0.0.1
- Your private network: 192.168.x.x
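The special ranges above can be checked with Python's standard ipaddress module. The classify helper is a hypothetical name, and labelling everything else "public" is a simplification (it ignores multicast and other reserved ranges).

```python
import ipaddress


def classify(address: str) -> str:
    """Label an IP address as loopback, private or public (simplified)."""
    ip = ipaddress.ip_address(address)
    if ip.is_loopback:  # checked first: 127.0.0.0/8 is also reported as private
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"
```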
Why are there different ports?
In order for multiple programs to listen for new network connections on the same machine without colliding, they pick a port to listen on
Common ports and their uses
- 22: SSH
- 53: DNS lookup
- 80: HTTP
- 443: HTTPS
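A small sketch of the collision that ports prevent, assuming a machine where binding to the loopback address is permitted. Two sockets try to claim the same address and port, and the second attempt is expected to fail.

```python
import socket

first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
first.listen()
port = first.getsockname()[1]  # the port the OS actually assigned

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))  # same address and port as `first`
    collided = False
except OSError:
    collided = True  # typically "address already in use"
finally:
    second.close()
    first.close()
```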
What is TCP?
Network protocol built on top of IP
Allows for ordered, reliable data delivery between machines over the internet by creating a connection
TCP is usually implemented in the kernel, which exposes sockets to applications that they can use to stream data through an open connection
What is HTTP?
HyperText Transfer Protocol
Common network protocol implemented on top of TCP
What is an IP network packet?
Data being sent over IP
Consists of:
- IP header: contains source and destination IP addresses, and other network-related information
- Payload: data being sent
What are application servers?
Servers that run web apps
What is a forward proxy?
A server that sits between a client and server and acts on behalf of the client
Typically used to mask the client’s identity (IP address)
What is a reverse proxy?
A server that sits between clients and servers and acts on behalf of the servers
Typically used for logging, load balancing and caching
eg. Nginx is a popular web server often used as a reverse proxy and load balancer
What is an API gateway?
Single entry point into the system
Handles all client requests, taking care of authorisation/authentication, sanitising input data and other necessary tasks before providing access to application resources
What is the HTTP pull mechanism?
Default mode of HTTP communication
The clients pull data from the server whenever required - every request/response consumes bandwidth and every hit on the server costs the business money and adds more load on the server
What is the HTTP push mechanism?
The client sends a request for particular information to the server once; after the initial request, the server keeps pushing new updates to the client whenever they are available
eg. web sockets, message queues, streaming over HTTP
What is scalability?
Ability of an application to handle and withstand increased workload without sacrificing latency
In terms of Big O notation, response time should ideally be O(1) (constant time) with respect to the number of concurrent users
ie. if an app takes x seconds to respond to a user request, then it should take the same x seconds to respond to each of many concurrent user requests on the app
What is latency?
The time it takes for a certain operation to complete
Latency is generally divided into:
- Network latency
- Application latency
What is throughput?
The number of operations that a system can handle properly per time unit
eg. the throughput of a server can be measured in requests per second
What are some orders of magnitude in terms of latency?
- Reading 1MB from RAM = 0.25 ms
- Reading 1MB from SSD = 1 ms
- Reading 1MB from HDD = 20 ms
- Transfer 1MB over network = 10 ms
- Inter-continental round trip = 150 ms
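A back-of-the-envelope sketch using the figures above, assuming cost scales linearly with data size (a simplification that ignores seek times, batching and caching). LATENCY_MS_PER_MB and time_to_move_ms are hypothetical names.

```python
# Per-megabyte costs in milliseconds, taken from the card above.
LATENCY_MS_PER_MB = {"RAM": 0.25, "SSD": 1, "HDD": 20, "network": 10}


def time_to_move_ms(medium: str, megabytes: float) -> float:
    """Estimate the time in ms to read or transfer `megabytes` via `medium`."""
    return LATENCY_MS_PER_MB[medium] * megabytes
```

For example, reading 1000MB would take roughly 250 ms from RAM but about 20 seconds from HDD under this model.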
What is network latency?
Amount of time that the network takes to send a data packet from point A to point B
The network should have enough capacity to handle increased traffic load
What strategies can be employed to reduce network latency?
- Use CDN
- Deploy servers across the globe as close to the end user as possible
What is application latency?
Amount of time the application takes to process a user request
What strategies can be employed to reduce application latency?
- Run load/stress tests to scan for bottlenecks
What are the different ways of scaling an application?
- Vertical scaling
- Horizontal scaling
What is a persistent connection?
A network connection that exists between client and server that remains open for further requests and responses as opposed to being closed after a single communication
The connection stays open with the help of heartbeat interceptors
What is a heartbeat interceptor?
Blank requests/responses exchanged between client and server to prevent the browser from closing the connection once the TTL expires
Persistent connection: pros
- Prevent browser from closing the connection after the TTL time
- When frequency of request/response is high, a persistent connection averts the need to open and close a new connection
Persistent connection: cons
- Open connections consume resources
- There is a limit to the number of open connections a server can handle at once
- If connections don’t close and new ones are introduced, then the server will run out of memory over time
What are web sockets?
Network protocol that allows two-way communication session between the client and server
A type of HTTP push based mechanism
Preferred when a persistent bi-directional, low latency data flow is needed
eg. chat apps, real-time social streams, bitcoin trading sites, browser-based multiplayer games
What is long polling?
Technique to emulate a realtime server push feature
In long polling, the client requests are long-lived and the connection hangs until the server responds with data or a timeout threshold is reached
Reduces bandwidth consumption compared to polling as fewer client requests are sent
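The "hang until data or timeout" behaviour can be sketched with an in-process queue standing in for the server's pending updates. The updates queue and the long_poll helper are hypothetical; a real implementation would hold an HTTP request open instead.

```python
import queue

updates = queue.Queue()  # hypothetical store of updates waiting to be delivered


def long_poll(timeout_s: float) -> dict:
    """Hold the 'request' open until an update arrives or the timeout is reached."""
    try:
        # Blocks for up to timeout_s seconds waiting for data.
        return {"status": "data", "payload": updates.get(timeout=timeout_s)}
    except queue.Empty:
        # Nothing arrived in time; the client would normally re-issue the request.
        return {"status": "timeout", "payload": None}
```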
What are server-sent events? What are some use cases?
The server automatically pushes data to the client whenever updates are available
Reduces a huge amount of blank request-response cycles
Uni-directional data flow from server to client as opposed to web sockets having bi-directional data flow
eg. Twitter feed, realtime notifications
Why do browser requests have a TTL?
Because open connections between client and server are resource-intensive and there is a limit to the number of connections a server can manage
What is the difference between client-side and server-side rendering?
Client-side rendering
- The rendering engine (a component on the browser) constructs the DOM tree, renders and paints the construction
- Used for dynamic and AJAX-based websites
Server-side rendering
- Rendering the UI on the backend server and then sending the rendered HTML page to the client
- Ensures faster rendering of the UI
- Great for SEO because web crawlers can easily read the generated content
- Used for static content (eg. WordPress blogs)
What is vertical scaling? What are some use cases?
Increasing the power of the hardware running the app (adding more power to the server)
eg. a server has 16GB of RAM and, to handle the increased load, you scale it up to 32GB of RAM
Use cases:
- Minimal, consistent traffic
- Internal app for the organisation
Vertical scaling: pros
- Simpler than horizontal scaling because we don’t need to touch the code or make any complex system configurations
Vertical scaling: cons
- Availability risk - the servers are powerful but few in number
- There is a limit to the capacity we can increase for a single server (ie. for a multi-storey building, you can’t infinitely add more floors but you can build more buildings)
- Unable to dynamically scale in realtime as it requires pre-planning
What is horizontal scaling? What are some use cases?
Adding more hardware to the existing resource pool to add more computational power to the system as a whole
Use cases:
- Social network app where traffic is expected to spike significantly
Horizontal scaling: pros
- No limit to augmenting the hardware capacity
- Servers can be set up across the globe so data can be replicated across different geographical regions
- Able to dynamically scale in realtime as traffic increases/decreases
Horizontal scaling: cons
- Takes more administrative, monitoring and management effort to manage a distributed environment
- Code needs to change - classes can’t hold application data in static instances, because if a particular server goes down, all of its static data is lost and the app is left in an inconsistent state (this is why functional programming became popular with distributed systems, as functions don’t retain any state)
What does cloud elasticity mean?
Ability to add/remove server nodes dynamically
Indicates the stretching and returning to the original infrastructural computational capacity
What are the primary bottlenecks that hurt scalability?
- Database is a single monolith
  - Partitioning/sharding the db can make the app more efficient
- Database containing business logic
  - Unnecessary load on the db
  - Whole app becomes tightly coupled
- Not picking the right database
- Application architecture
- Not using caching wisely
- Inefficient setup of load balancers
  - Too many or too few load balancers hurts latency
- Inefficient code
  - Tightly coupled code
  - Unnecessary/nested loops
What strategies can be employed to fine tune the performance of an app?
- Profiling
- Caching
- CDN
- Data compression
- Avoid unnecessary client-server requests
How is the app’s performance and scalability related?
Performance of an app is directly related to scalability - if an app does not scale well, the increase in traffic will reduce the app’s performance significantly
What is availability?
The odds of a server/service being up and running at any point in time
Usually measured in percentages
ie. a server that has 99% availability will be operational 99% of the time
What are nines in terms of availability?
Refers to percentage of uptime
Expected downtimes per year for uptime percentages:
- 99% (two nines): ~87.6 hrs
- 99.9% (three nines): ~8.8 hrs
- 99.99% (four nines): ~52.6 mins
- 99.999% (five nines): ~5.3 mins
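The downtime figures follow from simple arithmetic on the hours in a year. A minimal sketch, assuming a 365-day year; downtime_hours_per_year is a hypothetical helper name.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected downtime per year, in hours, for a given uptime percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)
```

Multiplying the result by 60 gives minutes, which is the more readable unit for four and five nines.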
What is meant by high availability?
Systems that typically have five nines of availability or better
What are some common reasons for failing systems?
- Software crashes
  - OS crashing
  - Memory-hogging unresponsive processes
  - Overloaded CPU and RAM
- Hardware failures
  - Hard disk failures
- Human errors
  - Flawed config
- Planned downtime
  - Routine maintenance
  - Patching software/hardware upgrades
What strategies can be employed to achieve high availability?
- Fault tolerance
- Microservices
- Redundancy
- Replication
What is fault tolerance?
A system’s ability to stay up despite taking hits
What is redundancy?
The process of replicating parts of a system to make it more reliable
Ensures the system as a whole remains unimpacted
What is replication?
The act of duplicating data from one database to others
Uses:
- Increase system redundancy
- Tolerate regional failures
- Move data closer to clients to decrease latency
What is sharding or data partitioning?
The act of splitting a database into two or more pieces (shards) based on different strategies
Used to increase throughput of the db
What are some sharding strategies?
- Based on client’s region
- Based on type of data being stored (eg. payments data in one shard, user data in another shard)
- Based on the hash of a column (for structured data)
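A minimal sketch of the hash-based strategy, assuming the shard is chosen by hashing a key column and taking the result modulo the number of shards. shard_for is a hypothetical helper; hashlib is used instead of the built-in hash() because the latter is randomised per process for strings.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key (eg. a user id) to a shard index in the range [0, num_shards)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo hashing moves most keys when the shard count changes; techniques such as consistent hashing are often used to limit that.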
What is a CDN?
Content Delivery Network
Third-party service that acts like a cache for your servers
A CDN has servers across the globe so the latency to a CDN will almost always be better than the latency to your servers
What is a message queue?
Queue that routes messages from the source to the destination
Follows FIFO policy
Use cases:
- Notification systems
What are the different patterns for message queues?
- Pub/sub model
- Point-to-point model
Message queue: pros
- Facilitates async behaviour
- Lets modules communicate with each other in the background
  - This is key in service-oriented and microservices architecture
- Provide temporary storage of messages until they are consumed
What is the pub/sub model in a message queue?
Model where multiple subscribers receive the same message sent from a publisher
One-to-many relationship - one producer, multiple consumers
What is the point-to-point model in a message queue?
Model where the message from one producer is consumed by only one consumer
One-to-one relationship - one producer, one consumer
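The two patterns can be contrasted with a small in-memory sketch. PointToPointQueue and Topic are hypothetical classes; real message brokers add persistence, acknowledgements and delivery guarantees on top of this idea.

```python
from collections import deque


class PointToPointQueue:
    """Each message is delivered to exactly one consumer, in FIFO order."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None


class Topic:
    """Pub/sub: every subscriber receives its own copy of each published message."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self) -> deque:
        """Register a subscriber and return its inbox."""
        inbox = deque()
        self._subscribers.append(inbox)
        return inbox

    def publish(self, message):
        for inbox in self._subscribers:
            inbox.append(message)
```

In the queue, a received message is gone for everyone else; in the topic, each subscriber's inbox fills independently.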
Why is a pull-based approach not ideal in implementing the user notifications feature?
Because a pull-based approach is resource intensive - application servers have to deal with unnecessary requests, and updates do not reach the user in realtime