System Design Flashcards
What questions would you ask before starting your design?
- Are we focusing on the backend only, or are we developing the frontend too?
- What are we storing (images, videos, text)?
- Do we need to search?
- What scale is expected from the system?
- How much storage?
- What network bandwidth is needed?
- What APIs are expected, and what are example inputs and outputs?
- What kind of database will be used?
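The scale, storage, and bandwidth questions above are usually answered with a back-of-envelope estimate. The following is a minimal sketch; the function name `estimate_capacity` and every number in it (daily uploads, object size, retention, read/write ratio) are made-up assumptions for illustration, not figures from these notes.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_capacity(uploads_per_day: int, avg_object_bytes: int,
                      retention_days: int, read_write_ratio: int) -> dict:
    """Rough capacity numbers derived from a handful of assumed inputs."""
    writes_per_sec = uploads_per_day / SECONDS_PER_DAY
    reads_per_sec = writes_per_sec * read_write_ratio
    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": reads_per_sec,
        "storage_bytes": uploads_per_day * avg_object_bytes * retention_days,
        "ingress_bytes_per_sec": writes_per_sec * avg_object_bytes,
        "egress_bytes_per_sec": reads_per_sec * avg_object_bytes,
    }


# Hypothetical workload: 864,000 uploads per day of 200 KB each,
# kept for 365 days, with 10 reads for every write.
estimate = estimate_capacity(864_000, 200_000, 365, 10)
```

With these assumed inputs the estimate works out to 10 writes per second, 100 reads per second, and roughly 63 TB of storage per year of retained data.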
What components would you use in a block diagram?
- Client
- Load Balancer or Reverse Proxies
- Application Server(s)
- Database
- File Storage
What are some bottlenecks to consider when designing your architecture?
- Are there single points of failure, and how can they be mitigated?
- Is there enough data replication?
- Are there enough copies of services?
- How will service performance be monitored, and what alerts are needed?
What are the key characteristics of distributed systems?
- Scalability
- Reliability
- Availability
- Efficiency
- Serviceability or manageability
What are the benefits of load balancing?
- Faster uninterrupted service
- Less downtime and higher throughput
- Easier to handle incoming requests
- Fewer failed or stressed components
- Predictive analytics: some smart load balancers can identify traffic bottlenecks before they happen
How does the load balancer choose the backend server?
- First, ensure that the candidate servers are actually responding appropriately to requests (health checks)
- Then use a pre-configured algorithm to select one from the set of healthy servers
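The two steps above can be sketched as a filter followed by a selection. This is a minimal illustration, not how any particular load balancer is implemented; the `Backend` class, the `choose_backend` function, and the server names are hypothetical, and the `healthy` flag stands in for the result of real periodic health checks.

```python
import random
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    healthy: bool = True  # assumed to be updated by periodic health checks


def choose_backend(backends, pick=random.choice):
    """Drop unhealthy servers, then apply the selection algorithm `pick`."""
    healthy = [b for b in backends if b.healthy]
    if not healthy:
        raise RuntimeError("no healthy backends available")
    return pick(healthy)


# Hypothetical pool in which app-2 is currently failing its health check.
pool = [Backend("app-1"), Backend("app-2", healthy=False), Backend("app-3")]
chosen = choose_backend(pool)
```

Passing the algorithm in as `pick` keeps the health filter independent of the balancing method, so random choice can be swapped for any other method without touching the filter.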
What are load balancing methods?
- Least Connection Method
- Least Response Time Method
- Least Bandwidth Method
- Round Robin Method
- Weighted Round Robin Method
- IP Hash: a hash of the IP address of the client is calculated to redirect the request to a server
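A few of the methods listed above can be sketched in a handful of lines. This is a simplified illustration under stated assumptions: the server names and weights are hypothetical, the weighted variant naively repeats servers in the cycle (production balancers typically interleave them more smoothly), and the connection counts are passed in rather than measured.

```python
import hashlib
import itertools

servers = ["app-1", "app-2", "app-3"]  # hypothetical server names

# Round robin: cycle through the servers in order.
round_robin = itertools.cycle(servers)

# Weighted round robin (naive): a server with weight 2 appears twice per cycle.
weights = {"app-1": 2, "app-2": 1, "app-3": 1}
weighted_round_robin = itertools.cycle(
    [name for name, weight in weights.items() for _ in range(weight)]
)


def least_connections(active_connections: dict) -> str:
    """Pick the server currently handling the fewest connections."""
    return min(active_connections, key=active_connections.get)


def ip_hash(client_ip: str) -> str:
    """Hash the client IP so the same client always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

For example, `next(round_robin)` yields `app-1`, `app-2`, `app-3`, and then `app-1` again, while `ip_hash` returns the same server for a given address as long as the server list does not change.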
List the types of caches in system architecture
- Application server: Placing a cache directly on a request layer node enables the local storage of response data
- Content Distribution Network (CDN): a kind of cache that comes into play for sites serving large amounts of static media
What are the cache invalidation schemes?
If the data is modified in the database, it should be invalidated in the cache; if not, this can cause inconsistent application behavior.
- Write-through cache: data is written into the cache and the corresponding database at the same time
- Write-around cache: data is written directly to permanent storage, bypassing the cache
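- Write-back cache: data is written to the cache alone and completion is immediately confirmed to the client. The write to permanent storage happens later, so data can be lost if the cache fails before then.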
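The three schemes differ only in which store is written and when. A toy sketch, assuming plain dictionaries for both the cache and the database; the `Store` class and its method names are hypothetical and chosen to mirror the terms above.

```python
class Store:
    """Toy cache in front of a toy database (both plain dicts)."""

    def __init__(self):
        self.cache = {}
        self.db = {}
        self.dirty = set()  # keys written to the cache but not yet to the db

    def write_through(self, key, value):
        # Cache and database are both updated before the write is confirmed.
        self.cache[key] = value
        self.db[key] = value
        self.dirty.discard(key)

    def write_around(self, key, value):
        # Only the database is written; any stale cached copy is dropped.
        self.db[key] = value
        self.cache.pop(key, None)
        self.dirty.discard(key)

    def write_back(self, key, value):
        # Only the cache is written now; the database catches up on flush().
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Persist all pending write-back entries.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

In this sketch the risk of write-back is visible directly: anything still in `dirty` exists only in the cache until `flush()` runs.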
What is database partitioning?
RDBMS (SQL)
- Partitioning is the database process where very large tables are divided into multiple smaller parts. By splitting a large table into smaller, individual tables, queries that access only a fraction of the data can run faster because there is less data to scan.
- Vertical partitioning involves creating tables with fewer columns and using additional tables to store the remaining columns.
- Horizontal Partitioning (sharding) stores rows of a table in multiple database clusters.
- Sharding distributes data across multiple machines, which allows for cluster computing (distributed computing). Sharding is needed if a data set is too large to be stored in a single DB. Most importantly, sharding allows a DB to scale in line with its data growth. It also reduces table size (and, more specifically, index size), which improves search performance.
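Both kinds of partitioning can be illustrated on in-memory rows. This is a rough sketch under assumptions: the shard count, the `id` primary-key column, and the function names `shard_for` and `split_vertically` are all hypothetical, and hash-based routing is only one of several possible sharding strategies.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(user_id: int) -> int:
    """Horizontal partitioning: route a row to a shard by hashing its key."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS


def split_vertically(row: dict, hot_columns: set) -> tuple:
    """Vertical partitioning: split one wide row into two narrower rows
    that both carry the primary key `id`."""
    hot = {k: v for k, v in row.items() if k in hot_columns or k == "id"}
    cold = {k: v for k, v in row.items() if k not in hot_columns or k == "id"}
    return hot, cold


# Distribute 1000 hypothetical user ids across the shards.
shards = {i: [] for i in range(NUM_SHARDS)}
for user_id in range(1000):
    shards[shard_for(user_id)].append(user_id)
```

A query for a single user then touches only `shards[shard_for(user_id)]`, which is the "less data to scan" benefit described above.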
What are some common problems of DB data partitioning?
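- Joins and Denormalization: Performing joins on a database which is running on one server is straightforward, but once a database is partitioned and spread across multiple machines it is often not feasible to perform joins that span database partitions. A common workaround is to denormalize the data so that queries that previously required joins can be served from a single table.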
- Referential integrity: enforcing data integrity constraints such as foreign keys in a partitioned database can be extremely difficult.
- Rebalancing: partitions may need to be rebalanced when
  a) the data distribution is not uniform
  b) a single partition receives too much load
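Detecting the first rebalancing trigger can be as simple as comparing each partition against the mean. A minimal sketch, assuming row counts per partition are already available; the function name `needs_rebalancing`, the partition names, and the tolerance of 1.5 are arbitrary choices for illustration.

```python
def needs_rebalancing(rows_per_partition: dict, tolerance: float = 1.5) -> list:
    """Return the partitions holding more than `tolerance` times the mean row count."""
    mean = sum(rows_per_partition.values()) / len(rows_per_partition)
    return sorted(p for p, n in rows_per_partition.items() if n > tolerance * mean)


# Hypothetical row counts: p2 holds far more data than the others.
skewed = needs_rebalancing({"p0": 100, "p1": 110, "p2": 400})
```

The same check could be applied to requests per second instead of row counts to flag a partition that is overloaded rather than oversized.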
What is the goal of creating an index on a particular table in a database?
To make it faster to search through the table and find the row or rows that we want. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.
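The difference between scanning and using an index can be shown on a small in-memory table. This is only an analogy, assuming a sorted list searched by bisection in place of the tree structures real databases typically use; the table contents and function names are hypothetical.

```python
import bisect

rows = [
    {"id": 3, "email": "c@example.com"},
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]


def find_by_scan(email):
    """Without an index: check every row until a match is found."""
    return next((r for r in rows if r["email"] == email), None)


# The index: sorted (key, row position) pairs, searched by bisection.
email_index = sorted((r["email"], pos) for pos, r in enumerate(rows))
email_keys = [key for key, _ in email_index]


def find_by_index(email):
    """Random lookup: binary search on the sorted keys."""
    i = bisect.bisect_left(email_keys, email)
    if i < len(email_keys) and email_keys[i] == email:
        return rows[email_index[i][1]]
    return None


def range_by_index(low, high):
    """Ordered access: rows with low <= email <= high, in key order."""
    start = bisect.bisect_left(email_keys, low)
    stop = bisect.bisect_right(email_keys, high)
    return [rows[pos] for _, pos in email_index[start:stop]]
```

The index costs extra storage and must be kept in sync on every write, which is the usual trade-off for faster reads.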
What is a forward proxy?
A proxy server is an intermediate server between the client and the back-end server. Clients connect to proxy servers to make a request for a service like a web page, file, connection, etc. In short, a proxy server is a piece of software or hardware that acts as an intermediary for requests from clients seeking resources from other servers.
- Used to bypass firewall restrictions
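From the client's side, using a forward proxy usually means pointing the HTTP client at the proxy instead of at the destination. A minimal sketch with Python's standard `urllib.request`; the proxy address `proxy.example.internal:3128` is a made-up placeholder, and no request is actually sent here.

```python
import urllib.request

# Hypothetical proxy address; replace with a real proxy to send traffic through it.
PROXY_URL = "http://proxy.example.internal:3128"

proxy_handler = urllib.request.ProxyHandler({"http": PROXY_URL, "https": PROXY_URL})
opener = urllib.request.build_opener(proxy_handler)

# opener.open("http://example.com/") would now connect to the proxy,
# which fetches the page on the client's behalf.
```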
What are the proxy types?
- Open Proxy: accessible by any Internet user.
- Reverse Proxy: retrieves resources on behalf of a client from one or more servers.
What are the two well-known open proxy types?
- Anonymous Proxy: This proxy reveals its identity as a server but does not disclose the initial IP address. Though this proxy server can be discovered easily, it can be beneficial for some users as it hides their IP address.
- Transparent Proxy: This proxy server again identifies itself, and with the support of HTTP headers, the first IP address can be viewed. The main benefit of using this sort of server is its ability to cache the websites.
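The difference between the two types comes down to which headers the proxy adds to the upstream request. The sketch below assumes the common convention of a `Via` header for self-identification and an `X-Forwarded-For` header for the client address; the function name `forwarded_headers`, the proxy name, and the `kind` labels are hypothetical, and real proxies vary in which headers they send.

```python
def forwarded_headers(client_ip: str, proxy_name: str, kind: str) -> dict:
    """Headers a proxy might add to the upstream request.

    kind: "anonymous" or "transparent" (labels used in this sketch only).
    """
    headers = {"Via": f"1.1 {proxy_name}"}  # both kinds identify themselves
    if kind == "transparent":
        headers["X-Forwarded-For"] = client_ip  # original client IP is exposed
    elif kind != "anonymous":
        raise ValueError(f"unknown proxy kind: {kind}")
    return headers


anonymous = forwarded_headers("203.0.113.7", "proxy-1", "anonymous")
transparent = forwarded_headers("203.0.113.7", "proxy-1", "transparent")
```

Here the destination server can tell in both cases that a proxy is involved, but only the transparent variant lets it recover the client's address.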