Systems Design: Basics, Load Balancing, Caching Flashcards

1
Q

What are functional vs. non-functional requirements?

A

Functional requirements are the requirements that define what a system is supposed to do. They describe the various functions that the system must perform.

Non-functional requirements describe how the system performs a task, rather than what tasks it performs. They are related to the quality attributes of the system, such as performance, scalability, and availability.

2
Q

What kinds of estimations might you need to make in a systems design interview?

A

In system design interviews, there are several types of estimations you may need to make:

Load estimation: Predict the expected number of requests per second, data volume, or user traffic for the system.

Storage estimation: Estimate the amount of storage required to handle the data generated by the system.

Bandwidth estimation: Determine the network bandwidth needed to support the expected traffic and data transfer.

Latency estimation: Predict the response time and latency of the system based on its architecture and components.

Resource estimation: Estimate the number of servers, CPUs, or memory required to handle the load and maintain desired performance levels.

3
Q

Suppose you’re asked to design a social media platform with 100 million daily active users (DAU) and an average of 10 posts per user per day. To estimate the load, you’d calculate the total number of posts generated daily:

A

100 million DAU * 10 posts/user = 1 billion posts/day

1 billion posts/day / 86,400 seconds/day ≈ 11,574 requests/second
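
The arithmetic above can be sketched as a quick back-of-envelope script (the figures are the hypothetical ones from the card):

```python
# Back-of-envelope load estimation (hypothetical figures from the card).
dau = 100_000_000          # daily active users
posts_per_user = 10        # posts per user per day
seconds_per_day = 86_400

posts_per_day = dau * posts_per_user          # 1,000,000,000 posts/day
avg_rps = posts_per_day / seconds_per_day     # ~11,574 requests/second

print(f"{posts_per_day:,} posts/day, ~{avg_rps:,.0f} req/s average")
```

Note this is an average; peak traffic is often a small multiple of the average, so capacity planning usually budgets for peak, not mean.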

4
Q

Storage Estimation

Consider a photo-sharing app with 500 million users and an average of 2 photos uploaded per user per day. Each photo has an average size of 2 MB. To estimate the storage required for one day’s worth of photos, you’d calculate:

A

500 million users * 2 photos/user * 2 MB/photo = 2,000,000,000 MB/day ≈ 2 PB/day

5
Q

Bandwidth Estimation

For a video streaming service with 10 million users streaming 1080p videos at 4 Mbps, you can estimate the required bandwidth:

A

10 million users * 4 Mbps = 40,000,000 Mbps = 40 Tbps

6
Q

Latency Estimation

Suppose you’re designing an API that fetches data from multiple sources, and you know that the average latency for each source is 50 ms, 100 ms, and 200 ms, respectively. If the data fetching process is sequential, you can estimate the total latency as follows:

A

50 ms + 100 ms + 200 ms = 350 ms

If the data fetching process is parallel, the total latency would be the maximum latency among the sources:

max(50 ms, 100 ms, 200 ms) = 200 ms
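
Both cases reduce to a one-liner over the per-source latencies:

```python
# Sequential vs. parallel fan-out latency across three data sources.
latencies_ms = [50, 100, 200]

sequential = sum(latencies_ms)   # 350 ms: each fetch waits for the previous one
parallel = max(latencies_ms)     # 200 ms: bounded by the slowest source

print(f"sequential: {sequential} ms, parallel: {parallel} ms")
```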

7
Q

Resource Estimation

Imagine you’re designing a web application that receives 10,000 requests per second, with each request requiring 10 ms of CPU time. To estimate the number of CPU cores needed, you can calculate the total CPU time per second:

A

10,000 requests/second * 10 ms/request = 100,000 ms/second

Assuming each CPU core can handle 1,000 ms of processing per second, the number of cores required would be:

100,000 ms/second / 1,000 ms/core = 100 cores
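
The same calculation as a sketch (figures are the hypothetical ones from the card):

```python
# CPU core estimate: total CPU time demanded per second / capacity per core.
requests_per_second = 10_000
cpu_ms_per_request = 10
ms_per_core_per_second = 1_000   # one core provides ~1,000 ms of CPU time per second

total_cpu_ms = requests_per_second * cpu_ms_per_request   # 100,000 ms/second
cores = total_cpu_ms / ms_per_core_per_second             # 100 cores

print(f"{cores:.0f} cores at 100% utilization")
```

In practice you would target well under 100% utilization and add headroom for peaks, so the real provisioned number would be larger.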

8
Q

Estimation example: designing a messaging service.

A

Number of users: Estimate the total number of users for the platform. This can be based on market research, competitor analysis, or historical data.

Messages per user per day: Estimate the average number of messages sent by each user per day. This can be based on user behavior patterns or industry benchmarks.

Message size: Estimate the average size of a message, considering text, images, videos, and other media content.

Storage requirements: Calculate the total storage needed to store messages for a specified retention period, taking into account the number of users, messages per user, message size, and data redundancy.

Bandwidth requirements: Estimate the bandwidth needed to handle the message traffic between users, considering the number of users, messages per user, and message size.

9
Q

Designing a video streaming platform

A

Number of users: Estimate the total number of users for the platform based on market research, competitor analysis, or historical data.

Concurrent users: Estimate the number of users who will be streaming videos simultaneously during peak hours.

Video size and bitrate: Estimate the average size and bitrate of videos on the platform, considering various resolutions and encoding formats.

Storage requirements: Calculate the total storage needed to store the video content, taking into account the number of videos, their sizes, and data redundancy.

Bandwidth requirements: Estimate the bandwidth needed to handle the video streaming traffic, considering the number of concurrent users, video bitrates, and user locations.

10
Q

When designing a large system, what things do you need to consider?

https://www.designgurus.io/course-play/grokking-the-system-design-interview/doc/system-design-basics

A
  • What are the different architectural pieces that can be used?
  • How do these pieces work with each other?
  • How can we best utilize these pieces: what are the right tradeoffs?
11
Q

What are the key characteristics of distributed systems?

A

Scalability, Reliability, Availability, Efficiency, and Manageability

12
Q

What is scalability?

A

Scalability is the capability of a system, process, or network to grow and manage increased demand. Any distributed system that can continuously evolve to support a growing amount of work is considered scalable.

13
Q

What is reliability?

A

Reliability refers to the ability of a system to continue operating correctly and effectively in the presence of faults, errors, or failures. In simple terms, a distributed system is considered reliable if it keeps delivering its services even when one or several of its software or hardware components fail.

A related concept is Fault Tolerance, which is the system’s ability to continue operating (possibly at a reduced level) even when one or more of its components fail. In other words, it is the property that allows a system to absorb or recover from faults without total breakdown.

14
Q

Reliability vs Fault Tolerance

A

Scope:

Reliability focuses on the end-to-end correctness and consistency of the entire system’s operation over time.
Fault tolerance focuses on the system’s ability to continue operating when individual components fail.

Perspective:

Reliability is primarily a user-centric concept: Can the system consistently meet the user’s expectations over time?
Fault tolerance is more of a system-centric concept: How does the system handle internal failures or component breakdowns?

Measurement:

Reliability is often measured in terms of uptime, error rates, or mean time between failures (MTBF).
Fault tolerance is often measured by how quickly and effectively the system detects, isolates, and recovers from failures (e.g., failover times).

15
Q

What is efficiency?

A

Two standard measures of a distributed system's efficiency are response time (or latency), the delay to obtain the first item, and throughput (or bandwidth), the number of items delivered in a given unit of time (e.g., a second).

These correspond to the following two unit costs:
* Number of messages globally sent by the nodes of the system, regardless of message size.
* Size of messages, representing the volume of data exchanged.

16
Q

What is availability?

A

By definition, availability is the proportion of time a system remains operational to perform its required function over a specific period. It is often expressed as a percentage of uptime (e.g., 99.9%).

17
Q

Serviceability or Manageability

A

Serviceability or manageability is the simplicity and speed with which a system can be repaired or maintained.

18
Q

What layer does AWS’s ALB operate on and what is its use case?

A

Layer 7 (the Application Layer of the OSI model). Designed for HTTP and WebSocket traffic.

19
Q

What is AWS’s Elastic Load Balancer?

A

Elastic Load Balancer (ELB)
This is the umbrella term for AWS’s load balancing service, which includes the Application Load Balancer (ALB), Network Load Balancer (NLB), and Gateway Load Balancer (GWLB). Initially, it referred to the Classic Load Balancer (CLB), which is now deprecated for new deployments.

20
Q

What layer does the Network Load Balancer work at and what are its use cases?

A

Layer: Operates at Layer 4 (Transport Layer of the OSI model).

Use Case: Designed for TCP/UDP and TLS traffic with ultra-high performance and low latency requirements.

21
Q

What is OSI and what are its seven layers?

A

OSI (Open Systems Interconnection) model:
Layer 7: Application Layer
Layer 6: Presentation Layer
Layer 5: Session Layer
Layer 4: Transport Layer
Layer 3: Network Layer
Layer 2: Data Link Layer
Layer 1: Physical Layer

22
Q

What are the different algorithms load balancers can use to determine where to direct traffic?

A

Least Connection Method
Least Response Time Method
Least Bandwidth Method (measured in Mbps)
Round Robin Method
Weighted Round Robin Method
IP Hash

https://www.designgurus.io/course-play/grokking-the-system-design-interview/doc/load-balancing
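
Two of the methods above can be sketched in a few lines (the `Server` class and server names here are hypothetical, purely for illustration):

```python
import itertools

class Server:
    """Hypothetical backend server tracked by the load balancer."""
    def __init__(self, name):
        self.name = name
        self.active_connections = 0

servers = [Server("a"), Server("b"), Server("c")]

# Round Robin: hand out servers in a fixed rotating order.
rr = itertools.cycle(servers)
def round_robin():
    return next(rr)

# Least Connection: pick the server with the fewest active connections.
def least_connections():
    return min(servers, key=lambda s: s.active_connections)
```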

23
Q

What principle do caches make use of?

A

Locality of reference principle: recently requested data is likely to be requested again.

24
Q

What kinds of cache implementations are there?

A

In-memory caching: stores data in the main memory of the computer, which is faster to access than disk storage.

Disk caching: stores data on the hard disk, which is slower than main memory but faster than retrieving data from a remote source.

Database caching: stores frequently accessed data in the database itself, reducing the need to access external storage.

CDN caching: stores data on a distributed network of servers, reducing the latency of accessing data from remote locations.

https://www.designgurus.io/course-play/grokking-the-system-design-interview/doc/caching

25
Q

What is a cache?

A

A temporary storage location for data or computation results, typically designed for fast access and retrieval.

26
Q

What is a Cache hit?

A

When a requested data item or computation result is found in the cache.

27
Q

What is a cache miss?

A

When a requested data item or computation result is not found in the cache and needs to be fetched from the original data source or recalculated.

28
Q

What is cache eviction?

A

The process of removing data from the cache, typically to make room for new data or based on a predefined cache eviction policy.

29
Q

What is cache staleness?

A

When the data in the cache is outdated compared to the original data source.

30
Q

What is in-memory caching commonly used for and how might it be implemented?

A

Caching API responses, session data, and web page fragments using a cache library like Memcached or Redis, or implementing custom caching logic within the application code

https://www.designgurus.io/course-play/grokking-the-system-design-interview/doc/caching

31
Q

Web server vs app server

A

A web server serves static content (HTML, CSS, JavaScript, images) and typically handles cross-cutting HTTP concerns such as reverse proxying, caching, TLS termination, and load balancing. An application server runs the business logic and generates dynamic content, usually sitting behind the web server.

32
Q

What is disk caching and why would you use it?

A

Disk caching is useful for data that is too large to fit in memory or for data that needs to persist between application restarts. This type of caching is commonly used for caching database queries and file system data.

33
Q

What is server side caching? What is client side caching? What’s an example of both?

A

Server side caching occurs on a server and can include full-page caching, fragment caching, query result caching, precomputed results, and object caching.

Client-side caching occurs in the client, such as a phone application or browser. It stores frequently accessed data, such as images, CSS, or JavaScript files, to reduce the need for repeated requests to the server. Examples of client-side caching include browser caching and local storage.

34
Q

What is CDN caching and what is it used for?

A

CDN caching stores data on a distributed network of servers, reducing the latency of accessing data from remote locations. This type of caching is useful for data that is accessed from multiple locations around the world, such as images, videos, and other static assets. CDN caching is commonly used for content delivery networks and large-scale web applications.

35
Q

What is DNS caching and what is it used for?

A

DNS cache is a type of cache used in the Domain Name System (DNS) to store the results of DNS queries for a period of time (the record’s TTL). It allows repeated lookups to return IP addresses quickly without querying upstream DNS servers again.

36
Q

What are the three primary caching strategies?

A
  1. Write-through cache: Under this scheme, data is written into the cache and the corresponding database simultaneously.
  2. Write-around cache: This technique is similar to write-through cache, but data is written directly to permanent storage, bypassing the cache.
  3. Write-back cache: Under this scheme, data is written to cache alone, and completion is immediately confirmed to the client. The write to the permanent storage is done after specified intervals or under certain conditions.
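
A minimal sketch contrasting write-through and write-back, using plain dicts as stand-ins for the cache and the backing store (all names here are illustrative):

```python
cache, storage = {}, {}
dirty = set()   # keys written to the cache but not yet flushed (write-back)

def write_through(key, value):
    cache[key] = value
    storage[key] = value      # both writes complete before acknowledging

def write_back(key, value):
    cache[key] = value        # acknowledge immediately; storage lags behind
    dirty.add(key)

def flush():
    """Persist dirty entries; in a real system this runs periodically."""
    for key in dirty:
        storage[key] = cache[key]
    dirty.clear()
```

The risk of write-back is visible in the sketch: until `flush()` runs, a crash loses everything in `dirty`.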
37
Q

What are the tradeoffs with write-through caching strategy?

A

Complete data consistency between the cache and the storage. Also, this scheme ensures that nothing will get lost in case of a crash, power failure, or other system disruptions.

Since every write operation must be done twice before returning success to the client, this scheme has the disadvantage of higher latency for write operations.

38
Q

What are the trade-offs with write around caching strategy?

A

It reduces the cache being flooded with write operations that will not subsequently be re-read, but has the disadvantage that a read request for recently written data will create a “cache miss” and must be read from slower back-end storage and experience higher latency.

39
Q

What are the tradeoffs with write-back caching strategy?

A

It results in low latency and high throughput for write-intensive applications; however, this speed comes with the risk of data loss in case of a crash or other adverse event, because the only copy of the written data is in the cache until it is synced to the database.

40
Q

What are the main cache invalidation methods?

A
  1. Purge
  2. Refresh
  3. Ban
  4. TTL
  5. Stale-while-revalidate
41
Q

Describe Purge cache invalidation method.

A

The purge method removes cached content for a specific object, URL, or a set of URLs. It’s typically used when there is an update or change to the content and the cached version is no longer valid. When a purge request is received, the cached content is immediately removed, and the next request for the content will be served directly from the origin server.

Appropriate when invalid cache items need to be removed immediately, for example when a product price is updated.

42
Q

Describe the refresh cache invalidation method.

A

Fetches requested content from the origin server, even if cached content is available. When a refresh request is received, the cached content is updated with the latest version from the origin server, ensuring that the content is up-to-date. Unlike a purge, a refresh request doesn’t remove the existing cached content; instead, it updates it with the latest version.

43
Q

Describe ban cache invalidation.

A

The ban method invalidates cached content based on specific criteria, such as a URL pattern or header. When a ban request is received, any cached content that matches the specified criteria is immediately removed, and subsequent requests for the content will be served directly from the origin server.

This is usually initiated by external events or triggers.

44
Q

Describe TTL cache invalidation.

A

This method involves setting a time-to-live value for cached content, after which the content is considered stale and must be refreshed. When a request is received for the content, the cache checks the time-to-live value and serves the cached content only if the value hasn’t expired. If the value has expired, the cache fetches the latest version of the content from the origin server and caches it.
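
The TTL check can be sketched as a minimal in-process cache (names are illustrative; a real cache would also re-fetch from the origin on expiry):

```python
import time

cache = {}  # key -> (value, expires_at)

def put(key, value, ttl_seconds):
    cache[key] = (value, time.monotonic() + ttl_seconds)

def get(key):
    entry = cache.get(key)
    if entry is None:
        return None               # cache miss
    value, expires_at = entry
    if time.monotonic() >= expires_at:
        del cache[key]            # stale: treat as a miss
        return None
    return value
```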

45
Q

Describe stale-while-revalidate cache invalidation.

A

This method is used in web browsers and CDNs to serve stale content from the cache while the content is being updated in the background. When a request is received for a piece of content, the cached version is immediately served to the user, and an asynchronous request is made to the origin server to fetch the latest version of the content. Once the latest version is available, the cached version is updated. This method ensures that the user is always served content quickly, even if the cached version is slightly outdated.

46
Q

What are two cache read strategies?

A

Read-aside (cache-aside) and read-through caching.

47
Q

Describe read aside caching.

A

Also known as lazy loading or cache-aside. The application (client) directly interacts with both the cache and the underlying data source.

On a cache miss:
The application retrieves the data from the underlying source (e.g., a database), then stores the fetched data in the cache for future use.

On a cache hit:
The application retrieves the data directly from the cache.

The cache is explicitly managed by the application.
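
The miss/hit flow can be sketched as follows (`fetch_from_db` is a hypothetical stand-in for the real data source):

```python
cache = {}

def fetch_from_db(key):
    """Hypothetical backing-store lookup."""
    return f"value-for-{key}"

def get(key):
    if key in cache:
        return cache[key]        # cache hit: serve directly from the cache
    value = fetch_from_db(key)   # cache miss: go to the source...
    cache[key] = value           # ...and populate the cache for next time
    return value
```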

48
Q

Describe the pros and cons of read aside caching.

A

Advantages:
Flexibility: The application has full control over caching logic.
Efficient Use of Cache: Only frequently accessed data is loaded into the cache.

Disadvantages:
Increased Complexity: The application must handle cache misses and updates explicitly.
Risk of Stale Data: Requires additional logic to handle cache invalidation or updates.

49
Q

Describe read through caching

A

The application interacts only with the cache.

On a cache miss the cache automatically retrieves the data from the underlying source.
The retrieved data is then stored in the cache for future use.

On a cache hit the data is served directly from the cache.
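
A minimal sketch, assuming the loader callback stands in for the real data source; note the application only ever talks to the cache:

```python
class ReadThroughCache:
    def __init__(self, loader):
        self._data = {}
        self._loader = loader    # called automatically on a miss

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._loader(key)   # the cache fills itself
        return self._data[key]

# The application never calls the loader directly:
cache = ReadThroughCache(loader=lambda key: f"value-for-{key}")
```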

50
Q

Analyze the pros and cons of read through caching

A

Advantages:
Simplified Code: The application doesn’t handle cache misses or source retrieval. All that logic is handled by the cache instead of the application.
Consistent Access: The cache handles all data interactions, ensuring centralized management.

Disadvantages:
Limited Control: The application has less flexibility in how the cache operates.
Higher Dependency: Relies heavily on the cache implementation for performance and correctness.

Use Cases:
Systems where simplicity and centralized cache management are preferred.
Frequently accessed data that benefits from automated retrieval and storage in the cache.

51
Q

List and describe cache eviction policies.

A

First In First Out (FIFO): The cache evicts the block that was added first (the oldest entry), without regard to how often or how many times it was accessed before.

Last In First Out (LIFO): The cache evicts the block that was added most recently, without regard to how often or how many times it was accessed before.

Least Recently Used (LRU): Discards the least recently used items first.

Most Recently Used (MRU): Discards, in contrast to LRU, the most recently used items first.

Least Frequently Used (LFU): Counts how often an item is needed. Those that are used least often are discarded first.

Random Replacement (RR): Randomly selects a candidate item and discards it to make space when necessary.
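
LRU, the most common of these policies, can be sketched with Python’s `OrderedDict` (a minimal illustration, not a production cache):

```python
from collections import OrderedDict

class LRUCache:
    """OrderedDict keeps keys in access order: least recently used at the front."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used key
```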

52
Q

Cache eviction vs cache invalidation

A

Invalidation typically requires explicit management: it is usually manual, triggered by specific events such as a data update.

Eviction is policy-based and automatic, driven by capacity limits.