System Design Flashcards
To tackle a system design problem you need two things…
- Basic knowledge of the fundamentals of good software design; that is, reliability, scalability, maintainability, and cost.
- Communication skills that allow you to scope out the problem, collect functional and non-functional requirements, justify your choices, and pivot when necessary.
System design framework
- Define your problem space (5-10 minutes)
- Design your system at a high level (5-10 minutes)
- Deep-dive (10-15 minutes)
- Identify bottlenecks and scale (10-15 minutes)
Examples of non-functional requirements
Speed, security, reliability, maintainability, and even cost.
ScREAM
Scalable, Reliable, Efficient, Available, Maintainable
Network Protocols
rules and standards for sending and receiving data over physical network infrastructure
The TCP/IP model has how many layers? What are they?
Four:
- The Network Access Layer (or Link Layer), which represents a local network of machines; the “hardware” layer.
- The Internet Layer, which describes the much larger network of devices interconnected by IP addresses according to the IP protocols (IPv4 or IPv6).
- The Transport Layer, which includes protocols for sending and receiving data via packets, e.g. TCP and UDP (see the sketch after this list).
- The Application Layer, which describes how data is sent to and from users over the internet, e.g. HTTP and HTTPS.
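As a concrete illustration of the transport layer, here is a minimal TCP echo exchange in Python. This is a sketch only: the loopback address and port are arbitrary choices, and a real service would add error handling.

```python
import socket

# Run the server in one process and the client in another.
# SOCK_STREAM selects TCP, the transport-layer protocol.

def run_server(port: int = 9090) -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", port))
        server.listen(1)
        conn, _addr = server.accept()   # TCP handshake completes here
        with conn:
            data = conn.recv(1024)      # reliable, ordered byte stream
            conn.sendall(data)          # echo the bytes back

def run_client(port: int = 9090) -> None:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        client.connect(("127.0.0.1", port))
        client.sendall(b"hello")
        print(client.recv(1024))        # b'hello'
```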
OSI (Open Systems Interconnection) Model layers
- Physical Layer
- Data Link Layer
- Network Layer
- Transport Layer
- Session Layer
- Presentation Layer
- Application Layer
Load Balancers
A type of server that distributes incoming web traffic across multiple backend servers. Load balancers are an important component of scalable Internet applications: they allow your application(s) to scale up or down with demand, achieve higher availability, and efficiently utilize server capacity.
You should use a load balancer whenever you think the system you’re designing would benefit from increased capacity or redundancy. Often load balancers sit right between external traffic and the application servers.
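A minimal sketch of the round-robin strategy many load balancers use; the backend URLs below are hypothetical, and real load balancers (e.g. NGINX, HAProxy) also handle health checks, TLS termination, and connection draining.

```python
import itertools

class RoundRobinBalancer:
    """Hands out backends in rotation so traffic spreads evenly."""

    def __init__(self, backends: list[str]) -> None:
        self._backends = itertools.cycle(backends)  # endless rotation

    def next_backend(self) -> str:
        return next(self._backends)

balancer = RoundRobinBalancer([
    "http://app-server-1:8080",  # hypothetical backend addresses
    "http://app-server-2:8080",
    "http://app-server-3:8080",
])

for _ in range(4):
    print(balancer.next_backend())  # 1, 2, 3, then back to 1
```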
CDN
A CDN can be thought of as a globally distributed group of servers that cache static assets for your origin server.
- Push CDN: developers manually push new assets to the CDN servers.
  - Pros: the CDN always has the latest assets.
  - Cons: time-consuming for developers.
- Pull CDN: the CDN caches assets on demand; if it doesn’t have an asset, it fetches it from the origin (see the sketch below).
  - Pros: easy to maintain; new assets are fetched automatically.
  - Cons: cached assets can become stale.
CDNs are good for reducing latency when serving STATIC assets to many geographically distributed clients.
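A toy sketch of the pull-CDN flow: serve from the edge cache when the asset is fresh, otherwise pull from the origin. The origin_fetch helper and the 300-second TTL are illustrative assumptions, not a real CDN API.

```python
import time

CACHE_TTL_SECONDS = 300  # assets older than this count as stale (assumed TTL)
cache: dict[str, tuple[float, bytes]] = {}  # path -> (fetched_at, body)

def origin_fetch(path: str) -> bytes:
    # Stand-in for an HTTP request to the origin server.
    return f"contents of {path}".encode()

def serve(path: str) -> bytes:
    entry = cache.get(path)
    if entry is not None:
        fetched_at, body = entry
        if time.time() - fetched_at < CACHE_TTL_SECONDS:
            return body                   # cache hit: no origin round-trip
    body = origin_fetch(path)             # miss or stale: pull from origin
    cache[path] = (time.time(), body)
    return body
```

Note how staleness is built into the model: between an origin update and TTL expiry, clients may still receive the old asset.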
CAP stands for
“Consistency”, “Availability”, and “Partition tolerance”.
Partition tolerance means
being able to keep the nodes in a distributed database running even when there are network partitions
network partition
a (temporary) network failure between nodes
Consistency vs Availability
Consistency is the property that after a write is sent to a database, all read requests sent to any node return that updated data. During a network partition, a database that prioritizes consistency would reject any write requests sent to its nodes. This ensures that the state of the data on all nodes stays the same.
In a database that prioritizes availability, it’s OK to have inconsistent data across the nodes: one node may contain stale data while another has the most updated data. Availability means that nodes prioritize successfully completing the requests sent to them, even during a partition. Available databases also tend to offer eventual consistency, which means that once the network partition is resolved, all nodes eventually sync with each other and hold the same, updated data.
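A toy contrast of the two priorities during a partition; the partitioned flag stands in for real failure detection, and real systems replicate writes to quorums rather than a single local value.

```python
class PartitionError(Exception):
    pass

class CPNode:
    """Prioritizes consistency: rejects writes it cannot replicate."""
    def __init__(self) -> None:
        self.value = None
        self.partitioned = False

    def write(self, value) -> None:
        if self.partitioned:
            raise PartitionError("rejecting write to stay consistent")
        self.value = value  # real systems confirm with a quorum first

class APNode:
    """Prioritizes availability: always answers, possibly with stale data."""
    def __init__(self) -> None:
        self.value = None
        self.partitioned = False

    def write(self, value) -> None:
        self.value = value  # accepted locally; synced with peers later
                            # (eventual consistency)
```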
When should Consistency or Availability be prioritized?
If you’re working with data that you know needs to be up-to-date, then it may be better to store it in a database that prioritizes consistency over availability. On the other hand, if it’s fine that the queried data can be slightly out-of-date, then storing it in an available database may be the better choice.
Why We Need Caching
Caches take advantage of a principle called locality to store data closer to where it is likely to be needed.
In large-scale Internet applications, caching makes data retrieval more efficient by reducing repeated computation, database queries, or requests to other services.
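A minimal illustration using Python’s functools.lru_cache as the cache; expensive_lookup is a hypothetical stand-in for a slow database query or remote call.

```python
import functools

@functools.lru_cache(maxsize=1024)
def expensive_lookup(user_id: int) -> str:
    # Imagine a slow database query or a request to another service here.
    return f"profile-for-user-{user_id}"

expensive_lookup(42)  # computed, then cached
expensive_lookup(42)  # served from the cache: no recomputation
```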