System Design 1 Flashcards
It means increasing the resources of a specific node (for example, adding memory or CPU to one server).
Vertical Scaling
It means increasing the number of nodes (for example, adding more servers).
Horizontal Scaling
It allows a system to distribute the load evenly so that one server doesn’t crash and take down the whole system.
Load Balancer
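As an illustration of spreading load evenly, here is a minimal sketch of a round-robin balancer. Round-robin is only one possible strategy, chosen here as an assumption; the class name `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation so each gets an equal share of requests."""

    def __init__(self, servers):
        # cycle() repeats the server list endlessly (server names are hypothetical).
        self._rotation = cycle(list(servers))

    def pick(self):
        # Return the next server in the rotation.
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.pick() for _ in range(6)]
```

Six requests land twice on each of the three servers. A real load balancer would typically also track server health, which this sketch leaves out.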
They can get very slow as the system grows bigger.
Joins in a relational (SQL) database.
It means splitting the data across multiple machines while ensuring you have a way of figuring out which data is on which machine.
Sharding (Data partitioning)
Common ways of partitioning:
- Vertical Partitioning
- Key-Based (or Hash-Based) Partitioning
- Directory-Based Partitioning
Partitioning by feature.
e.g. In a social network you have a table for profiles, one for messages
Vertical Partitioning
It uses some part of the data (e.g. an ID) to decide which partition the data goes to.
Key-based Partitioning.
A very simple way to do this is to allocate N servers and put the data on server mod(key, N).
Key-based Partitioning
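The mod(key, N) rule can be sketched as follows. This is a minimal illustration assuming integer keys; the function name `shard_for` and the sample IDs are hypothetical.

```python
def shard_for(key: int, num_servers: int) -> int:
    """Return the index of the server that stores `key`, using mod(key, N)."""
    if num_servers <= 0:
        raise ValueError("num_servers must be positive")
    return key % num_servers


NUM_SERVERS = 4  # N, assumed fixed for this example
placements = {user_id: shard_for(user_id, NUM_SERVERS) for user_id in (7, 8, 42, 100)}
```

With this rule, changing N changes the result for most keys, so adding a server usually means moving a large share of the data.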
In this scheme you maintain a lookup table for where the data can be found
Directory-Based Partitioning
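A lookup table of this kind might look like the sketch below. It assumes an in-memory dictionary; the class name `ShardDirectory`, the keys, and the server names are hypothetical.

```python
class ShardDirectory:
    """Lookup table that records which server holds each key."""

    def __init__(self):
        self._location = {}  # key -> server name

    def assign(self, key, server):
        # Record (or update) where the data for `key` lives.
        self._location[key] = server

    def locate(self, key):
        # Every read and write has to consult the table first.
        return self._location[key]


directory = ShardDirectory()
directory.assign("user:17", "server-a")
directory.assign("user:42", "server-b")
directory.assign("user:17", "server-c")  # moving data only needs a table update
```

Because placement is just a table entry, data can be moved between servers without changing any hashing rule.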
Two major drawbacks of Directory-Based Partitioning:
- The lookup table can be a single point of failure.
- Constantly accessing this table impacts performance.
It is a simple key-value pairing and typically sits between your application layer and the data store.
Caching.
How you cache
You might cache a query and its results directly. Alternatively, you can cache a specific object (for example, a rendered version of the website or a list of the most recent blog posts).
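Caching a query and its results directly could be sketched like this. It assumes an in-memory dictionary with no expiry; `QueryCache` and `slow_lookup` are hypothetical names, and `slow_lookup` merely stands in for a real data store.

```python
class QueryCache:
    """Key-value cache that sits between the application and the data store."""

    def __init__(self, run_query):
        self._run_query = run_query  # function that actually hits the data store
        self._store = {}             # query text -> cached result
        self.misses = 0

    def get(self, query):
        if query not in self._store:
            # Cache miss: go to the data store once and remember the result.
            self.misses += 1
            self._store[query] = self._run_query(query)
        return self._store[query]


def slow_lookup(query):
    # Stand-in for a real database call (hypothetical).
    return f"rows for: {query}"


cache = QueryCache(slow_lookup)
first = cache.get("SELECT * FROM posts")
second = cache.get("SELECT * FROM posts")  # served from the cache
```

The second call never reaches the data store. A production cache would also need an eviction or invalidation policy, which is omitted here.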
Most important metrics around networking
Bandwidth, throughput, latency
How slow operations should be
Asynchronous
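One way to run slow operations asynchronously is shown in this minimal sketch using Python's `asyncio`. The operation names and durations are hypothetical, and `asyncio.sleep` only simulates slow I/O.

```python
import asyncio


async def slow_operation(name: str, seconds: float) -> str:
    # asyncio.sleep stands in for slow I/O such as a network call.
    await asyncio.sleep(seconds)
    return f"{name} done"


async def handle_request() -> list:
    # Start both slow operations at once instead of waiting for each in turn.
    return await asyncio.gather(
        slow_operation("resize-image", 0.02),
        slow_operation("send-email", 0.01),
    )


results = asyncio.run(handle_request())
```

`asyncio.gather` returns the results in the order the operations were passed in, regardless of which one finishes first.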
It is the maximum amount of data that can be transferred in a unit of time. It is typically expressed in bits per second.
Bandwidth
It is the actual amount of data that is transferred in a unit of time.
Throughput
It is how long it takes data to go from one end to the other.
Latency
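To see how throughput and latency combine, here is a rough back-of-the-envelope estimate. It assumes a simplified model (one-way latency plus size divided by throughput) that ignores protocol overhead; the function name and the numbers are hypothetical.

```python
def transfer_time_seconds(size_bits: float, throughput_bps: float, latency_s: float) -> float:
    """Rough estimate: time for the first bit to arrive plus time to push all bits through."""
    if throughput_bps <= 0:
        raise ValueError("throughput_bps must be positive")
    return latency_s + size_bits / throughput_bps


# An 8-megabit file over a link with 2 Mbit/s of actual throughput and 50 ms latency.
estimate = transfer_time_seconds(8_000_000, 2_000_000, 0.05)
```

Under these assumptions the transfer takes about 4.05 seconds: 4 seconds limited by throughput and 0.05 seconds by latency.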
It is typically used to process large amounts of data.
A MapReduce program.
What does a MapReduce program require?
It requires you to write a Map step and a Reduce step.
Map takes in some data and emits a <key, value> pair.
Reduce takes a key and a set of associated values and “reduces” them, emitting a new key and value.
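The Map and Reduce steps can be illustrated with a word count. This is a single-process sketch, not a distributed framework; the function names `map_step`, `reduce_step`, and `run_mapreduce` are hypothetical, and the grouping step that a real system performs across machines is simulated with a dictionary.

```python
from collections import defaultdict


def map_step(document: str):
    # Map: emit a <word, 1> pair for every word in the input.
    for word in document.lower().split():
        yield word, 1


def reduce_step(key: str, values):
    # Reduce: collapse all values for one key into a single <key, total> pair.
    return key, sum(values)


def run_mapreduce(documents):
    # Group the emitted values by key (simulated locally with a dict).
    grouped = defaultdict(list)
    for document in documents:
        for key, value in map_step(document):
            grouped[key].append(value)
    return dict(reduce_step(key, values) for key, values in grouped.items())


counts = run_mapreduce(["the cat sat", "the dog sat down"])
```

Here `counts` maps each word to the number of times it appears across both documents.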
Considerations when designing a system.
- Failures
- Availability and Reliability
- Read-Heavy vs Write-Heavy
- Security