System Design - General Flashcards
Latency versus Throughput
Latency is the time needed to perform some action.
Throughput is the number of actions per unit time.
Generally, you want to maximize throughput for acceptable latency.
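As a rough illustration, the two are linked by how many requests are in flight at once (Little's Law); the numbers below are made up:

```python
# Illustrative arithmetic only; the numbers are hypothetical.
# Little's Law: throughput = concurrency / latency.
concurrency = 50      # requests being processed at the same time
latency_s = 0.100     # 100 ms average latency per request

throughput = concurrency / latency_s
print(f"~{throughput:.0f} requests/second")  # ~500 requests/second
```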
Performance Requirements
- Scale
- Latency
- Consistency
- Uptime
- Reliability (data loss scenarios)
- Fail open / Fail closed
Approach to System Design Interviews
- Clarify requirements
- Functional Requirements
- Performance Requirements
- APIs
- Back-of-the-envelope estimations (see the sketch after this list)
- Data Model
- High level design
- Individual component design
- Monitoring & Observability & Alerting
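A minimal sketch of the back-of-the-envelope step, with made-up traffic and size figures for a hypothetical photo-sharing service:

```python
# All numbers below are assumptions for illustration, not real data.
daily_active_users = 10_000_000
requests_per_user_per_day = 20
seconds_per_day = 86_400

avg_qps = daily_active_users * requests_per_user_per_day / seconds_per_day
peak_qps = avg_qps * 2  # common rule of thumb: peak ~= 2x average

photos_per_user_per_day = 2
photo_size_bytes = 500 * 1024  # assume ~500 KB per photo

daily_storage = daily_active_users * photos_per_user_per_day * photo_size_bytes

print(f"avg QPS ~{avg_qps:,.0f}, peak QPS ~{peak_qps:,.0f}")
print(f"new storage per day ~{daily_storage / 1e12:.1f} TB")
```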
CPU vs Core
A core is the processing unit inside a CPU. In the past, each CPU had a single core; now a single CPU can contain multiple cores.
CAP Theorem
Consistency, Availability, Partition Tolerance
You can only guarantee two of the three at the same time
Strong Consistency
Every read receives the most recent write or an error
Data is the same across the cluster so you can read and write to/from any node and get the same data
Availability
Ability to access the cluster even if a node in the cluster goes down
Every request receives a response but the response may not reflect the most recent data
Partition Tolerance
A partition means two nodes in the cluster are unable to communicate with one another
Partition tolerance means the system continues to operate even if there is a communication failure (a partition) between nodes due to network failures
Since networks aren’t reliable, you need to support partition tolerance. That means the tradeoff becomes between consistency and availability.
Consistency and partition tolerance
CP system
Waiting for a response from a partitioned node might result in a timeout or an error. Good choice if your business requires atomic reads and writes.
Consistency and Availability
CA system
Not possible unless you’re ok with losing data once the network failure (the partition) is resolved
Since network failures are inevitable, the real tradeoff is between consistency and availability
Availability and partition tolerance
The response returns the most readily available version of the data on any node, which might not reflect the most recent write. Good choice if you need eventual consistency or when the system needs to continue working despite errors.
Weak consistency
After a write, reads may or may not see it. A best effort approach is taken.
Examples: VoIP, video chat, realtime multiplayer games. If you lose reception for a few minutes, when you reconnect you don’t hear what you missed.
Eventual consistency
After a write, reads will eventually see it. Data is replicated asynchronously. Works well in highly available systems.
Examples: DNS, Email
Strong consistency
After a write, reads will see it. Data is replicated synchronously.
Examples: transaction based systems
High Availability Patterns
Fail-over and replication
These are complementary, not mutually exclusive
Active-passive failover
Also referred to as master-slave failover
If the active server stops sending heartbeats, the passive server takes over.
Only the active server handles traffic
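A minimal sketch of the heartbeat check on the passive node, assuming a simple timestamp-based timeout; the names and the 5-second threshold are illustrative assumptions, not a real failover protocol:

```python
import time

# Hypothetical names for illustration only.
HEARTBEAT_TIMEOUT = 5.0  # seconds of silence before declaring the active dead

last_heartbeat = time.monotonic()

def on_heartbeat() -> None:
    """Called whenever a heartbeat arrives from the active server."""
    global last_heartbeat
    last_heartbeat = time.monotonic()

def check_failover() -> None:
    """Run periodically; promotes this node if heartbeats have stopped."""
    if time.monotonic() - last_heartbeat > HEARTBEAT_TIMEOUT:
        promote_to_active()

def promote_to_active() -> None:
    print("No heartbeat from active server; passive node taking over")
```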
Active-active failover
Both servers are managing traffic with the load spread between them.
Also called master-master failover
99.9% uptime (three nines)
Just under 9 hours of downtime a year (~8h 46m)
99.99% uptime (four nines)
Just under 1 hour a year (~52.6 min)
99.999% uptime (five nines)
About 5 minutes per year (~5.3 min)
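The figures above are simple arithmetic: downtime per year = (1 − availability) × minutes per year.

```python
# Downtime-per-year arithmetic behind the "nines" figures above.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

for nines, availability in [(3, 0.999), (4, 0.9999), (5, 0.99999)]:
    downtime_min = (1 - availability) * MINUTES_PER_YEAR
    print(f"{nines} nines ({availability:.5%}): ~{downtime_min:.1f} minutes/year")
# 3 nines: ~525.6 min (~8h 46m), 4 nines: ~52.6 min, 5 nines: ~5.3 min
```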
Spanner
GCP's highly available, globally distributed SQL database
Technically CP rather than CA, but availability is so high (five nines; fewer than one failure in 10^5 reads or writes) that it is often described as "effectively CA".
In the event of partitions, Spanner becomes CP.
Vertical partitioning
- Some columns are moved to new tables. Each table contains the same number of rows but fewer columns
- Can be used to isolate rarely used columns
Horizontal partitioning
- Also known as sharding
- Divides a table into multiple smaller tables. Each shard is a separate data store; it contains the same columns but fewer rows
Horizontal Sharding Pros & Cons
Pros
- Facilitates horizontal scaling. Sharding makes it possible to add more machines to spread out the load.
- Reduces response time. By sharding one table into multiple tables, queries go over fewer rows, and results are returned much more quickly.
Cons
- The ORDER BY operation is more complicated. Usually, we need to fetch data from different shards and sort it in the application's code (see the merge sketch after this list).
- Uneven distribution. Some shards may contain more data or receive more traffic than others (the hotspot problem).
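A minimal sketch of that cross-shard merge, assuming each shard has already returned its rows sorted:

```python
import heapq

# Each shard returns its own sorted result; the application merges them.
shard_results = [
    [1, 4, 9],    # shard 0, already sorted by the per-shard query
    [2, 3, 10],   # shard 1
    [5, 6, 7],    # shard 2
]
merged = list(heapq.merge(*shard_results))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 9, 10]
```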
Sharding routing algorithms
- Range-based sharding. This algorithm uses ordered columns (integers, longs, timestamps) to separate the rows.
- Hash-based sharding. This algorithm applies a hash function to one column or several columns to decide which row goes to which table.
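A minimal sketch of both routing strategies; the shard count, choice of MD5, and range boundaries are arbitrary illustrative assumptions:

```python
import hashlib

NUM_SHARDS = 4

def hash_shard(key: str) -> int:
    """Hash-based routing: hash the key, then take it modulo the shard count."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def range_shard(user_id: int) -> int:
    """Range-based routing on an ordered column: 1M ids per shard (hypothetical)."""
    return min(user_id // 1_000_000, NUM_SHARDS - 1)

print(hash_shard("user:42"))      # some shard id in [0, 3]
print(range_shard(2_500_000))     # 2
```

Note that plain hash-mod routing remaps most keys whenever NUM_SHARDS changes; consistent hashing (last card below) is the standard fix.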
p99 latency
99% of requests complete faster than this value
Only 1% of requests are expected to be slower than this number
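A minimal sketch of computing p99 from raw latency samples using the nearest-rank method; the `percentile` helper and the sample values are made up for illustration:

```python
import math

def percentile(latencies_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: p% of samples fall at or below this value."""
    ordered = sorted(latencies_ms)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

samples = [12, 15, 11, 250, 14, 13, 16, 12, 18, 900]  # made-up numbers
print(f"p99 = {percentile(samples, 99)} ms")  # p99 = 900 ms
```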
Web socket
- Most common solution for sending async updates from server to client
- The connection is initiated by the client. After that, it is bidirectional and persistent.
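A minimal client sketch assuming the third-party Python `websockets` package and a hypothetical endpoint URL; it shows the client-initiated handshake followed by bidirectional use of the same connection:

```python
import asyncio
import websockets  # third-party package: pip install websockets

async def main():
    # Hypothetical endpoint; the client initiates the connection.
    async with websockets.connect("wss://example.com/updates") as ws:
        await ws.send("subscribe:orders")  # client -> server
        async for message in ws:           # server -> client pushes
            print("update:", message)

asyncio.run(main())
```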
CDN
Content Delivery Network
Content Distribution Network
Geolocates content closer to users to reduce load times
BLOB
Binary large object
A collection of binary data stored as a single entity in a database
Video encoding / transcoding
The process of converting a video from one format to others (MPEG, HLS, etc.) that provide the best possible video streams for different devices and bandwidth capabilities.
codecs
Compression and decompression algorithms that aim to reduce the video size while preserving the video quality.
Bloom Filter
- space-efficient probabilistic data structure
- used to test whether an element is a member of a set.
- False positive matches are possible, but false negatives are not – in other words, a query returns either “possibly in set” or “definitely not in set”.
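A minimal Bloom filter sketch; the bit-array size, hash count, and use of SHA-256 are arbitrary illustrative choices:

```python
import hashlib

class BloomFilter:
    def __init__(self, size: int = 1024, num_hashes: int = 3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # True means "possibly in set"; False means "definitely not in set".
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))  # True (possibly in set)
print(bf.might_contain("bob"))    # almost certainly False
```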
Consistent hashing
Consistent hashing is a special kind of hashing such that when a hash table is resized, only k/n keys need to be remapped on average, where k is the number of keys and n is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped.
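A minimal hash-ring sketch; production implementations add virtual nodes to balance the key distribution, and MD5 here is an arbitrary illustrative choice:

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes: list[str]):
        # Place each node on the ring at its hash, kept sorted for lookup.
        self.ring = sorted((_hash(node), node) for node in nodes)

    def node_for(self, key: str) -> str:
        """Walk clockwise from the key's hash to the next node on the ring."""
        h = _hash(key)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))
# Adding or removing a node only remaps the keys adjacent to it on the
# ring (roughly k/n on average), rather than nearly all keys.
```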