Systems Design 1 Flashcards
Interview Steps
Clarify Requirements
Back of Envelope Estimation
Define Data Model
High Level Design Drawing
Identify and Resolve what remains
CAP Theorem
Consistency
Availability
Partition Tolerance
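A distributed system can guarantee at most 2 of the 3 properties at once. Because network partitions cannot be ruled out, the practical choice is between consistency and availability while a partition lasts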
CP System
Consistency and Partition Tolerance
Data stays consistent across all nodes. When a partition occurs, the system protects consistency (preventing data desync) by making nodes unavailable if they cannot confirm they hold the latest data.
Sacrifices availability: the system might not respond during network issues, in order to keep data accurate.
Banking systems use CP databases because ensuring accurate account balances is more critical than being always available.
Newer systems tend to focus more on availability than consistency
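A minimal sketch of the CP trade-off described above, assuming a toy in-memory cluster in which a write is accepted only when a majority of nodes is reachable. The names `CPCluster`, `reachable`, and `Unavailable` are hypothetical and chosen for illustration only.

```python
class Unavailable(Exception):
    """Raised when the cluster refuses a request to protect consistency."""


class CPCluster:
    def __init__(self, node_count):
        # Each node holds its own copy of the key-value data.
        self.nodes = [dict() for _ in range(node_count)]
        # Indexes of the nodes that can currently be reached.
        self.reachable = set(range(node_count))

    def write(self, key, value):
        # Require a strict majority; otherwise refuse the write entirely.
        if len(self.reachable) <= len(self.nodes) // 2:
            raise Unavailable("no majority reachable, write rejected")
        for i in self.reachable:
            self.nodes[i][key] = value


cluster = CPCluster(3)
cluster.write("balance", 100)   # all 3 nodes reachable: accepted
cluster.reachable = {0}         # simulated partition: only node 0 reachable
try:
    cluster.write("balance", 50)
    rejected = False
except Unavailable:
    rejected = True             # the system chose consistency over availability
```

The rejected write is the availability cost: the balance stays at its last agreed value instead of diverging between nodes.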
Consistency
Data is the same across the cluster, so you can read from or write to any node and get the same data.
Partition Tolerance
The database continues to work even if there is a network failure or a part of the system is unreachable
Partition
A section of a database that contains its own data and indexes. Partitioning splits a large database into smaller parts. (This is a data partition, not the network partition meant in the CAP theorem.)
Why Partition?
Scaling - easier to scale since it's broken into smaller, more manageable parts
Performance - Queries can run faster since there is less data to scan
Availability - If one partition fails, only a fraction of the data is affected
What databases can use partitioning
SQL databases (MySQL and PostgreSQL)
NoSQL databases (MongoDB and Cassandra)
S3 and Redis
Vertical Partitioning
Multiple Tables - split data across tables with different columns which share a key such as EmployeeID
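A small sketch of vertical partitioning, assuming rows are plain dictionaries and that the column names (`name`, `resume`, `photo`) are made up for illustration. Each row is split into a frequently read part and a rarely read part, both keeping the shared `EmployeeID` key.

```python
def split_vertically(rows, key, hot_columns):
    """Split each row into two rows that both keep the shared key column."""
    hot, cold = [], []
    for row in rows:
        hot.append({c: v for c, v in row.items() if c == key or c in hot_columns})
        cold.append({c: v for c, v in row.items() if c == key or c not in hot_columns})
    return hot, cold


employees = [
    {"EmployeeID": 1, "name": "Ada", "resume": "long text", "photo": b"\x89PNG"},
    {"EmployeeID": 2, "name": "Lin", "resume": "long text", "photo": b"\x89PNG"},
]

# "core" holds the small, frequently read columns; "details" holds the rest.
core, details = split_vertically(employees, "EmployeeID", {"name"})
```

Joining `core` and `details` on `EmployeeID` reconstructs the original rows.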
Horizontal Partitioning
Data is separated by a key such as a region identifier. Each data store shares the same columns and data structure, but can be split across multiple servers. Each partition can also be backed up and restored independently
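A minimal sketch of horizontal partitioning by a key, assuming rows are dictionaries and using a hypothetical `region` field as the partition key. Every partition has the same columns; only the rows differ.

```python
from collections import defaultdict


def partition_by_key(rows, key):
    """Group rows into partitions that share the same value for `key`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)
    return dict(partitions)


orders = [
    {"order_id": 1, "region": "EU", "total": 30},
    {"order_id": 2, "region": "US", "total": 12},
    {"order_id": 3, "region": "EU", "total": 7},
]

# Each region's list could live on its own server and be backed up separately.
by_region = partition_by_key(orders, "region")
```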
When to partition
Historical Data - Archive data older than a certain time as read only
Table is larger than about 2 GB (a rule of thumb, not a hard limit)
When contents need to be spread across different types of storage devices
AP System
Availability and Partition Tolerance
Ensures every request (read or write) gets a response even if some parts of the system are down
Sacrifices consistency, so when data is updated on one node it may take a short amount of time before queries to other nodes reflect the change
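A minimal sketch of AP behaviour, assuming a toy in-memory cluster in which any node accepts a write and unreachable replicas catch up later. `APCluster`, `pending`, and `heal` are hypothetical names, and conflict resolution between concurrent writes is deliberately left out.

```python
class APCluster:
    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self.pending = []  # (node_index, key, value) updates not yet delivered

    def write(self, node_index, key, value, reachable):
        # The contacted node always accepts the write.
        self.nodes[node_index][key] = value
        for i in range(len(self.nodes)):
            if i == node_index:
                continue
            if i in reachable:
                self.nodes[i][key] = value
            else:
                # Replica is cut off: remember the update for later.
                self.pending.append((i, key, value))

    def read(self, node_index, key):
        # Always answers, even if the local copy is stale.
        return self.nodes[node_index].get(key)

    def heal(self):
        # Partition resolved: deliver the queued updates in order.
        for i, key, value in self.pending:
            self.nodes[i][key] = value
        self.pending.clear()


cluster = APCluster(2)
cluster.write(0, "likes", 10, reachable={0, 1})
cluster.write(0, "likes", 11, reachable={0})  # node 1 is cut off
stale = cluster.read(1, "likes")              # node 1 still answers, with old data
cluster.heal()
fresh = cluster.read(1, "likes")              # replicas have converged
```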
CA Databases
Data is consistent between all nodes as long as all nodes are online, and you can read from or write to any node and be sure that the data is the same. If a partition ever develops between nodes, the data will go out of sync (and won’t re-sync once the partition is resolved).
These pretty much don’t exist as distributed systems, because network partitions cannot be ruled out (never give this as an answer in an interview)
When to use Relational Database
Data is structured and you need to handle complex relationships
When to use Non-Relational Databases
Data is unstructured or semi structured
What is Sharding
Horizontal Scaling - every node uses the same schema, but each node (shard) stores a different subset of the data.
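A minimal sketch of hash-based shard routing, assuming four in-memory dictionaries stand in for four database nodes. `shard_for`, `put`, and `get` are hypothetical helpers; production systems often use consistent hashing instead so that adding a shard moves fewer keys.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a string key to a shard index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Four shards with the same "schema" (key -> record), each holding a subset.
shards = [dict() for _ in range(4)]


def put(key, value):
    shards[shard_for(key, len(shards))][key] = value


def get(key):
    return shards[shard_for(key, len(shards))].get(key)


put("user:42", {"name": "Ada"})
record = get("user:42")
```

A stable hash such as SHA-256 is used rather than Python's built-in `hash`, which is randomized per process for strings.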
Write Behind
Syncs data asynchronously
Data between the cache (Redis or Memcached) and the DB (PostgreSQL) may be temporarily out of sync
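A minimal write-behind sketch, assuming plain dictionaries stand in for the cache and the database. `WriteBehindCache` and `flush` are hypothetical names; in a real system the flush would run on a background worker or timer.

```python
from collections import deque


class WriteBehindCache:
    def __init__(self, db):
        self.cache = {}
        self.db = db          # stand-in for the real database
        self.dirty = deque()  # keys written to the cache but not yet to the db

    def write(self, key, value):
        # Fast path: update the cache and return without touching the db.
        self.cache[key] = value
        self.dirty.append(key)

    def flush(self):
        # Later, push the queued changes to the db.
        while self.dirty:
            key = self.dirty.popleft()
            self.db[key] = self.cache[key]


db = {}
wb = WriteBehindCache(db)
wb.write("run:1", "loss=0.3")
in_db_before_flush = "run:1" in db  # cache and db are temporarily out of sync
wb.flush()
in_db_after_flush = "run:1" in db
```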
Write Through
Syncs data synchronously
Data in cache and DB are always in sync. When an update is performed to the cache it is immediately updated in the DB
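A minimal write-through sketch, assuming plain dictionaries stand in for the cache and the database. `WriteThroughCache` is a hypothetical name. Writing to the database before the cache is a design choice made here so that a failed database write never leaves the cache ahead of the database.

```python
class WriteThroughCache:
    def __init__(self, db):
        self.cache = {}
        self.db = db  # stand-in for the real database

    def write(self, key, value):
        # Both stores are updated before the call returns.
        self.db[key] = value
        self.cache[key] = value

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value


db = {}
wt = WriteThroughCache(db)
wt.write("acct:1", 250)
```

The caller waits for the database on every write, which is the latency cost paid for keeping the two in sync.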
When to use write behind
When you have a write-heavy workload (e.g. many cache updates) and the user should not have to wait for changes to reach the DB (this is likely relevant for WandB)
When to use write through
When data consistency is critical, e.g. banking
Read-Heavy Workload Examples
Content delivery platforms (blogs and streaming sites)
Search engines or dashboards with analytics
Write-Heavy Workload Examples
Event logging systems
IoT platforms or real-time monitoring systems
How to Improve Read Latency
Use caching layers such as Redis or Memcached so that repeated reads skip the database
Optimize query patterns and DB indexes
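A minimal sketch of a read cache in front of a slow lookup, assuming a dictionary stands in for Redis or Memcached. `slow_db_lookup` and `get_user` are hypothetical functions; expiry and invalidation are left out.

```python
stats = {"db_calls": 0}
cache = {}


def slow_db_lookup(user_id):
    """Stand-in for an expensive database query."""
    stats["db_calls"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id):
    # Only go to the database on a cache miss.
    if user_id not in cache:
        cache[user_id] = slow_db_lookup(user_id)
    return cache[user_id]


first = get_user(7)   # miss: hits the database
second = get_user(7)  # hit: served from the cache
```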
How to Improve Write Latency
Use batch writes or asynchronous writes to handle high loads
Avoid heavy constraints or triggers that can slow down writes
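A minimal sketch of batching writes, using Python's built-in `sqlite3` with an in-memory database as a stand-in for the real store. The `events` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (device_id TEXT, reading REAL)")

# 1000 readings buffered in memory instead of being written one by one.
events = [(f"sensor-{i % 3}", float(i)) for i in range(1000)]

# One transaction and one executemany call instead of 1000 separate commits.
with conn:
    conn.executemany("INSERT INTO events VALUES (?, ?)", events)

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

The table has no indexes, constraints, or triggers, so each insert does the minimum amount of work.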
What usually takes precedence for Read-Heavy Systems
Consistency is often critical for tasks such as analytics and financial data.
Use relational databases or strongly consistent NoSQL options
What usually takes precedence for Write-Heavy Systems
Availability often takes precedence, especially for event logging or monitoring. Use eventually consistent databases like Cassandra or DynamoDB
Availability
Ability to access the cluster even if a node in the cluster goes down.
Partition Tolerance
The cluster continues to function even if there is a “partition” (communication break) between two nodes (both nodes are up, but can’t communicate).