System Design Flashcards
Why is a single server with a single database impractical as a system design?
It is not scalable: one server cannot keep up as the user base grows
What is the solution to having one server and a database?
Distributed system
What is a distributed system?
A network of standalone computers that work together as one
What are the key characteristics of a distributed system?
- Scalability
- Reliability
- Availability
- Efficiency
What is scalability in terms of system design?
The system's ability to handle growing demand
What is horizontal scaling in system design?
Scaling out by adding more servers
What is vertical scaling?
Scaling up by adding or upgrading the resources (CPU, RAM, storage) of an existing server
What is a reliable system?
The system continues to operate even when components fail
What is availability in terms of system design?
The percentage of time a system is available
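As a worked example of that percentage, the sketch below converts an availability target into the downtime it allows per year. The function name `allowed_downtime_hours` and the 365-day year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the hours per year a system may be down at this availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_hours(target):.2f} hours/year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why high availability targets get expensive quickly.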
What is efficiency in terms of system design?
Latency - The delay in getting the first response
Throughput - Operations per time unit
What is the typical trade-off when designing distributed systems?
Scalability vs. consistency: the more nodes data is spread across, the harder it is to keep every copy consistent
What is a load balancer?
A component that distributes incoming traffic across the healthy servers in a system
What are some of the load balancer algorithms?
Least Connections - Sends each request to the server with the fewest active connections
Round Robin - Cycles through the list of servers in order
IP Hash - Hashes the client's IP address to decide which server handles the request
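A minimal sketch of the three algorithms, assuming an in-memory list of servers. The server names, class names, and the choice of SHA-256 for hashing are illustrative assumptions, not part of any particular load balancer.

```python
import hashlib
from itertools import cycle

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical server names


class RoundRobin:
    """Hand out servers in a fixed circular order."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        return next(self._servers)


class LeastConnections:
    """Pick the server that currently has the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1  # the new request opens a connection
        return server

    def release(self, server):
        self.active[server] -= 1  # call when the request finishes


def ip_hash(client_ip, servers):
    """Map a client IP to a server; stable while the server list is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Round robin ignores how busy each server is, least connections tracks load explicitly, and IP hash keeps a given client on the same server, which helps when session state lives on that server.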
Can load balancers fail?
Yes. A load balancer can itself become a single point of failure, so it is advisable to keep standby load balancers ready to take over if one fails.
What is caching in system design?
Storing recently retrieved data in fast storage, on the expectation that it will be needed again soon
What are some issues that arise with caching?
Data consistency: cached data must be kept in sync with the source of truth, otherwise reads return stale values
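The sketch below shows one common way to handle this, assuming a read-through cache that invalidates an entry whenever it is written. The class name `ReadThroughCache` and the dictionary standing in for a database are hypothetical.

```python
class ReadThroughCache:
    """Serve reads from memory, falling back to a slower backing store."""

    def __init__(self, store):
        self.store = store  # stands in for a database
        self.cache = {}
        self.misses = 0

    def get(self, key):
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = self.store[key]  # slow path: load and remember
        return self.cache[key]

    def put(self, key, value):
        self.store[key] = value
        self.cache.pop(key, None)  # invalidate so the next read is not stale


db = {"user:1": "Ada"}
cache = ReadThroughCache(db)
cache.get("user:1")           # miss: loaded from the store
cache.get("user:1")           # hit: served from the cache
cache.put("user:1", "Grace")  # the write invalidates the cached entry
print(cache.get("user:1"))    # Grace
```

Without the invalidation in `put`, the last read would still return the old value, which is exactly the consistency issue described above.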
How do SQL/relational databases store data?
In tables with a predefined schema
What are NoSQL databases?
Non-relational databases that allow more flexible data structures than SQL's fixed table schema
What are the four main types of NoSQL databases?
- Key-Value Stores
- Document Databases
- Wide-column Stores
- Graph Databases
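What do we compare when choosing between SQL and NoSQL?
- Structure: SQL has a rigid schema, NoSQL is more flexible
- Querying: SQL databases share a standard structured query language, NoSQL querying varies by database and often centers on collections of documents
- Scalability: SQL typically scales vertically, NoSQL is designed to scale horizontally
- Reliability: SQL databases are generally ACID compliant, many NoSQL databases relax ACID guarantees in exchange for performance and scalability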
When would we use SQL (systems)?
- ACID compliance is needed
- Financial applications
- Structured, consistent data
When would we use NoSQL (systems)?
- Large volumes of unstructured data
- Flexibility in data structures
- Rapid development where the data model changes frequently
How do we improve database query speed?
- Indexing - building a data structure that points to the actual location of the data
What are the trade-offs of indexing?
Improves read performance at the cost of write performance, because every index must be updated on each write
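A toy illustration of that trade-off, assuming rows held in a Python list and an index kept as a dictionary. Real databases typically use structures such as B-trees; the names `rows`, `city_index`, and the helper functions are hypothetical.

```python
from collections import defaultdict

rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Lima"},
    {"id": 3, "city": "Oslo"},
]

# Index: city -> positions of the matching rows in `rows`.
city_index = defaultdict(list)
for position, row in enumerate(rows):
    city_index[row["city"]].append(position)


def find_by_city_scan(city):
    """Without an index: check every row."""
    return [row for row in rows if row["city"] == city]


def find_by_city_indexed(city):
    """With an index: jump straight to the matching rows."""
    return [rows[position] for position in city_index.get(city, [])]


def insert(row):
    """Writes do extra work: the index is updated along with the table."""
    rows.append(row)
    city_index[row["city"]].append(len(rows) - 1)
```

Both lookups return the same rows, but the indexed version avoids scanning the whole table, while `insert` pays for that with an additional update per index.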
What can you do when a database can no longer scale vertically?
Data partitioning
What is data partitioning?
Breaking a database into smaller, more manageable partitions
What are different database partitioning methods?
- Horizontal Partitioning (Sharding)
- Vertical Partitioning
- Directory-based Partitioning
What is horizontal partitioning (sharding)?
Dividing the rows of a table across separate databases
What is vertical partitioning?
Separating features or columns into separate databases
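To make the difference between the two concrete, here is a small sketch that splits the same table both ways. The `users` table, its columns, and the id ranges used for the shards are assumptions chosen for the example.

```python
users = [
    {"id": 1, "name": "Ada", "avatar": "ada.png"},
    {"id": 2, "name": "Grace", "avatar": "grace.png"},
    {"id": 3, "name": "Alan", "avatar": "alan.png"},
    {"id": 4, "name": "Edsger", "avatar": "edsger.png"},
]

# Horizontal partitioning (sharding): each shard holds a subset of the ROWS.
# The id ranges are an arbitrary choice for this example.
shard_low = [row for row in users if row["id"] <= 2]
shard_high = [row for row in users if row["id"] > 2]

# Vertical partitioning: each store holds a subset of the COLUMNS,
# linked back together by the shared id.
profiles = [{"id": row["id"], "name": row["name"]} for row in users]
avatars = [{"id": row["id"], "avatar": row["avatar"]} for row in users]
```

Every shard keeps the full set of columns for its rows, whereas each vertical partition keeps every row but only some of the columns.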
What is directory-based partitioning?
Using a lookup service that records which partition holds each piece of data, so the partitioning scheme is hidden from the application
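A minimal sketch of the idea, assuming the directory is a simple in-memory mapping. The class `PartitionDirectory`, the partition names, and the `write`/`read` helpers are hypothetical.

```python
class PartitionDirectory:
    """Lookup service mapping each key to the partition that stores it."""

    def __init__(self):
        self._location = {}

    def assign(self, key, partition):
        self._location[key] = partition

    def lookup(self, key):
        return self._location[key]


partitions = {"db-1": {}, "db-2": {}}  # hypothetical partition names
directory = PartitionDirectory()


def write(key, value, partition):
    partitions[partition][key] = value
    directory.assign(key, partition)


def read(key):
    # Callers never need to know the scheme; they just ask the directory.
    return partitions[directory.lookup(key)][key]


write("order:17", "pending", "db-1")
write("order:18", "shipped", "db-2")
```

Because callers only go through `read`, data can be moved between partitions by updating the directory, without changing application code.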
What are some partitioning techniques?
- Using hash functions
- List partitioning
- Round robin
What is the hash function partitioning technique?
Applying a hash function to the key to determine which partition to put the data in
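A short sketch of this technique, assuming string keys and a fixed number of partitions. The partition count and the function name `partition_for` are illustrative choices.

```python
import hashlib

NUM_PARTITIONS = 4  # assumed partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hash the key and map the result onto a partition number."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and so is not stable across restarts.
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Note that with a plain modulo, changing `num_partitions` remaps most keys, so adding a partition usually means moving a lot of data.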
What is the list partitioning technique?
Assigning each partition a list of values and storing each record in the partition whose list contains its key value
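For instance, records could be partitioned by country code. The sketch below assumes a hypothetical mapping of two partitions to the country codes they own.

```python
# Hypothetical mapping of partitions to the country codes they own.
PARTITION_LISTS = {
    "europe": ["NO", "DE", "FR"],
    "americas": ["US", "BR", "PE"],
}

# Invert the lists once so each lookup is a single dictionary access.
VALUE_TO_PARTITION = {
    value: partition
    for partition, values in PARTITION_LISTS.items()
    for value in values
}


def partition_for_country(country_code: str) -> str:
    """Return the partition whose list contains the given value."""
    try:
        return VALUE_TO_PARTITION[country_code]
    except KeyError:
        raise ValueError(f"no partition lists {country_code!r}") from None
```

A value that appears in no list has nowhere to go, so the sketch raises an error rather than guessing a partition.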
What is the round robin partitioning technique?
Distributing records evenly across partitions by assigning them in circular order
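A minimal sketch of the circular assignment, assuming the records arrive as a list. The function name `round_robin_partition` is hypothetical.

```python
from itertools import cycle


def round_robin_partition(records, num_partitions):
    """Deal records out to partitions one at a time, in circular order."""
    partitions = [[] for _ in range(num_partitions)]
    for record, index in zip(records, cycle(range(num_partitions))):
        partitions[index].append(record)
    return partitions


print(round_robin_partition(["a", "b", "c", "d", "e"], 3))
# [['a', 'd'], ['b', 'e'], ['c']]
```

The partitions stay balanced, but since placement does not depend on the record's key, finding a specific record later means checking every partition.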
What is composite partitioning?
A partitioning scheme that combines two or more partitioning methods
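One possible combination is list partitioning followed by hash partitioning, sketched below. The region mapping, the shard count, and the function name `composite_partition` are all assumptions made for illustration.

```python
import hashlib

REGIONS = {"NO": "europe", "DE": "europe", "US": "americas"}  # hypothetical
SHARDS_PER_REGION = 2  # assumed shard count


def composite_partition(country_code: str, user_id: str) -> str:
    """First pick a region by list, then a shard within it by hash."""
    region = REGIONS[country_code]
    digest = hashlib.sha256(user_id.encode()).digest()
    shard = int.from_bytes(digest[:8], "big") % SHARDS_PER_REGION
    return f"{region}-{shard}"
```

The list step keeps related data together by region, and the hash step spreads the load within each region.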
What are the benefits and drawbacks of partitioning?
It solves scaling issues, but makes it harder to join data that lives in different partitions