System Design Flashcards
1
Q
System Design
Steps
A
Steps
- Scope the system
- Analyze use cases
- Analyze constraints/capacity estimation
- Sketch architecture
- Identify bottlenecks (single points of failure)
- Analyze scalability
2
Q
Constraints/Capacity Estimation
A
- Estimate at least two things
- Amount of traffic
- Amount of data
- Can start with per-month figures, then compute per-second rates
- Additional considerations
- Peak traffic
- Throughput
- Geographic distribution of users
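The month-to-second conversion above can be sketched as follows. All input numbers, the 30-day month, and the peak factor are hypothetical values chosen only for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def per_second(monthly_total: float) -> float:
    """Convert a per-month figure into an average per-second rate."""
    return monthly_total / SECONDS_PER_MONTH


# Hypothetical inputs, not taken from any real system.
requests_per_month = 100_000_000
bytes_per_request = 500
peak_factor = 3  # assumed ratio of peak traffic to average traffic

avg_rps = per_second(requests_per_month)
peak_rps = avg_rps * peak_factor
storage_per_month_gb = requests_per_month * bytes_per_request / 1e9

print(f"average: {avg_rps:.1f} req/s, peak: {peak_rps:.1f} req/s")
print(f"new data: {storage_per_month_gb:.1f} GB/month")
```

Traffic and data are estimated separately because they stress different parts of the system: request rate drives the application layer, data volume drives the storage layer.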
3
Q
Use Cases
A
- Determine what system should do
Examples
- What should service do
- UI/API or both
- Analytics
- User base
- Geographic location
- Peak traffic/time
- High availability
- Sessions
4
Q
Internet Statistics
A
- 500M tweets per day
- 40% of the world is online (3.7B people)
- 3.5B Google searches a day
- 1.3T Google searches a year
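As a quick sanity check, the daily figures above can be turned into per-second and per-year rates. This is plain arithmetic on the card's own numbers, assuming a 365-day year.

```python
SECONDS_PER_DAY = 24 * 60 * 60

tweets_per_day = 500_000_000
searches_per_day = 3_500_000_000

tweets_per_second = tweets_per_day / SECONDS_PER_DAY
searches_per_second = searches_per_day / SECONDS_PER_DAY
searches_per_year = searches_per_day * 365  # assumes a 365-day year

print(f"{tweets_per_second:,.0f} tweets/s")
print(f"{searches_per_second:,.0f} searches/s")
print(f"{searches_per_year / 1e12:.2f}T searches/year")
```

The yearly total comes out just under 1.3T, which agrees with the rounded figure on the card.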
5
Q
Abstract Design
A
- Detail the basic components you will need in the system
- At minimum, will probably need
- Application service layer
- serves the requests
- Data storage layer
6
Q
vertical scaling
A
- if you’re running low on resources (CPU, RAM, disk space, etc.), get more
- there’s an upper limit with regard to technology (e.g., 3 GHz clock speed)
7
Q
horizontal scaling
A
- like a data center
- buy more machines (usually not state of the art)
- have to distribute inbound requests over all servers
8
Q
load
A
- essentially traffic
9
Q
load balancer
A
- distributes traffic across various servers when using horizontal scaling
- load balancer has a public IP address; the servers it monitors have private addresses
- sessions are typically saved on individual machines
- often purchased and deployed in pairs for high availability
- can also take place on DNS level
- when someone requests abc.com, they get the IP of the abc.com server for their city
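Because sessions are typically saved on individual machines, a load balancer often needs to send a given client back to the same server. One way to do that (an assumption here, not something the card specifies) is to hash the session ID. The server addresses and the `pick_server` helper below are hypothetical.

```python
import hashlib

# Hypothetical private addresses of the servers behind the load balancer.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def pick_server(session_id: str, servers: list[str] = SERVERS) -> str:
    """Map a session ID to one server so repeat requests land on the same machine."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# The same session always maps to the same server.
print(pick_server("session-abc") == pick_server("session-abc"))
```

The trade-off of this sketch is that adding or removing a server changes the mapping for most sessions.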
10
Q
RAID
A
- Redundant Array of Independent Disks
- combines multiple physical disks into a single logical disk for the purposes of redundancy and/or performance
- different levels
- RAID0
- RAID1
- RAID5
- Assume you have multiple hard drives
- RAID0
- Have two hard drives
- “Stripe” data across disks; write a little bit to each disk, and then switch while the other one is still saving
- Can roughly double hard drive throughput, but provides no redundancy
- RAID1
- Mirror between disks
- Every time you write to one, also write to the other
- RAID10
- Has 4 hard drives
- Combination of RAID1 and RAID0
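Striping and mirroring can be illustrated with a toy model in which each disk is a `bytearray`. This is only a sketch of the data layout; the function names and the 4-byte stripe size are assumptions, and real RAID works on disk blocks, not Python objects.

```python
def raid0_write(data: bytes, disks: list[bytearray], stripe_size: int = 4) -> None:
    """Stripe: alternate fixed-size chunks of the data across the disks."""
    for offset in range(0, len(data), stripe_size):
        disk = disks[(offset // stripe_size) % len(disks)]
        disk.extend(data[offset:offset + stripe_size])


def raid1_write(data: bytes, disks: list[bytearray]) -> None:
    """Mirror: write the full data to every disk."""
    for disk in disks:
        disk.extend(data)


striped = [bytearray(), bytearray()]
raid0_write(b"ABCDEFGHIJKLMNOP", striped)

mirrored = [bytearray(), bytearray()]
raid1_write(b"ABCDEFGH", mirrored)

print(striped)   # each disk holds half of the data
print(mirrored)  # each disk holds a full copy
```

In this model, RAID10 would stripe across two mirrored pairs, which is why it needs four disks.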
11
Q
memcached
A
- very fast in-memory cache
- just key, value store (essentially a giant hash table)
- widely used around the internet
- sometimes memcached is placed on application servers, and serves as a means for them to share data with each other
Steps
- Request comes in from client
- Check memcached
- If there, send back result
- If not, check db, write result to memcached, send back result
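The lookup steps above can be sketched with plain dictionaries standing in for both memcached and the database. The names `cache`, `database`, and `get` are hypothetical, and no real memcached client is used.

```python
cache: dict[str, str] = {}                       # stand-in for memcached
database = {"user:1": "Alice", "user:2": "Bob"}  # stand-in for the real db
db_reads = 0                                     # counts trips to the db


def get(key: str):
    """Return the value for key, checking the cache before the database."""
    global db_reads
    if key in cache:                # cache hit: send back result
        return cache[key]
    db_reads += 1                   # cache miss: check db
    value = database.get(key)
    if value is not None:
        cache[key] = value          # write to cache for next time
    return value


first = get("user:1")   # miss, reads from the db
second = get("user:1")  # hit, served from the cache
print(first, second, db_reads)
```

Only the first lookup touches the database; later lookups for the same key are served from memory.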
12
Q
multi-tiered architecture
A
- also called an n-tier architecture
- client-server setup where application processing and data management are separated
13
Q
database partitioning
A
- process of splitting a large table into different smaller tables based on some criterion
- ex: database a for records A-M, b for records N-Z
- if queries only need access to a subset of the full table, can speed up queries
- with partitioning, you can direct queries to databases based on high-level information (user, geographic location)
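The A-M / N-Z example can be sketched as a routing function. The database names `db_a` and `db_b` and the `partition_for` helper are hypothetical.

```python
def partition_for(name: str) -> str:
    """Pick a database based on the first letter of the record name."""
    first = name[:1].upper()
    if "A" <= first <= "M":
        return "db_a"
    if "N" <= first <= "Z":
        return "db_b"
    raise ValueError(f"no partition for name: {name!r}")


for name in ["alice", "Mallory", "nina", "Zed"]:
    print(name, "->", partition_for(name))
```

A query for a single user then touches only one of the two smaller databases instead of the full table.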
14
Q
high availability
A
- high uptime
- set up servers or load balancers so that they continually send each other heartbeats (packets which just indicate that the sender is working)
- prevents downtime if one stops working, since the other can take over
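A heartbeat check can be sketched as a table of last-seen timestamps with a timeout. The `HeartbeatMonitor` class, the node name, and the 3-second timeout are assumptions for illustration; no network traffic is sent here.

```python
import time


class HeartbeatMonitor:
    """Tracks when each node last sent a heartbeat."""

    def __init__(self, timeout_seconds: float = 3.0):
        self.timeout_seconds = timeout_seconds
        self.last_seen: dict[str, float] = {}

    def record_heartbeat(self, node: str, now=None) -> None:
        """Note that a heartbeat from node arrived at time now."""
        self.last_seen[node] = time.monotonic() if now is None else now

    def is_alive(self, node: str, now=None) -> bool:
        """A node is alive if its last heartbeat is within the timeout."""
        now = time.monotonic() if now is None else now
        last = self.last_seen.get(node)
        return last is not None and now - last <= self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.record_heartbeat("lb-primary", now=100.0)
print(monitor.is_alive("lb-primary", now=102.0))  # within the timeout
print(monitor.is_alive("lb-primary", now=104.0))  # heartbeat is overdue
```

When a node's heartbeat is overdue, its partner would take over its traffic; that failover step is left out of this sketch.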
15
Q
round robin
A
- algorithm used to balance load
- sends each request to the next server in the list
- loops around when reaching end of list
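A minimal sketch of the algorithm, assuming a fixed list of servers. The `RoundRobin` class and the server names are hypothetical.

```python
class RoundRobin:
    """Hands out servers in order, wrapping around at the end of the list."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self.servers = list(servers)
        self.index = 0

    def next_server(self):
        server = self.servers[self.index]
        # Advance, looping back to 0 after the last server.
        self.index = (self.index + 1) % len(self.servers)
        return server


balancer = RoundRobin(["server-1", "server-2", "server-3"])
picks = [balancer.next_server() for _ in range(5)]
print(picks)
```

This version ignores how busy each server is, so one slow request does not change where the next one goes.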