System Design Flashcards

Question 1

Q

System Design

Steps

Answer

A

Steps

Scope the system
- Analyze use cases
- Analyze constraints/capacity estimation
Sketch architecture
Identify bottlenecks (single points of failure)
Analyze scalability

Question 2

Q

Constraints/Capacity Estimation

Answer

A

At least two
- Amount of traffic
- Amount of data
Can start with per-month figures, then convert to per-second
Additional considerations
- Peak traffic
- Throughput
- Geographic distribution of users
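
A rough sketch of the month-to-second conversion described above. All input numbers (requests per month, bytes per request, peak factor) are hypothetical and chosen only for illustration; a 30-day month is assumed.

```python
# Assumes a 30-day month: 30 * 24 * 60 * 60 = 2,592,000 seconds.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def per_second(per_month: float) -> float:
    """Convert a monthly total into an average per-second rate."""
    return per_month / SECONDS_PER_MONTH


# Hypothetical inputs, not taken from any real system.
requests_per_month = 100_000_000
bytes_per_request = 500
peak_factor = 3  # assumed ratio of peak traffic to average traffic

avg_rps = per_second(requests_per_month)
peak_rps = avg_rps * peak_factor
storage_per_month_gb = requests_per_month * bytes_per_request / 1e9

print(f"average requests/s: {avg_rps:.1f}")
print(f"peak requests/s:    {peak_rps:.1f}")
print(f"new data per month: {storage_per_month_gb:.1f} GB")
```

The average rate covers traffic; the storage line covers data. The peak factor is a guess that would normally come from the use-case analysis.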

Question 3

Q

Use Cases

Answer

A

Determine what system should do

Examples

What should service do
UI/API or both
Analytics
User base
Geographic location
Peak traffic/time
High availability
Sessions

Question 4

Q

Internet Statistics

Answer

A

500M tweets per day
40% of the world is online (3.7B)
3.5B Google searches per day
1.3T Google searches per year

Question 5

Q

Abstract Design

Answer

A

Detail the basic components you will need in the system
At minimum, will probably need
- Application service layer
  - serves the requests
- Data storage layer

Question 6

Q

vertical scaling

Answer

A

if you’re running low on resources (CPU, RAM, disk space, etc.), add more to the same machine
there’s an upper limit set by current technology (e.g. CPU clock speeds of around 3 GHz)

Question 7

Q

horizontal scaling

Answer

A

like a data center
buy more machines (usually not state of the art)
have to distribute inbound requests over all servers

Question 8

Q

load

Answer

A

essentially traffic

Question 9

Q

load balancer

Answer

A

distributes traffic across various servers when using horizontal scaling
load balancer has a public IP address; the servers behind it have private addresses
sessions are typically saved on individual machines, so a returning user may need to be routed to the same server
often purchased and organized in pairs for high availability
balancing can also take place at the DNS level
- when someone requests abc.com, DNS returns the IP of one of several servers (for example, one in a nearby city)

Question 10

Q

RAID

Answer

A

Redundant Array of Independent Disks
combines multiple physical disks into a single logical disk for redundancy, performance, or both
different versions
- RAID0
- RAID1
- RAID5
Assume you have multiple hard drives
- RAID0
  - Have two hard drives
  - “Stripe” data across disks; write a little bit to each disk, and then switch while the other one is still saving
  - Can roughly double read/write speed, but provides no redundancy
- RAID1
  - Mirror between disks
  - Every time you write to one, you also write to the other
- RAID10
  - Has 4 hard drives
  - Combination of RAID1 and RAID0 (striping across mirrored pairs)
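
A toy sketch of striping and mirroring, with Python lists standing in for disks. The function names are made up for illustration, and RAID10 is modeled here as striping across mirrored pairs, which is what the name usually refers to. Real RAID works on fixed-size blocks at the controller or OS level.

```python
def raid0_write(blocks, n_disks=2):
    """Stripe: block i goes to disk i % n_disks (no redundancy)."""
    disks = [[] for _ in range(n_disks)]
    for i, block in enumerate(blocks):
        disks[i % n_disks].append(block)
    return disks


def raid1_write(blocks, n_disks=2):
    """Mirror: every disk receives a full copy of the data."""
    return [list(blocks) for _ in range(n_disks)]


def raid10_write(blocks, n_pairs=2):
    """Stripe across pairs, then mirror each stripe within its pair."""
    return [raid1_write(stripe) for stripe in raid0_write(blocks, n_pairs)]


blocks = ["b0", "b1", "b2", "b3"]
print(raid0_write(blocks))   # each disk holds half of the blocks
print(raid1_write(blocks))   # each disk holds every block
print(raid10_write(blocks))  # two mirrored pairs, four disks in total
```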

Question 11

Q

memcached

Answer

A

very fast in-memory cache
just key, value store (essentially a giant hash table)
widely used around the internet
sometimes memcache is placed on application servers, and serves as means for them to interact with each other

Steps

Request comes in from client
Check memcache
If there, send back result
If not, query the db, write the result to memcache, then send it back

Question 12

Q

multi-tiered architecture

Answer

A

also called an n-tier architecture
client-server setup where application processing and data management are separated

Question 13

Q

database partitioning

Answer

A

process of splitting a large table into different smaller tables based on some criterion
ex: database a for records A-M, b for records N-Z
if queries only need access to a subset of the full table, can speed up queries
with partitioning, you can direct queries to databases based on high-level information (user, geographic location)
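
A minimal sketch of routing by the A-M / N-Z split mentioned above. The database names `db_a` and `db_b` and the `shard_for` helper are hypothetical; the key is assumed to start with an ASCII letter.

```python
def shard_for(name: str) -> str:
    """Pick a database from the first letter of the record's name."""
    first = name.strip()[:1].upper()
    if "A" <= first <= "M":
        return "db_a"  # records A-M
    if "N" <= first <= "Z":
        return "db_b"  # records N-Z
    raise ValueError(f"cannot route key: {name!r}")


for name in ["Alice", "mallory", "Nina", "zed"]:
    print(name, "->", shard_for(name))
```

Range splits like this are simple but can be uneven if names cluster around a few letters; hashing the key is a common alternative.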

Question 14

Q

high availability

Answer

A

high uptime
set up servers or load balancers so that they continually send each other heartbeats (packets which just indicate that the sender is working)
if one stops sending heartbeats, another takes over, which prevents downtime
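
A small sketch of heartbeat tracking, assuming a node counts as down once no heartbeat has arrived within a timeout. The class name, the node name, and the 3-second timeout are all made up for illustration, and the clock is injectable so the example runs without waiting.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat time seen from each node."""

    def __init__(self, timeout_s=3.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = {}

    def beat(self, node):
        """Record that a heartbeat packet arrived from this node."""
        self.last_seen[node] = self.clock()

    def is_alive(self, node):
        """A node is alive if it sent a heartbeat within the timeout."""
        seen = self.last_seen.get(node)
        return seen is not None and self.clock() - seen <= self.timeout_s


now = [0.0]  # fake clock for the demo
monitor = HeartbeatMonitor(timeout_s=3.0, clock=lambda: now[0])
monitor.beat("lb-primary")
now[0] = 2.0
print(monitor.is_alive("lb-primary"))  # still within the timeout
now[0] = 10.0
print(monitor.is_alive("lb-primary"))  # timed out: a standby would take over
```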

Question 15

Q

round robin

Answer

A

algorithm used to balance load
sends each request to the next server in the list
loops around when reaching end of list
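
A minimal sketch of the rotation described above. The class name and the private IP addresses are hypothetical, and there are no health checks or weights here.

```python
class RoundRobinBalancer:
    """Hands out servers in order, wrapping around at the end."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self.servers = list(servers)
        self.index = 0

    def next_server(self):
        server = self.servers[self.index]
        # Advance, looping back to 0 after the last server.
        self.index = (self.index + 1) % len(self.servers)
        return server


lb = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([lb.next_server() for _ in range(5)])
```

Round robin ignores how busy each server is, so one slow request can leave a server overloaded while it keeps receiving its share.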

Question 16

Q

web server

Answer

A

respond to http requests with static content
respond with HTML, images, etc.
no server-side programming (no business logic)
do not generate html dynamically
ex: static site hosting from a provider such as GoDaddy

Question 17

Q

application server

Answer

A

servers that handle business logic
will generally have a framework installed, like Django or Express
ex: an app deployed on a platform such as Heroku

Question 18

Q

bandwidth

Answer

A

a rate
maximum amount of data that can theoretically move through a system per unit of time (usually a second)
measured in Mbit/s
ex: a 500 Mbit/s Ethernet link
analogy: the size of the highway, the size of the pipe

Question 19

Q

throughput

Answer

A

a rate
similar to bandwidth, but it’s the actual amount of data that is transmitted per unit time
measured in Mbit/s

Question 20

Q

latency

Answer

A

amount of time it takes to do one thing
generally, amount of time required for the smallest packet to get from point A to point B
measured in units of time
ex: amount of time required for client’s request to reach server
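
A back-of-the-envelope sketch of how latency and throughput combine, assuming a simple model where total time is one latency delay plus size divided by throughput. The function name and all numbers are hypothetical, and real transfers add protocol overhead this ignores.

```python
def transfer_time_s(size_bytes, throughput_mbit_s, latency_ms):
    """Estimate transfer time as latency + size / throughput."""
    size_bits = size_bytes * 8
    seconds_on_wire = size_bits / (throughput_mbit_s * 1_000_000)
    return latency_ms / 1000 + seconds_on_wire


# Hypothetical link: 80 Mbit/s actual throughput, 50 ms latency.
large = transfer_time_s(10_000_000, throughput_mbit_s=80, latency_ms=50)
small = transfer_time_s(1_000, throughput_mbit_s=80, latency_ms=50)
print(f"10 MB file: {large:.4f} s")  # dominated by throughput
print(f"1 KB reply: {small:.4f} s")  # dominated by latency
```

Under this model, a faster link helps the large transfer far more than the small one, which is why latency matters most for small requests.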

Question 21

Q

redundancy

Answer

A

storing data in more than one place

Question 22

Q

scalability

Answer

A

database scalability
- read/write ratio
- number of objects
- size of each object
- relationships between objects
- database flavor (no-sql vs relational)

Question 23

Q

load and distribution

Answer

A

Question 24

Q

load balancer

Answer

A

typically purchased and organized in pairs

(24 cards)