System Design Flashcards

1
Q

System Design

Steps

A

Steps

  1. Scope the system
    • Analyze use cases
    • Analyze constraints/capacity estimation
  2. Sketch architecture
  3. Identify bottlenecks (single points of failure)
  4. Analyze scalability
2
Q

Constraints/Capacity Estimation

A
  • At least two
    • Amount of traffic
    • Amount of data
  • Can start with per-month figures, then convert to per-second figures
  • Additional considerations
    • Peak traffic
    • Throughput
    • Geographic distribution of users
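A minimal sketch of the per-month to per-second conversion described above. All input figures (request count, request size, peak factor) are hypothetical and chosen only for illustration, and a 30-day month is assumed.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000, assuming a 30-day month


def per_second(per_month: float) -> float:
    """Convert a per-month figure into an average per-second rate."""
    return per_month / SECONDS_PER_MONTH


# Hypothetical inputs, not taken from any real system.
requests_per_month = 100_000_000
bytes_per_request = 500
peak_factor = 3  # assumed ratio of peak traffic to average traffic

avg_rps = per_second(requests_per_month)   # amount of traffic
peak_rps = avg_rps * peak_factor           # peak traffic
storage_per_month_gb = requests_per_month * bytes_per_request / 1e9  # amount of data

print(f"average: {avg_rps:.1f} req/s, peak: {peak_rps:.1f} req/s")
print(f"new data: {storage_per_month_gb:.0f} GB/month")
```

With these assumed inputs the average works out to roughly 39 requests per second, which is usually a more useful number for sizing servers than the monthly total.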
3
Q

Use Cases

A
  • Determine what the system should do

Examples

  • What should the service do
  • UI/API or both
  • Analytics
  • User base
  • Geographic location
  • Peak traffic/time
  • High availability
  • Sessions
4
Q

Internet Statistics

A
  • 500M tweets per day
  • 40% of the world is online (3.7B)
  • 3.5B Google searches a day
  • 1.3T Google searches a year
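As a quick sanity check, the daily figures on this card can be turned into per-second rates, and the daily search count can be compared with the yearly one. This is only arithmetic on the card's own numbers; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tweets_per_day = 500_000_000
searches_per_day = 3_500_000_000

tweets_per_second = tweets_per_day / SECONDS_PER_DAY      # about 5,787
searches_per_second = searches_per_day / SECONDS_PER_DAY  # about 40,509
searches_per_year = searches_per_day * 365                # about 1.28 trillion

print(round(tweets_per_second))
print(round(searches_per_second))
print(f"{searches_per_year / 1e12:.2f}T searches per year")
```

The yearly total comes out close to the 1.3T on the card, so the two search figures agree with each other.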
5
Q

Abstract Design

A
  • Detail the basic components you will need in the system
  • At minimum, will probably need
    • Application service layer
      • serves the requests
    • Data storage layer
6
Q

vertical scaling

A
  • if you’re running low on resources (CPU, RAM, disk space, etc.), upgrade the machine with more
  • there’s an upper limit imposed by current technology (ex: CPU clock speeds around 3 GHz)
7
Q

horizontal scaling

A
  • like a data center
  • buy more machines (usually not state of the art)
  • have to distribute inbound requests over all servers
8
Q

load

A
  • essentially traffic
9
Q

load balancer

A
  • distributes traffic across various servers when using horizontal scaling
  • load balancer has a public IP address; the servers it balances across have private addresses
  • sessions are typically saved on individual machines, so a returning user may need to reach the same server
  • often purchased and deployed in pairs for high availability
  • can also take place at the DNS level
    • when someone requests abc.com, DNS returns the IP of one of several servers (for example, the one for their city)
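Because sessions typically live on individual machines, a load balancer may want to keep sending the same client to the same server. One possible approach, sketched here as an assumption rather than as what any particular load balancer does, is to hash the client's address onto the server list. The server addresses and the `pick_server` name are hypothetical.

```python
import hashlib
from typing import Sequence

# Hypothetical private addresses of the servers behind the load balancer.
SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def pick_server(client_ip: str, servers: Sequence[str] = SERVERS) -> str:
    """Map a client to a server deterministically so its session stays put."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# The same client always lands on the same server.
print(pick_server("203.0.113.7") == pick_server("203.0.113.7"))
```

A drawback of this simple scheme is that adding or removing a server changes the mapping for most clients, so their sessions would be lost.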
10
Q

RAID

A
  • Redundant Array of Independent Disks
  • combines multiple physical disks into a single logical disk for redundancy, performance, or both
  • different versions
    • RAID0
    • RAID1
    • RAID5
    • RAID10
  • Assume you have multiple hard drives
    • RAID0
      • Have two hard drives
      • “Stripe” data across disks; write a little bit to one disk, then switch to the other while the first is still saving
      • Effectively doubles hard drive speed, but provides no redundancy
    • RAID1
      • Mirror between disks
      • Every time you write to one, you also write to the other
    • RAID10
      • Has 4 hard drives
      • Combination of RAID1 and RAID0 (striping across mirrored pairs)
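The striping and mirroring ideas can be illustrated with a toy model in which each disk is a Python list of blocks. This is only a sketch of the data layout, not how a real RAID controller works; the function names are hypothetical, and RAID10 is modeled as striping (RAID0) across mirrored pairs (RAID1).

```python
def raid0_write(blocks, disk_count=2):
    """Stripe blocks across disks: block i goes to disk i % disk_count."""
    disks = [[] for _ in range(disk_count)]
    for i, block in enumerate(blocks):
        disks[i % disk_count].append(block)
    return disks


def raid1_write(blocks, disk_count=2):
    """Mirror blocks: every disk receives a full copy of the data."""
    return [list(blocks) for _ in range(disk_count)]


def raid10_write(blocks, mirror_pairs=2):
    """Stripe across mirrored pairs, using 2 * mirror_pairs disks in total."""
    stripes = raid0_write(blocks, mirror_pairs)
    return [raid1_write(stripe, 2) for stripe in stripes]


blocks = ["b0", "b1", "b2", "b3"]
print(raid0_write(blocks))   # each disk holds half of the blocks
print(raid1_write(blocks))   # each disk holds all of the blocks
print(raid10_write(blocks))  # two mirrored pairs, each holding half
```

In the RAID0 layout, losing either disk loses half the blocks; in the RAID1 layout, either disk alone still holds everything.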
11
Q

memcached

A
  • very fast in-memory cache
  • just key, value store (essentially a giant hash table)
  • widely used around the internet
  • sometimes memcache is placed on application servers, and serves as means for them to interact with each other

Steps

  1. Request comes in from client
  2. Check memcache
  3. If there, send back result
  4. If not, check db, write the result to memcache, and send it back
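The steps above can be sketched as follows. Plain dictionaries stand in for both memcached and the database, so no real memcached client API is used; the `CachedStore` class and the sample keys are hypothetical.

```python
class CachedStore:
    """Toy lookup path: check the cache first, fall back to the database."""

    def __init__(self, database):
        self.database = database  # stand-in for the real database
        self.cache = {}           # stand-in for memcached (a key-value store)
        self.db_reads = 0         # counts how often the database is hit

    def get(self, key):
        # Steps 2 and 3: check the cache and return the result if present.
        if key in self.cache:
            return self.cache[key]
        # Step 4: otherwise read from the database and populate the cache.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = CachedStore({"user:1": "Ada", "user:2": "Grace"})
store.get("user:1")    # first request goes to the database
store.get("user:1")    # second request is served from the cache
print(store.db_reads)  # 1
```

A real deployment would also need an expiry or invalidation policy so that the cache does not serve stale data after the database changes.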
12
Q

multi-tiered architecture

A
  • also called an n-tier architecture
  • client-server setup where application processing and data management are separated
13
Q

database partitioning

A
  • process of splitting a large table into different smaller tables based on some criterion
  • ex: database a for records A-M, b for records N-Z
  • if queries only need access to a subset of full table, can speed up process
  • with partitioning, you can direct queries to databases based on high-level information (user, geographic location)
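A minimal sketch of the A-M / N-Z example: a routing function picks the partition from the first letter of the record's name, so a query only touches one of the two smaller stores. The dictionaries standing in for databases `a` and `b`, and the function names, are hypothetical.

```python
# Dictionaries standing in for the two databases in the example.
PARTITIONS = {"a": {}, "b": {}}


def partition_for(name: str) -> str:
    """Return 'a' for names starting with A-M and 'b' for names starting with N-Z."""
    first = name[:1].upper()
    if not ("A" <= first <= "Z"):
        raise ValueError(f"cannot partition name: {name!r}")
    return "a" if first <= "M" else "b"


def save(name: str, record: dict) -> None:
    PARTITIONS[partition_for(name)][name] = record


def load(name: str):
    # Only the partition that can contain the name is searched.
    return PARTITIONS[partition_for(name)].get(name)


save("Alice", {"city": "Oslo"})
save("Nina", {"city": "Lima"})
print(partition_for("Alice"), partition_for("Nina"))  # a b
```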
14
Q

high availability

A
  • high uptime
  • set up servers or load balancers in pairs that continually send each other heartbeats (packets that just indicate the sender is working)
  • if the heartbeats stop, the other machine can take over, which prevents downtime if one stops working
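One way to picture heartbeats is a monitor that remembers when each peer was last heard from and treats a peer as down once it has been silent for too long. This is a simplified sketch: the `HeartbeatMonitor` class, the node name, and the 3-second timeout are all assumptions made for illustration.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat per node and flags nodes that go silent."""

    def __init__(self, timeout_seconds=3.0):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record(self, node, now=None):
        """Call whenever a heartbeat packet arrives from `node`."""
        self.last_seen[node] = time.monotonic() if now is None else now

    def is_alive(self, node, now=None):
        """A node is alive if it sent a heartbeat within the timeout."""
        now = time.monotonic() if now is None else now
        last = self.last_seen.get(node)
        return last is not None and now - last <= self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.record("lb-primary", now=100.0)
print(monitor.is_alive("lb-primary", now=102.0))  # True: heard 2 s ago
print(monitor.is_alive("lb-primary", now=104.5))  # False: silent for 4.5 s
```

When `is_alive` turns false for the primary, the standby would take over its traffic; that failover step is not shown here.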
15
Q

round robin

A
  • algorithm used to balance load
  • sends each request to the next server in the list, one after another
  • loops around when reaching end of list
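A minimal sketch of round robin selection, assuming the servers are kept in a simple list. The `RoundRobinBalancer` class and the server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands out servers in order and wraps around at the end of the list."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)
        self.next_index = 0

    def next_server(self):
        server = self.servers[self.next_index]
        # Advance, looping back to 0 after the last server.
        self.next_index = (self.next_index + 1) % len(self.servers)
        return server


balancer = RoundRobinBalancer(["s1", "s2", "s3"])
print([balancer.next_server() for _ in range(4)])  # ['s1', 's2', 's3', 's1']
```

Round robin ignores how busy each server is, so it works best when requests cost roughly the same amount of work.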
16
Q

web server

A
  • respond to http requests with static content
  • responds with html, images, etc.
  • no server-side programming (no business logic)
  • do not generate html dynamically
  • ex: GoDaddy
17
Q

application server

A
  • servers that handle business logic
  • will generally have a framework installed, like Django or Express
  • ex: Heroku
18
Q

bandwidth

A
  • a rate
  • maximum amount of data that can theoretically move through a system per unit of time (usually a second)
  • measured in Mbit/s
  • ex: a 500 Mbit/s Ethernet link
  • analogy: the size of the highway, the size of the pipe
19
Q

throughput

A
  • a rate
  • similar to bandwidth, but it’s the actual amount of data that is transmitted per unit of time
  • measured in Mbit/s
20
Q

latency

A
  • amount of time it takes to do one thing
  • generally, amount of time required for the smallest packet to get from point A to point B
  • measured in units of time
  • ex: amount of time required for client’s request to reach server
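Bandwidth, throughput, and latency can be tied together with a rough estimate of how long a transfer takes: the one-way latency plus the payload size divided by the rate actually achieved. This is a simplified model that ignores protocol overhead, and the file size, latency, and throughput figures below are hypothetical.

```python
def transfer_time_seconds(size_megabytes, rate_mbit_s, latency_ms):
    """Estimate transfer time as latency plus size divided by rate."""
    size_megabits = size_megabytes * 8  # 1 byte = 8 bits
    return latency_ms / 1000 + size_megabits / rate_mbit_s


# Hypothetical scenario: a 100 MB file over a link with 50 ms latency.
best_case = transfer_time_seconds(100, rate_mbit_s=500, latency_ms=50)  # at full bandwidth
realistic = transfer_time_seconds(100, rate_mbit_s=400, latency_ms=50)  # at measured throughput

print(f"{best_case:.2f} s at 500 Mbit/s, {realistic:.2f} s at 400 Mbit/s")
```

For large payloads the rate dominates the total, while for tiny payloads almost all of the time is latency.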
21
Q

redundancy

A
  • storing data in more than one place
22
Q

scalability

A
  • database scalability
    • read/write ratio
    • number of objects
    • size of each object
    • relationships between objects
    • database flavor (no-sql vs relational)
23
Q

load and distribution

A
24
Q

load balancer

A
  • typically purchased and organized in pairs
    *