System Design Interview: An Insider's Guide Flashcards

Question 1

Q

What is DNS?

Answer

A

DNS (Domain Name System) is a third-party directory service that maps domain names to IP addresses. When the browser makes a call, DNS returns the IP address for the requested domain name.

Question 2

Q

What problem does DNS solve?

Answer

A

IP addresses are hard for humans to remember and use.

Question 3

Q

What is an IP address?

Answer

A

It’s the numeric address of a device connected to the internet. Public IPs are unique across the internet; private IPs are not. Example: 192.168.0.3, where the first three octets are the network part and the last one is the device's number within the network (this split assumes a /24 network).
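The network/device split from the card can be checked with Python's standard `ipaddress` module. This is a minimal sketch; the `/24` prefix length is an assumption (the card does not state it) and the variable names are illustrative.

```python
import ipaddress

# Assumption: a /24 network, so the first three octets identify the network.
iface = ipaddress.ip_interface("192.168.0.3/24")

network_part = iface.network  # the network the device belongs to
# The device number is the offset of the address inside its network.
host_number = int(iface.ip) - int(iface.network.network_address)

print(network_part, host_number)
print(iface.ip.is_private)                           # private range
print(ipaddress.ip_address("8.8.8.8").is_private)    # public address
```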

Question 4

Q

What is HTTP?

Answer

A

It’s an internet protocol: it defines the shape of the requests and responses exchanged between a client and a server.

Question 5

Q

What does an HTTP request look like?

Answer

A

http_method path http_version
headers
blank line
optional body
*path is the URL without the parts taken from context, such as ‘http://’ and the domain (the domain is usually sent in the ‘Host’ header of the request).
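The layout above can be sketched as a small function that assembles the raw request text. The function name `build_request` and its parameters are hypothetical, chosen only for illustration; real clients add more headers.

```python
def build_request(method, path, host, headers=None, body=""):
    """Assemble a raw HTTP/1.1 request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # The body length is announced so the server knows where the body ends.
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # Lines are separated by CRLF; an empty line separates headers from body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


print(build_request("GET", "/index.html", "example.com"))
```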

Question 6

Q

What does an HTTP response look like?

Answer

A

http_version status_code code_description
headers
blank line
returned body
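To make the four parts concrete, here is a minimal sketch that splits a raw response into them. `parse_response` is a hypothetical helper, and the sample response is made up for illustration; it ignores details such as repeated headers and chunked bodies.

```python
def parse_response(raw):
    """Split a raw HTTP response into version, status code, reason, headers, body."""
    head, _, body = raw.partition("\r\n\r\n")   # blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = "HTTP/1.1 404 Not Found\r\nContent-Type: text/plain\r\nContent-Length: 9\r\n\r\nnot found"
print(parse_response(raw))
```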

Question 7

Q

What is the OPTIONS HTTP method?

Answer

A

Usually sent before the main request (GET, POST, ...) to check which HTTP methods can be used on a resource. The answer comes from the response’s Allow header (or Access-Control-Allow-Methods in a CORS preflight).

Question 8

Q

Is DNS used during each call made by the browser?

Answer

A

No, there is a DNS cache in the OPERATING SYSTEM (browsers usually keep their own cache as well).
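The idea of a DNS cache can be sketched as a resolver that remembers answers for a limited time. This is an illustration only, not how any particular operating system implements it; the class name, the 60-second lifetime, and the `lookup` callable are assumptions (in real use `socket.gethostbyname` could be passed as `lookup`).

```python
import time


class CachingResolver:
    """Remembers domain -> IP answers so repeated calls skip the real lookup."""

    def __init__(self, lookup, ttl_seconds=60.0, clock=time.monotonic):
        self._lookup = lookup        # the real (slow) DNS lookup
        self._ttl = ttl_seconds      # assumed lifetime of a cached answer
        self._clock = clock
        self._cache = {}             # domain -> (ip, expires_at)

    def resolve(self, domain):
        now = self._clock()
        entry = self._cache.get(domain)
        if entry is not None and entry[1] > now:
            return entry[0]          # cache hit: no DNS call
        ip = self._lookup(domain)    # cache miss or expired entry
        self._cache[domain] = (ip, now + self._ttl)
        return ip
```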

Question 9

Q

What problem does NoSQL solve?

Answer

A

Query performance on a relational database is too low, or the data is unstructured.

Question 10

Q

What are examples of NoSQL databases?

Answer

A

MongoDB, Redis, Elasticsearch, Apache Cassandra

Question 11

Q

What is scaling?

Answer

A

Scaling is the process of adding more power to your infrastructure.

Question 12

Q

What problem does vertical scaling solve?

Answer

A

The application does not have enough memory or CPU.

Question 13

Q

How does vertical scaling solve the problem?

Answer

A

By increasing the CPU/memory of a specific machine.

Question 14

Q

What problems does horizontal scaling solve?

Answer

A

It’s impossible to add unlimited CPU/memory to a single machine.
A single machine is a single point of failure: if it fails, the whole system goes down.

Question 15

Q

How does horizontal scaling solve the problem?

Answer

A

By adding more servers.

Question 16

Q

What is failover?

Answer

A

It’s the process of switching from a failed server to a healthy one.

Question 17

Q

What is redundancy in a system?

Answer

A

It’s the system's capacity to survive broken elements, for example by keeping a spare server running as a backup or by using Kubernetes to quickly set up a new server.

Question 18

Q

What is a load balancer?

Answer

A

It’s a component that receives the requests addressed to the servers and distributes them across the machines.

Question 19

Q

What problems does a load balancer solve?

Answer

A

Machines clogged with requests
Failed machines
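Both problems can be illustrated with a round-robin balancer that skips servers marked as down. Round-robin is only one possible distribution strategy and is an assumption here; the class and method names are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in turn and skips the ones marked as failed."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._cycle = itertools.cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def pick(self):
        if not self._healthy:
            raise RuntimeError("no healthy servers available")
        # One full pass over the cycle is enough to find a healthy server.
        for _ in range(len(self._servers)):
            server = next(self._cycle)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Calling `pick()` for every incoming request spreads the load evenly, and `mark_down()` is the hook a health check would use to take a failed machine out of rotation.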

Question 20

Q

How do servers behind a load balancer communicate with each other?

Answer

A

They’re all on the same network.

Question 21

Q

What is data replication?

Answer

A

It’s a setup with a master DB server and slave DB servers that hold copies of its data.

Question 22

Q

What problem does data replication solve?

Answer

A

Failed db server

Question 23

Q

Which DB operations does the master server support in DB replication?

Answer

A

Only writes, unless there are no slaves; then it handles both writes and reads.

Question 24

Q

Which DB operations does a slave server support in DB replication?

Answer

A

Only reads.

Question 25

Q

Why are there more slaves than masters?

Answer

A

In a typical system, there are more read operations than writes.

Question 26

Q

What are the pros of data replication?

Answer

A

Better performance (more DB servers can handle more queries)
Better reliability (any failed DB server can be replaced by another)

Question 27

Q

What happens when the master DB server goes down?

Answer

A

One of the slaves is promoted to become the new master.
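The replication cards above (writes go to the master, reads go to the slaves, a slave is promoted when the master fails) can be sketched with in-memory dictionaries standing in for DB servers. This is a toy model under stated assumptions: writes are copied to the slaves synchronously, and all names are hypothetical.

```python
class ReplicatedStore:
    """Toy master/slave setup: dicts stand in for database servers."""

    def __init__(self, replica_count=2):
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]
        self._next = 0  # index used to rotate reads across replicas

    def write(self, key, value):
        # Writes always go to the master ...
        self.master[key] = value
        # ... and are copied to every replica (simplification: synchronously).
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        if not self.replicas:
            return self.master.get(key)  # no replicas: master serves reads too
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica.get(key)

    def fail_master(self):
        """Simulate a master failure by promoting the first replica."""
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        self.master = self.replicas.pop(0)
```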

Question 28

Q

What is caching?

Answer

A

It’s an in-memory data store that serves responses faster than the SQL server.

Question 29

Q

What problem does caching solve?

Answer

A

Long waiting time for expensive responses

Question 30

Q

What is the read-through cache strategy?

Answer

A

It’s a strategy for systems that fetch (read) data. On each request, the cache is checked first. On a miss, the real request is made, and the result is saved in the cache and returned to the caller. The next time, the response is returned from the cache.
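A minimal sketch of the read-through flow described above. The `loader` callable stands in for the real (slow) request, for example a database query; the class name is hypothetical, and expiration and eviction are left out on purpose.

```python
class ReadThroughCache:
    """Checks the cache first and falls back to the real source on a miss."""

    def __init__(self, loader):
        self._loader = loader   # the expensive call, e.g. a DB query
        self._store = {}

    def get(self, key):
        if key in self._store:
            return self._store[key]      # cache hit
        value = self._loader(key)        # cache miss: make the real request
        self._store[key] = value         # save the result for next time
        return value
```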

Question 31

Q

Why use an expiration policy in caching?

Answer

A

To balance freshness against usefulness. If cached data expires too quickly (say, every 10 seconds), it is reloaded so often that the cache brings little benefit. If it expires too slowly (say, every 10 months), the risk of serving stale data is high.

Question 32

Q

What happens when the cache server is down?

Answer

A

Another cache server should be available, or we accept that losing the cache is not a serious problem.

Question 33

Q

What is an eviction policy?

Answer

A

A cache has a limited size. When it is full, some existing data must be removed to make room for new data. There are several strategies, such as LRU (least recently used), where the least recently used value is deleted.
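LRU eviction can be sketched with `collections.OrderedDict`, which keeps keys in order and lets us move a key to the end when it is used. The class name and the capacity of 2 in the usage lines are illustrative assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()   # oldest use first, newest use last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading counts as a use
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # cache is full, so "b" is evicted
```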

Question 34

Q

What is a CDN?

Answer

A

It’s a network of geographically distributed cache servers that hold a page's static content.

Question 35

Q

What problem does a CDN solve?

Answer

A

Long waiting time for response with static content.

Question 36

Q

How does a CDN solve the problem of slow requests?

Answer

A

Request time depends on the DISTANCE between the client and the server. Serving content from a closer server can be a significant improvement.

Question 37

Q

What static content is stored on CDN servers?

Answer

A

Images, CSS, JavaScript files

Question 38

Q

What is a cookie?

Answer

A

Data stored in the web browser. It is sent in the “Set-Cookie” header from server to client and in the “Cookie” header from client to server.
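Both directions can be tried with the standard `http.cookies` module. This is a small sketch; the cookie names and values (`sessionId`, `theme`) are made up for illustration.

```python
from http.cookies import SimpleCookie

# Server -> client: build the Set-Cookie header for the response.
response_cookie = SimpleCookie()
response_cookie["sessionId"] = "abc123"
response_cookie["sessionId"]["path"] = "/"
response_cookie["sessionId"]["httponly"] = True
set_cookie_header = response_cookie.output(header="Set-Cookie:")
print(set_cookie_header)

# Client -> server: parse the Cookie header of a later request.
# A single Cookie header can carry several key-value pairs.
request_cookie = SimpleCookie()
request_cookie.load("sessionId=abc123; theme=dark")
values = {name: morsel.value for name, morsel in request_cookie.items()}
print(values)
```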

Question 39

Q

How many key-value pairs can a Cookie header contain?

Answer

A

Multiple pairs.

Question 40

Q

When is a cookie added to a request to a specific domain?

Answer

A

The browser adds the matching cookies to EACH request sent to that domain.

Question 41

Q

Can a cookie live forever?

Answer

A

Yes, but it is usually possible to set an expiration date for a cookie.

Question 42

Q

What is CSRF?

Answer

A

An attack (cross-site request forgery) in which a malicious page silently sends a request to another domain for which the browser may hold a cookie.

Example: the domain is our bank, and the malicious page sends a request that transfers money to the attacker. The attacker is not authenticated, but the browser AUTOMATICALLY attaches cookies to requests sent to the cookie's domain, so our bank session cookie is added to the forged request and we lose money.

Question 43

Q

What is a session?

Answer

A

It’s a way to remember that a user has been authenticated: the backend creates a record in the DB and returns its sessionId to the client as a cookie. From then on, each request from the client must carry that cookie.
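The server side of this flow can be sketched as follows. A dictionary stands in for the DB table, and the class and method names are hypothetical; a real implementation would also expire sessions.

```python
import secrets


class SessionStore:
    """Maps random session IDs to authenticated users (dict stands in for the DB)."""

    def __init__(self):
        self._sessions = {}

    def create(self, user_id):
        # Called after a successful login; the ID is sent back as a cookie.
        session_id = secrets.token_urlsafe(32)   # hard-to-guess random ID
        self._sessions[session_id] = user_id
        return session_id

    def user_for(self, session_id):
        # Called on every request with the sessionId taken from the cookie.
        return self._sessions.get(session_id)

    def destroy(self, session_id):
        # Called on logout.
        self._sessions.pop(session_id, None)
```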

Question 44

Q

What is back-of-the-envelope estimation?

Answer

A

It’s a quick, rough estimate of a system solution made with minimal detail and a few assumptions, often using simple math and logical reasoning. It should be short enough to fit on the back of an envelope, hence the name.

Question 45

Q

What is “power of two” in the context of system calculations?

Answer

A

Data volume is measured in units that each correspond to 2^x bytes, which makes the crucial units easy to remember.

Question 46

Q

What is the basic unit of data in computers?

Answer

A

One byte - it contains 8 bits.

Question 47

Q

What is kilobyte in ‘power of two’?

Answer

A

2^10 bytes = 1,024 (about a thousand)

Question 48

Q

What is megabyte in ‘power of two’?

Answer

A

2^20 bytes (about a million)

Question 49

Q

What is gigabyte in ‘power of two’?

Answer

A

2^30 bytes (about a billion)

Question 50

Q

What is terabyte in ‘power of two’?

Answer

A

2^40 bytes (about a trillion)
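The power-of-two units can be turned into a small helper for back-of-the-envelope math. The workload below (10 million photos of 2 MB each) is a made-up example, not a figure from the cards, and `to_bytes` is a hypothetical helper name.

```python
# Exponent of two for each unit.
UNITS = {"KB": 10, "MB": 20, "GB": 30, "TB": 40}


def to_bytes(amount, unit):
    """Convert an amount in KB/MB/GB/TB to bytes using powers of two."""
    return amount * 2 ** UNITS[unit]


# Hypothetical estimate: storage for 10 million photos of 2 MB each.
photos = 10_000_000
total_bytes = photos * to_bytes(2, "MB")
total_tb = total_bytes / to_bytes(1, "TB")
print(f"{total_tb:.2f} TB")
```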

Question 51

Q

In which units do we measure the availability of a system?

Answer

A

As the percentage of time the system is available. 99% means the system can be down for about 15 minutes a day (14.4 min); 99.99% means about 9 seconds a day (8.64 s).
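The daily downtime figures follow directly from the percentage. A short sketch of the arithmetic, with a hypothetical function name:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def allowed_downtime_seconds(availability_percent, period_seconds=SECONDS_PER_DAY):
    """Downtime allowed in a period for a given availability percentage."""
    return period_seconds * (1 - availability_percent / 100)


for percent in (99, 99.9, 99.99):
    seconds = allowed_downtime_seconds(percent)
    print(f"{percent}% -> {seconds:.2f} s/day ({seconds / 60:.2f} min)")
```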

Question 52

Q

Divide the system design interview into parts

Answer

A

Q&A
high-level design
detailed design
wrap up

Question 53

Q

How long should Q&A take in a system design interview?

Answer

A

3-10 minutes

Question 54

Q

How long should ‘high-level design’ take in a system design interview?

Answer

A

10-15 minutes

Question 55

Q

How long should ‘detailed design’ take in a system design interview?

Answer

A

10-25 minutes

Question 56

Q

What should you do during ‘high-level design’ in a system design interview?

Answer

A

Draw an initial blueprint of the design
Do back-of-the-envelope calculations
Go through a FEW concrete use cases

Question 57

Q

What does a rate limiter do?

Answer

A

It is used to control the rate of traffic.

Question 58

Q

What are the pros of using a rate limiter?

Answer

A

Prevents DoS attacks
Reduces costs (fewer requests means less money spent)

Question 59

Q

Why is it worth asking about a “distributed environment” during a system design interview?

Answer

A

If something works in a “distributed environment”, the system needs to be prepared for adding new nodes and services and for scaling. It also hints that some kind of caching will be needed.

Question 60

Q

Why is implementing a rate limiter on the client side a bad idea?

Answer

A

Requests can easily be forged by malicious actors.
In some cases we have no control over the client implementation.

Question 61

Q

Where can you put a rate limiter in the system?

Answer

A

client side
middleware (gateway)
server side

Question 62

Q

When is using a commercial API gateway as a rate limiter better than building your own?

Answer

A

Building your own rate limiter takes time. If engineering resources are scarce, use a commercial one.

Question 63

Q

Which algorithm can be used to implement a rate limiter?

Answer

A

Token bucket

Question 64

Q

How does the “Token bucket” algorithm work?

Answer

A

There is a container of fixed size that holds tokens. Each request entering the system takes one token. If the container is empty, the request is dropped. Tokens are added back at a fixed rate, and only up to the container's original size.
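A minimal token bucket sketch based on the description above. One assumption to note: tokens are refilled continuously in proportion to elapsed time rather than in a single batch at the end of each period, which is a common variant. The class name, parameter names, and injectable `clock` are hypothetical.

```python
import time


class TokenBucket:
    """Allows a request only if a token is available; tokens refill over time."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity          # bucket size
        self.refill_rate = refill_rate    # tokens added per second
        self._clock = clock
        self._tokens = float(capacity)    # the bucket starts full
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill for the elapsed time, never above the bucket size.
        refill = (now - self._last) * self.refill_rate
        self._tokens = min(self.capacity, self._tokens + refill)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1             # the request takes one token
            return True
        return False                      # bucket empty: drop the request
```

For limits such as "per user" or "per IP", one bucket per key would be kept, for example in a dictionary keyed by user ID.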

Answer 65

A

It depends on the system. If the requirements are detailed, like “5 new posts per day” or “6 queries per IP”, a separate bucket needs to be created for each user and each API.

Answer 66

A

Two:
- Bucket size
- Refill rate

Answer 67

A

Rate limit headers:
- X-Ratelimit-Remaining
- X-Ratelimit-Limit
- X-Ratelimit-Retry-After

Answer 68

A

Use a centralized data store such as Redis.
