System Design Flashcards

1
Q

What is latency?

A

The time taken to complete a single operation, for example the delay between sending a request and receiving its response.

2
Q

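What is throughput? Give an example.

A

The number of operations a system completes, or the amount of data it transfers, per unit of time.

Ex. A 512 Mbps internet connection can carry at most 512 megabits per second. That rated figure (the bandwidth) is the upper bound on the connection’s throughput.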
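
As a rough worked example of the arithmetic, the sketch below computes the best-case transfer time over a rated link and the throughput actually achieved. The function names and the 1 GB payload are illustrative assumptions, not part of the card.

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Best-case time to move size_bytes over a link rated at link_mbps."""
    size_megabits = size_bytes * 8 / 1_000_000  # bytes -> bits -> megabits
    return size_megabits / link_mbps


def observed_throughput_mbps(size_bytes: int, elapsed_seconds: float) -> float:
    """Throughput actually achieved: megabits moved per second."""
    return size_bytes * 8 / 1_000_000 / elapsed_seconds


one_gb = 1_000_000_000  # assumed payload size (decimal gigabyte)
print(transfer_seconds(one_gb, 512))         # 15.625
print(observed_throughput_mbps(one_gb, 20))  # 400.0
```

If the same transfer really takes 20 seconds, the observed throughput is 400 Mbps, which is below the 512 Mbps the link is rated for.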

3
Q

What are the principles of designing highly available systems?

A

Design redundancy into the system.

Map out the components in the system and identify the parts that are likely to be a single point of failure.

4
Q

What are protocols?

A

Agreed-upon sets of rules that govern how something must be done. In networking, a protocol defines the format, order, and meaning of the messages that machines exchange.

5
Q

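What is the difference between TCP and HTTP?

A

TCP is a transport-layer protocol in charge of setting up a reliable, ordered connection between two machines.

HTTP is an application-layer protocol that uses this connection to transfer data between the client and the server as requests and responses.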

6
Q

Define data denormalization and explain how it contributes to performance.

A

A DB optimization technique that stores redundant copies of data (or precomputed values) so that reads need fewer expensive JOIN queries.

It works best for frequently derived values whose source data rarely changes, because every copy must be updated whenever the source changes.
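
A minimal sketch of the trade-off, using Python’s built-in `sqlite3` module with an in-memory database. The `users`/`orders` tables and the `user_name` column are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL,
                         user_name TEXT);  -- denormalized copy of users.name
    INSERT INTO users  VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 42.5, 'Ada');
""")

# Normalized read: needs a JOIN to get the customer's name.
joined = conn.execute(
    "SELECT o.id, u.name FROM orders o JOIN users u ON u.id = o.user_id"
).fetchall()

# Denormalized read: the name is already on the order row, so no JOIN.
flat = conn.execute("SELECT id, user_name FROM orders").fetchall()

print(joined == flat)  # True: both return [(10, 'Ada')]
```

The cost shows up on writes: renaming the user now means updating `orders.user_name` as well as `users.name`.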

7
Q

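What is the optimal technique when caching objects?

A

Cache the fully assembled object (for example, a class instance built from several DB queries) rather than the individual query results. When any part of the underlying data changes, simply evict the object and rebuild it on the next read.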
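
One way this could look, as a sketch only: the `ObjectCache` class and the `load_profile` loader are hypothetical, and the loader stands in for the several DB queries needed to assemble the object.

```python
from typing import Callable, Dict


class ObjectCache:
    """Caches fully assembled objects and evicts them when data changes."""

    def __init__(self, loader: Callable[[int], dict]) -> None:
        self._loader = loader
        self._cache: Dict[int, dict] = {}
        self.loads = 0  # how many times the expensive loader ran

    def get(self, object_id: int) -> dict:
        if object_id not in self._cache:
            self.loads += 1
            self._cache[object_id] = self._loader(object_id)
        return self._cache[object_id]

    def evict(self, object_id: int) -> None:
        # Call this whenever any part of the underlying data changes.
        self._cache.pop(object_id, None)


def load_profile(user_id: int) -> dict:
    # Stand-in for several queries (user row, posts, settings, ...).
    return {"id": user_id, "name": "Ada", "posts": ["hello"]}


cache = ObjectCache(load_profile)
cache.get(1)
cache.get(1)        # served from the cache, loader not called again
cache.evict(1)      # some underlying data changed
cache.get(1)        # rebuilt on the next read
print(cache.loads)  # 2
```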

8
Q

What are two ways to achieve asynchronism?

A

Do the time-consuming work in advance. Precomputing results improves performance because requests can be served from the prepared output.

Send complicated jobs to a queue for background processing, let the user continue in the meantime, and notify them when the job has finished.
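
The queue approach can be sketched with the standard-library `queue` and `threading` modules. This is a single-process illustration under simplifying assumptions: a real system would more likely use a separate message broker, and `notify` is a hypothetical stand-in for an email or push notification.

```python
import queue
import threading

jobs = queue.Queue()
notifications = []


def notify(message: str) -> None:
    # Hypothetical notification hook; here it just records the message.
    notifications.append(message)


def worker() -> None:
    while True:
        job = jobs.get()
        if job is None:           # sentinel: no more work
            jobs.task_done()
            break
        result = sum(range(job))  # pretend this is the slow part
        notify(f"job {job} finished: {result}")
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()

for size in (10, 100):
    jobs.put(size)  # returns immediately, so the caller can carry on

jobs.put(None)  # tell the worker to stop once the queue drains
thread.join()
print(notifications)  # ['job 10 finished: 45', 'job 100 finished: 4950']
```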

9
Q

What are some things to keep in mind when trying to scale systems?

A

Probe further questions about the user. Where do they live? What are their needs? What size data will they be exchanging? What network limitations are there? When is peak usage?

10
Q

What are WebSockets?

A

Unlike plain HTTP request/response connections, WebSockets establish a persistent, bi-directional channel of communication between client and server.

New messages do not require a new connection to be established, and the server can push data to the client at any time.

They are the better option when you need “real-time” data.

11
Q

What is the basic unit of the WebSocket protocol?

A

Frames: data is broken up into a series of chunks and each chunk is wrapped in some metadata to make a frame. It is a binary protocol, so bit manipulation is required to decode messages.
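
To make the bit manipulation concrete, here is a minimal decoder sketch for a single frame, following the header layout in RFC 6455. The `decode_frame` name is an assumption, and the sketch ignores fragmentation, control frames, and error handling. The sample bytes are the masked "Hello" text frame from the RFC’s examples.

```python
import struct


def decode_frame(data: bytes) -> dict:
    """Decode one complete WebSocket frame (simplified sketch)."""
    fin = bool(data[0] & 0x80)     # top bit of byte 0: final fragment?
    opcode = data[0] & 0x0F        # low 4 bits: 1 = text, 2 = binary, ...
    masked = bool(data[1] & 0x80)  # top bit of byte 1: payload masked?
    length = data[1] & 0x7F        # low 7 bits: payload length or marker
    offset = 2

    if length == 126:              # next 2 bytes hold the real length
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
    elif length == 127:            # next 8 bytes hold the real length
        (length,) = struct.unpack_from("!Q", data, offset)
        offset += 8

    key = b""
    if masked:                     # client-to-server frames carry a 4-byte key
        key = data[offset:offset + 4]
        offset += 4

    payload = data[offset:offset + length]
    if masked:
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))

    return {"fin": fin, "opcode": opcode, "payload": payload}


frame = bytes([0x81, 0x85, 0x37, 0xFA, 0x21, 0x3D,
               0x7F, 0x9F, 0x4D, 0x51, 0x58])
print(decode_frame(frame))  # {'fin': True, 'opcode': 1, 'payload': b'Hello'}
```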

12
Q

What is HTTP long polling?

A

The server holds a client’s request open and delivers a response only once data becomes available or a timeout expires. The client then issues a new request right away, so the client is still pulling data.
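
The server-side behaviour can be imitated in one process with a blocking queue read. This is only a sketch: `handle_long_poll` is a hypothetical handler, and no real HTTP is involved.

```python
import queue
import threading

updates = queue.Queue()


def handle_long_poll(timeout_seconds: float):
    """Hold the 'request' open until data arrives or the timeout expires."""
    try:
        return updates.get(timeout=timeout_seconds)
    except queue.Empty:
        return None  # nothing new; the client should ask again


# Data shows up 0.1 s after the request starts waiting.
threading.Timer(0.1, updates.put, args=["new message"]).start()

print(handle_long_poll(timeout_seconds=2.0))   # new message
print(handle_long_poll(timeout_seconds=0.05))  # None
```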

13
Q

What does leader election refer to?

A

When a cluster runs multiple servers for redundancy, one of them is designated the leader and organizes a task that is distributed among the machines.

If the leader fails, the remaining servers in the cluster elect a new leader to take its place.
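
A deliberately simplified sketch of the idea: the rule "the healthy node with the highest ID wins" is an assumption chosen for illustration, and the health map is supplied by hand. Production systems typically rely on a consensus protocol or a coordination service rather than a local function like this.

```python
from typing import Dict, Optional


def elect_leader(health: Dict[str, bool]) -> Optional[str]:
    """Pick the healthy node with the highest ID, or None if none are up."""
    healthy = [node for node, ok in health.items() if ok]
    return max(healthy) if healthy else None


cluster = {"node-1": True, "node-2": True, "node-3": True}

leader = elect_leader(cluster)
print(leader)                 # node-3

cluster[leader] = False       # the leader fails
print(elect_leader(cluster))  # node-2 takes its place
```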

14
Q

What is a consensus algorithm?

A

An algorithm that lets all servers in a cluster agree on a single value, even if some servers fail, so that their logic can rely on it. Raft and Paxos are well-known examples.

15
Q

What is polling?

A

The client requests new data from the server at regular intervals. Use it only when a small delay between data updates is acceptable.

16
Q

What is rate limiting?

A

The server limits the number of operations a given user may attempt within a certain time window. Once the limit is exceeded, the server returns an error (for example, HTTP 429 Too Many Requests) until the window resets.
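
A fixed-window limiter is one simple way to implement this; the sketch below is an assumption about how it could look, not the only approach. The `FixedWindowLimiter` class is hypothetical, keeps its counters in a local dictionary (a shared store would replace it in a multi-server deployment), and never cleans up old windows.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` operations per user per fixed time window."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the example is deterministic
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        window_index = int(self.clock() // self.window)
        key = (user_id, window_index)
        if self._counts[key] >= self.limit:
            return False  # caller would respond with an error such as 429
        self._counts[key] += 1
        return True


now = [0.0]  # fake clock for the demo
limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: now[0])

print([limiter.allow("alice") for _ in range(3)])  # [True, True, False]
now[0] = 61.0                                      # next window begins
print(limiter.allow("alice"))                      # True
```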

17
Q

What would be an ideal way to store rate-limit data?

A

Opt for something like Redis: an in-memory store that resides outside the application servers, so every server in a distributed system shares the same counters and can look up a user’s limits quickly.

18
Q

When is it best to use a relational database?

A

When the data has a homogeneous structure that conforms to a predefined schema.

19
Q

What is the process of removing duplicate data from a database called?

A

Normalization

20
Q

How does indexing help with DB optimization?

A

Without an index, finding a value in a table means scanning every row for that field. With an index, the DB can look the value up quickly. Indexes also help enforce unique constraints.

Be sure to use indexing on read-heavy DBs; every index adds some work to each write.
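
The difference can be observed with SQLite’s `EXPLAIN QUERY PLAN`, as in this sketch using the built-in `sqlite3` module. The table, column, and index names are hypothetical, and the exact plan wording varies between SQLite versions, so the comments describe it only loosely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql: str, params: tuple) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


lookup = "SELECT * FROM users WHERE email = ?"
args = ("user500@example.com",)

plan_before = query_plan(lookup, args)  # a full scan of the users table
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, args)   # a search that uses idx_users_email

print(plan_before)
print(plan_after)
```

The unique index also enforces the constraint: inserting a second row with an existing email now raises `sqlite3.IntegrityError`.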

21
Q

What are some of the advantages of NoSQL DBs?

A

You don’t need to join rows.

You don’t need fixed schemas.

Each data entry can have its own configuration of fields.

22
Q

What are the table and row equivalents in NoSQL DBs?

A

In document stores such as MongoDB:

Rows are documents

Tables are collections

23
Q

How is data stored in graph databases?

A

Data entries are stored as nodes, each of which can hold its own flexible set of properties.

Relationships are stored as edges.
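
The node-and-edge model can be mimicked with plain Python structures. This is a toy sketch: the node IDs, relationship labels, and the `neighbors` helper are all made up for illustration, and a real graph database adds indexing and a query language on top.

```python
from typing import List

# Nodes: each entry carries whatever properties it needs.
nodes = {
    "alice": {"type": "person", "age": 30},
    "bob": {"type": "person"},
    "acme": {"type": "company", "industry": "tools"},
}

# Edges: (source, relationship, destination).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
]


def neighbors(node_id: str, relation: str) -> List[str]:
    """Follow outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relation]


print(neighbors("alice", "WORKS_AT"))  # ['acme']
print(neighbors("alice", "FOLLOWS"))   # ['bob']
```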

24
Q

When is it best to use graph databases?

A

When the data looks like a network. They’re also useful when there are many important relationships between pieces of data.

25
Q

Which kind of database suits data that is massive, volatile, and unstructured?

A

NoSQL DBs

26
Q

Explain single-master replication in distributed DBs.

A

A single master receives all write queries to the DB. Each slave holds a replica of the data and can serve reads. The master forwards write queries to the slaves to keep them synchronized.

If the master fails, one of the slaves is promoted to be the new master.

27
Q

Explain multi-master replication.

A

If a system needs to support thousands of simultaneous writes, a single master cannot handle them all.

In multi-master replication, every machine in the cluster is a master. A load balancer (LB) distributes reads and writes evenly among the machines, and each machine propagates its writes to the others so that they all stay up to date.

28
Q

What is the process of sharding data?

A

Breaking a database into smaller pieces (shards) that each store a subset of the data. It is typically implemented by hashing a key and assigning each record to a shard based on the hash. Each shard can then be replicated for redundancy.
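
Hash-based shard selection might look like the sketch below. The shard count, the key format, and the `shard_for`, `put`, and `get` helpers are assumptions for illustration, and plain dictionaries stand in for separate database servers. A stable hash from `hashlib` is used because Python’s built-in `hash()` for strings is randomized per process.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shards = [dict() for _ in range(4)]  # each dict stands in for one DB server


def put(key: str, value: dict) -> None:
    shards[shard_for(key, len(shards))][key] = value


def get(key: str):
    return shards[shard_for(key, len(shards))].get(key)


put("user:42", {"name": "Ada"})
print(get("user:42"))  # {'name': 'Ada'}
```

One caveat with simple modulo placement: changing the number of shards remaps most keys, which is why some systems use consistent hashing instead.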

29
Q

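What does it mean for data to have high cardinality?

A

Cardinality is the degree to which the values in a data set are unique. High cardinality means most values are distinct (for example, email addresses); low cardinality means many values repeat (for example, country codes).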
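
A quick way to put a number on it, as a sketch: the ratio of distinct values to total values. The `cardinality_ratio` helper and the sample columns are hypothetical.

```python
def cardinality_ratio(values: list) -> float:
    """Fraction of values that are distinct: 1.0 means every value is unique."""
    return len(set(values)) / len(values) if values else 0.0


emails = ["a@x.com", "b@x.com", "c@x.com", "d@x.com"]
countries = ["US", "US", "CA", "US"]

print(cardinality_ratio(emails))     # 1.0 (high cardinality)
print(cardinality_ratio(countries))  # 0.5 (lower cardinality)
```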

30
Q

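Explain database federation.

A

Split the data into separate DBs by function (for example, users, products, and forums) so that each DB handles less traffic, which reduces lag in replica synchronization.

App logic must then be modified to determine which database to read from and write to.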
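
The routing that the app logic takes on could be as small as a lookup table. In this sketch the entity names and database names are hypothetical placeholders.

```python
# Hypothetical mapping from functional area to the database that owns it.
DATABASES = {
    "users": "users-db",
    "products": "products-db",
    "forums": "forums-db",
}


def database_for(entity: str) -> str:
    """Return the name of the database responsible for this kind of data."""
    try:
        return DATABASES[entity]
    except KeyError:
        raise ValueError(f"no database configured for {entity!r}") from None


print(database_for("users"))     # users-db
print(database_for("products"))  # products-db
```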