System Design Basics Flashcards

1
Q

Why is sharding/data partitioning is used?

A

Because sometimes the only viable option in terms of cost and scaling for an application is adding more servers instead of using a more powerful server.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are the two sharding methods? Quickly explain.

A
  1. Horizontal: rows of the same table are stored in different servers
  2. Vertical: tables of features are stored in different servers (ex. Users, Photos, UserLikes)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is dictionary based sharding?

A

It is a technique of extracting the sharding logic to a lookup service. This moves the complexity away from the app, which queries the lookup service to know where to store/get data from.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Cite three sharding/partition criteria.

A
  1. Key/hash based: hash function applied to a key of the record yields the partition number
  2. List based (partition per data characteristic): each partition has a list of values (ex: one partition stores users from Norway, Sweden and Finland).
  3. Round robin: rows are inserted in partition nodes in order (using for ex: row_id % n)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Cite three challenges of sharding.

A
  1. Difficulty of Joins and need for denormalization of data
  2. Loss of referential integrity enforcement
  3. Need of rebalancing (re-sharding)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What does the locality of reference principle say?

A

recently requested data is likely to be requested again

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is a “distributed cache”?

A

Cache layer is composed of many nodes, each of which stores a piece of the overall cache, in memory.
Usually a consistent hashing is used to determine which node to query for the data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are the three main schemes for cache invalidation during write?

A
  1. Write through: write both on the cache and the dB (con: higher write latency)
  2. Write around: write to DB and only evict the cache (con : next read will cache miss)
  3. Write back: write to cache and it writes to DB async (con: data loss in case of cache failure)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are indexes used for?

A

To improve performance of read operations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the drawback of using indexes?

A

All write operations are degraded because you have to write also on the index.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a proxy server?

A

A proxy server is an intermediary piece of hardware/software that sits between the client and the back-end server.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Give 3 uses for a proxy server.

A

Request logging
Request filtering
Batch several requests into one

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are queues used for?

A

To enable async communications between systems.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is redundancy?

A

Redundancy means duplication of critical data or services with the intention of increased reliability of the system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What does the CAP theorem states?

A

CAP theorem states that it is impossible for a distributed software system to simultaneously provide more than two out of three of the following guarantees (CAP): Consistency, Availability and Partition tolerance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How is Consistency achieved in a distributed system?

A

Reads are not allowed until all nodes are updated.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

How is Availability achieved in distributed systems?

A

Data is replicated across multiple servers.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What does partition tolerance means?

A

Means that a system continues to work despite message loss or partial failure.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

How is Partition Tolerance achieved in distributed systems?

A

Data is sufficiently replicated across combinations of nodes and networks to keep the system up through intermittent outages.

20
Q

What is Consistent Hashing?

A

It is a technique for hashing that minimizes the impact of adding/removing buckets in the hash table.

21
Q

How does Consistent Hashing works?

A

It works by assigning the buckets to the hash values space (imagine a circle from 0 to N, and the bucket are positioned in that circle).
When we hash(key), the result is a place in the circle. The bucket that will be used is the next bucket found by following the circle of values.

22
Q

What is the disadvantage of ajax polling?

A

The client has to poll the server at a fixed rate, and many of the responses will be empty, creating an unecessary HTTP overhead.

23
Q

What is Websocket?

A

It is a communications protocol that supports full-duplex conversation between the browser and the webserver.

24
Q

What are Server Sent Events (SSE)?

A

It is a technology over HTTP that is used to maintain a long-running connection to the server that allows the server to send a stream of messages back to the client.
It does not allow the client to send messages (simplex).
The server responds with a mime-type “text/event-stream”.

25
Q

Why would I use SSE instead of Websockets?

A

Although everything that SSE does can be done with WS, SSE can have a couple of advantages:

  • it can be polyfilled in older browsers, because it uses AJAX
  • it can be simpler to implement in the webserver
  • WS may not be supported in the webserver (Tomcat 7+)
26
Q

What is Reliability?

A

It is the probability a system will fail in a period of time.

27
Q

What is a reliable system?

A

It is a system that continues to deliver its services even when one or several of its hardware or software components fail.

28
Q

What is availability?

A

It is the time a system remains operational to perform its required function, in a period of time.

29
Q

What is serviceability or manageability?

A

It is the simplicity and speed that a system can be repaired or maintained.

30
Q

Cite three hash algorithms that could be used for sharding

A

MD5
SHA-2
MurMurHash (faster for being non-cryptographic)

31
Q

Cite three technologies for distributed object storage.

A
  • S3
  • HDFS (Hadoop FS)
  • Google Cloud Storage
32
Q

What is a good strategy to avoid that slow write requests (such as uploading a photo) degrades/limits read requests?

A

We can separate the write API and read API in two different services, isolating them.
We also load balance them.

33
Q

What is a Single Point of Failure? How to prevent that?

A

It is a piece of our whole system that if fails will make the whole system to fail.

One way to prevent a SPF is to provide redundancy.

34
Q

Cite three strategies/technologies for generating distributed unique global Ids

A
  • UUID (128bits)
  • Use an ID Generation Server/database (can be a Single point of failure)
  • Use Twitter Snowflake (generates a 64 bit id with EPOCH_TIMESTAMP + NODEID + SEQUENCE)
35
Q

What are the two methods of accessing data on a database?

A
  • By Indexes

- Scan

36
Q

If I just want to scale reads, which technique can I use in storage?

A

Read replication. I write on a master and the data is replicated to slaves.
No need for sharding.

37
Q

Some systems allow writing to multiple nodes and reading from multiple nodes (eg. Cassandra). How would be a good rule of thumb formula to know if I am strong consistent in my operations?

A

Reading Nodes + Writing Nodes > Number of Replicas

The number of nodes I read from plus the number of nodes I write to in greater than the number of nodes that has my replicated data.

By: Tim Berglund video on Distributed Systems

38
Q

Cite three technologies used for distributed computing?

A
  • MapReduce (Hadoop)
  • Spark (similar to MapReduce)
  • Kafka (processes only Stream, real time data, not data in rest)
39
Q

How would a message topic broker scale?

A

By partitioning, the same a storage would (ex: adding more nodes and using consistent hashing).

40
Q

What is the difference between Architectural Style and Architectural Pattern?

A

An Architectural Pattern is a way of solving a recurring architectural problem. MVC, for instance, solves the problem of separating the UI from the model. CQRS is another example.

An Architectural Style, on the other hand, is just a name given to a recurrent architectural design. Contrary to a pattern, it doesn’t exist to “solve” a problem. Such as Ports and Adapter, Layer Architecture

Also, a single architecture can contain several architectural styles, and each architectural style can make use of several architectural patterns.

41
Q

What is CQRS and what is it good for?

A

Command-Query Responsibility Segregation. It is an architectural pattern that prescribes the separation into two interfaces methods that perform mutation in the domain model and methods that performs queries on the domain model.
It is useful because usually the system state the user wants to see is rich and cuts across several different concepts. On the other hand, the state the user wants to mutate tends to be concentrated.

42
Q

What are the three elements that we need to deal with distributed transactions?

A

Commander: the orchestrator
Retry
Idempotence

43
Q

What is idempotence? How do we achieve it?

A

Idempotence is the property of executing an instruction/function twice and the result to be the same.
Usually it is achieved by a duplication detection based on an unique ID (idempotence ID)

44
Q

When do we have distributed transactions?

A

When a single operation triggers a mutation in two or more systems/data sources/resource.

45
Q

What are the two steps in a 2 phase commit protocol?

A
  1. Commit request

2. Commit