System Design Fundamentals Flashcards

1
Q

What is a client?

A

A machine or process that requests data or a service from a server.

Note that a single machine can act as both a client and a server.

2
Q

What is a server?

A

A machine or process that provides data or a service for a client, usually by listening for incoming network calls.

3
Q

What is an IP address?

A

An address given to each machine connected to the public internet. IPv4 addresses consist of four numbers separated by dots (a.b.c.d), where each number is between 0 and 255. Some addresses are special:

  • 127.0.0.1 - localhost
  • 192.168.x.y - your private network
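
As a rough illustration of the ranges above, Python's standard `ipaddress` module can classify an IPv4 address. The `describe` helper is a hypothetical name used only for this sketch.

```python
import ipaddress

def describe(address: str) -> str:
    """Classify an IPv4 address as loopback, private, or public."""
    # Raises ValueError if the text is not four numbers in the range 0-255.
    ip = ipaddress.IPv4Address(address)
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    return "public"

print(describe("127.0.0.1"))
print(describe("192.168.1.10"))
print(describe("8.8.8.8"))
```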
4
Q

What is IP?

A

Internet Protocol. This network protocol outlines how almost all machine-to-machine communication should happen. Other protocols such as TCP, UDP, and HTTP are built on top of IP.

5
Q

What is TCP?

A

A network protocol built on top of IP. It allows for ordered, reliable data delivery between machines over the public internet by creating a connection.

TCP is usually implemented at the kernel level, which exposes sockets that applications can use to stream data through an open connection.
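
A minimal sketch of the socket interface described above, assuming a loopback echo server that handles a single connection. The `echo_once` helper is illustrative, not part of any library.

```python
import socket
import threading

def echo_once(server: socket.socket) -> None:
    """Accept one connection and send back whatever bytes arrive."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))

# Port 0 asks the OS for any free port on the loopback interface.
server = socket.create_server(("127.0.0.1", 0))
port = server.getsockname()[1]
worker = threading.Thread(target=echo_once, args=(server,))
worker.start()

# The client opens a TCP connection and streams bytes through its socket.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply)
```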

6
Q

What is HTTP?

A

The Hypertext Transfer Protocol is a very common network protocol implemented on top of TCP. Clients make HTTP requests, and servers answer each one with a response.

Requests usually follow this schema: host, port, method (GET, POST, ...), headers, body.
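
To make the schema concrete, here is a sketch that assembles the text of an HTTP/1.1 request from those parts. `build_request` is a hypothetical helper, and the `path` argument is an assumption that the schema above does not list.

```python
def build_request(host: str, port: int, method: str, path: str,
                  headers: dict, body: str = "") -> str:
    """Assemble the raw text of an HTTP/1.1 request."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}:{port}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    if body:
        # Content-Length counts bytes, not characters.
        lines.append(f"Content-Length: {len(body.encode())}")
    # A blank line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body

request = build_request(
    "example.com", 80, "POST", "/items",
    {"Content-Type": "application/json"}, '{"id": 1}',
)
print(request)
```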

7
Q

What is latency?

A

The time it takes for a certain operation to complete in a system. It is most often measured as a time duration, such as milliseconds or seconds.

8
Q

What is throughput?

A

The number of operations that a system can handle properly per unit of time. For example, the throughput of a server is often measured in requests per second (RPS) or queries per second (QPS).
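
One way to see how latency and throughput relate is to time a batch of operations. This is a sketch for a single sequential worker; the `measure` helper and the sample workload are assumptions made for illustration.

```python
import time

def measure(operation, runs: int = 100):
    """Return (average latency in seconds, throughput in operations per second)."""
    start = time.perf_counter()
    for _ in range(runs):
        operation()
    elapsed = time.perf_counter() - start
    return elapsed / runs, runs / elapsed

latency, throughput = measure(lambda: sum(range(1000)))
print(f"latency: {latency * 1000:.4f} ms, throughput: {throughput:.0f} ops/s")
```

With one sequential worker, throughput is simply the inverse of average latency. In a system that handles requests concurrently the two can vary independently.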

9
Q

What is availability?

A

The odds of a particular service being up and running at any point in time, usually measured as a percentage. A server with 99% availability is operational 99% of the time (this is called two nines of availability).
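
The arithmetic behind the nines can be sketched as follows, assuming a 365-day year. `downtime_hours_per_year` is a hypothetical helper name.

```python
def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a service may be down at the given availability."""
    hours_per_year = 365 * 24  # 8760 hours, ignoring leap years
    return hours_per_year * (1 - availability_percent / 100)

for percent in (99, 99.9, 99.99, 99.999):
    print(f"{percent}% -> {downtime_hours_per_year(percent):.4f} hours of downtime per year")
```

Two nines allow roughly 87.6 hours of downtime per year, while five nines allow only about 5.3 minutes.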

10
Q

What is high availability?

A

Describes systems with particularly high levels of availability, typically five nines (99.999%) or more.

11
Q

What is redundancy?

A

The process of replicating parts of a system in an effort to make it more reliable.

12
Q

What is SLA/SLO?

A

SLA is short for service-level agreement: a collection of guarantees given to a customer by a service provider. SLAs typically make guarantees about a system’s availability. An SLA is made up of one or more SLOs.

SLO is short for service-level objective: a single specific guarantee within an SLA, such as an availability target.

13
Q

What are some cache eviction policies?

A

FIFO (first in, first out), LRU (least recently used), LFU (least frequently used)
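
As one example of these policies, here is a minimal LRU cache sketch built on `collections.OrderedDict`. The `LRUCache` class and its `get`/`put` methods are illustrative names, not a standard API.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value) -> None:
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" is now the most recently used key
cache.put("c", 3)   # over capacity, so "b" is evicted
```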

14
Q

What is a Content Delivery Network?

A

A CDN is a third-party service that acts like a cache for your servers. Web apps can be slow for users in a particular region. A CDN has servers all around the world, so the latency to a CDN’s servers is usually lower than the latency to your own servers. Popular CDNs include Cloudflare and Google Cloud CDN.

15
Q

What is a forward proxy?

A

A server that sits between a client and servers and acts on behalf of the CLIENT, typically to mask the client’s identity (IP address).

16
Q

What is a reverse proxy?

A

A server that sits between clients and servers and acts on behalf of the SERVER, typically for logging, load balancing, or caching.

17
Q

What is a load balancer?

A

A type of reverse proxy that distributes traffic across servers.

18
Q

What is SHA?

A

“Secure Hash Algorithms”. SHA is a collection of cryptographic hash functions used in the industry. These days, SHA-3 is a popular choice to use in a system.
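
For a sense of how this looks in practice, Python's standard `hashlib` module exposes SHA-3 variants. The input bytes below are arbitrary.

```python
import hashlib

# SHA3-256 always produces a 256-bit digest, shown here as 64 hex characters.
digest = hashlib.sha3_256(b"hello").hexdigest()
other = hashlib.sha3_256(b"hellp").hexdigest()

print(digest)
print(digest == other)  # a one-character change yields a different digest
```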

19
Q

What is an ACID transaction?

A

Atomicity: the operations in a transaction either all succeed or all fail.
Consistency: a transaction cannot bring the DB to an invalid state.
Isolation: executing multiple transactions concurrently has the same effect as executing them sequentially.
Durability: any committed transaction is written to non-volatile storage.
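
Atomicity and consistency can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the balances are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(amount: int) -> bool:
    """Move money from alice to bob; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance (consistency),
        # so the credit to bob was rolled back as well (atomicity).
        return False

ok = transfer(150)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```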

20
Q

What is Prometheus?

A

A popular open-source time series database.

21
Q

What is Sharding?

A

Sometimes called data partitioning, sharding is the act of splitting a database into two or more pieces called shards. It is typically done to increase the throughput of your database.

22
Q

What is a hot spot?

A

When distributing a workload across servers, some servers might receive much more traffic than others, creating a hot spot. This can happen if your sharding key or hashing function is suboptimal.
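
The effect of a poor sharding function can be sketched by counting how many keys land on each shard. The shard count, the key format, and both helper functions are assumptions chosen for illustration.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 4  # assumed number of shards

def shard_by_hash(key: str) -> int:
    """Spread keys using a cryptographic hash of the whole key."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT

def shard_by_first_letter(key: str) -> int:
    """A poor choice: keys sharing a prefix all land on one shard."""
    return ord(key[0]) % SHARD_COUNT

keys = [f"user-{i}" for i in range(1000)]
even = Counter(shard_by_hash(k) for k in keys)
skewed = Counter(shard_by_first_letter(k) for k in keys)

print(sorted(even.items()))    # keys spread across all four shards
print(sorted(skewed.items()))  # every key on the same shard: a hot spot
```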

23
Q

What is leader election?

A

The process by which nodes in a cluster (for instance, servers in a set of servers) elect a so-called “leader” among themselves, responsible for the primary operations that these nodes provide. Well-known consensus algorithms used for this include Paxos and Raft.

24
Q

What is polling?

A

The act of fetching a resource or piece of data regularly at an interval to make sure the data is not too stale.
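
A fixed-interval fetch loop might look like the sketch below. The `poll` helper and the in-memory counter standing in for a remote resource are hypothetical.

```python
import time

def poll(fetch, interval_seconds: float, attempts: int) -> list:
    """Call fetch() every interval_seconds, attempts times, and collect the results."""
    results = []
    for attempt in range(attempts):
        results.append(fetch())
        if attempt < attempts - 1:
            time.sleep(interval_seconds)  # wait before the next fetch
    return results

# Hypothetical data source: a counter standing in for a remote resource.
state = {"version": 0}

def fetch_version() -> int:
    state["version"] += 1
    return state["version"]

versions = poll(fetch_version, interval_seconds=0.01, attempts=3)
print(versions)
```

A shorter interval keeps the data fresher at the cost of more requests.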

25
Q

What is Streaming?

A

In networking, streaming usually refers to the act of continuously getting a feed of information from a server by keeping an open connection between two machines or processes.

26
Q

What guarantees do Pub/Sub systems often come with?

A
  • At-least-once delivery
  • Persistent storage
  • Ordering of messages
27
Q

What is an idempotent operation?

A

An operation that has the same ultimate outcome regardless of how many times it’s performed.
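
A small sketch of the difference, using hypothetical `set_status` and `add_item` helpers on an in-memory order:

```python
def set_status(order: dict, status: str) -> dict:
    """Idempotent: repeating the call leaves the order in the same state."""
    order["status"] = status
    return order

def add_item(order: dict, item: str) -> dict:
    """Not idempotent: every call appends another copy of the item."""
    order["items"].append(item)
    return order

order = {"status": "new", "items": []}
for _ in range(3):  # simulate the same message being delivered three times
    set_status(order, "paid")
    add_item(order, "book")

print(order)
```

This matters with at-least-once delivery, where a consumer may see the same message more than once.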

28
Q

What is HTTPS?

A

Hypertext Transfer Protocol Secure is an extension of HTTP that is used for secure communication online.
It requires servers to have trusted certificates (usually SSL certificates) and uses Transport Layer Security (TLS), a security protocol built on top of TCP, to encrypt data communicated between a client and a server.

29
Q

What is TLS?

A

Transport Layer Security is a security protocol over which HTTP runs in order to achieve secure communication online. HTTP over TLS is HTTPS.

30
Q

What is an SSL certificate?

A

A digital certificate granted to a server by a certificate authority. It contains the server’s public key, to be used as part of the TLS handshake.

31
Q

What is a certificate authority?

A

A trusted entity that signs digital certificates, namely the SSL certificates that HTTPS connections rely on.

32
Q

How does a TLS handshake work?

A
  • The client sends a client hello (a string of random bytes) to the server.
  • The server responds with a server hello (another string of random bytes) as well as its SSL certificate, which contains its public key.
  • The client verifies that the certificate was issued by a certificate authority and sends a premaster secret (yet another string of random bytes), this time encrypted with the server’s public key.
  • The client and server use the client hello, server hello, and premaster secret to generate the same symmetric session keys, which encrypt and decrypt all data communicated during the remainder of the connection.
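
The last step above can be illustrated with a toy key derivation. This is not the real TLS key schedule: `derive_session_key` and its single HMAC-SHA256 call are assumptions meant only to show that both sides compute the same key from the same three values. The public-key encryption of the premaster secret is omitted.

```python
import hashlib
import hmac
import secrets

def derive_session_key(client_random: bytes, server_random: bytes,
                       premaster_secret: bytes) -> bytes:
    """Toy derivation (not the TLS PRF): mix the three values into one key."""
    return hmac.new(premaster_secret, client_random + server_random,
                    hashlib.sha256).digest()

client_random = secrets.token_bytes(32)     # sent in the client hello
server_random = secrets.token_bytes(32)     # sent in the server hello
premaster_secret = secrets.token_bytes(48)  # sent encrypted in real TLS

# Each side runs the same derivation on the values it has seen.
client_key = derive_session_key(client_random, server_random, premaster_secret)
server_key = derive_session_key(client_random, server_random, premaster_secret)
print(client_key == server_key)
```

An eavesdropper sees both random strings in the clear but cannot derive the key without the premaster secret.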
33
Q

What is MapReduce?

A

A popular framework for processing very large datasets in a distributed setting in an efficient, fast, and fault-tolerant manner. A MapReduce job is composed of 3 main steps:

  1. the Map step, which runs a map function on the various chunks of the dataset and transforms these chunks into intermediate key-value pairs
  2. the Shuffle step, which reorganizes the intermediate key-value pairs such that pairs with the same key are routed to the same machine in the final step
  3. the Reduce step, which runs a reduce function on the newly shuffled key-value pairs and transforms them into more meaningful data
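
The three steps can be sketched on a single machine with a word count, a common teaching example. The function names and sample chunks are assumptions; a real framework would run the map and reduce functions on many machines.

```python
from collections import defaultdict

def map_step(chunk: str) -> list:
    """Map: turn one chunk of text into intermediate (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle_step(pairs: list) -> dict:
    """Shuffle: group values by key so each key is handled in one place."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_step(grouped: dict) -> dict:
    """Reduce: collapse each key's values into a final count."""
    return {key: sum(values) for key, values in grouped.items()}

chunks = ["the quick fox", "the lazy dog", "the fox"]
pairs = [pair for chunk in chunks for pair in map_step(chunk)]
counts = reduce_step(shuffle_step(pairs))
print(counts)
```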
34
Q

What is a Distributed File System?

A

An abstraction over (usually large) clusters of machines that allows them to act like one large file system. The two most popular implementations of a DFS are the Google File System (GFS) and the Hadoop Distributed File System (HDFS).

35
Q

What can consistent hashing be used for?

A

It can be used in a load balancer to distribute load. For example, the load balancer passes the request ID through a hash function to determine which server the request goes to, so identical keys always land on the same server. This cuts both ways: it is good for keeping user sessions in a server-local cache, but it can concentrate a lot of load on a few servers. Also, with naive hashing (hash modulo the number of servers), adding a server changes the mapping for most keys, so most of the data cached on those servers is effectively lost. With consistent hashing, far fewer keys move, because each server owns ranges of the hash space and a new server only takes over a portion of those ranges.
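
A minimal hash ring sketch shows the difference when a fourth server joins. The `HashRing` class, the 100 virtual nodes per server, and the key names are assumptions made for illustration.

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, servers, replicas: int = 100):
        self.replicas = replicas  # virtual nodes per server, for smoother spread
        self.ring = []            # sorted list of (position, server)
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        for i in range(self.replicas):
            bisect.insort(self.ring, (_hash(f"{server}#{i}"), server))

    def lookup(self, key: str) -> str:
        # First virtual node at or after the key's position, wrapping around.
        index = bisect.bisect_left(self.ring, (_hash(key), ""))
        return self.ring[index % len(self.ring)][1]

keys = [f"request-{i}" for i in range(1000)]
ring = HashRing(["server-a", "server-b", "server-c"])
before = {key: ring.lookup(key) for key in keys}
ring.add("server-d")
after = {key: ring.lookup(key) for key in keys}

moved = sum(before[k] != after[k] for k in keys)
naive_moved = sum(_hash(k) % 3 != _hash(k) % 4 for k in keys)
print(moved, naive_moved)
```

Every key that moves on the ring moves to the new server, while modulo hashing reshuffles keys among all servers.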

36
Q

What is CDN used for?

A

A content distribution network—also known as a content delivery network—is a large, geographically distributed network of specialized servers that accelerate the delivery of web content and rich media to internet-connected devices.

The primary technique that a content distribution network (CDN) uses to speed the delivery of web content to end users is edge caching, which entails storing replicas of static text, image, audio, and video content in multiple servers around the “edges” of the internet, so that user requests can be served by a nearby edge server rather than by a far-off origin server.

37
Q

What are the different types of load balancers?

A
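  • Round robin: requests are distributed equally across servers in rotation.
  • IP hash: the client’s IP address is hashed to determine which server receives the request.
  • Least connections: the request goes to the server with the fewest active connections.
  • Least response time: the request goes to the server with the lowest response time.
  • Least bandwidth: the request goes to the server currently serving the least traffic.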
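
Three of these strategies can be sketched as small selection functions. The server names, the connection counts, and the function names are assumptions for illustration.

```python
import hashlib
import itertools

servers = ["server-a", "server-b", "server-c"]
_rotation = itertools.cycle(servers)

def pick_round_robin() -> str:
    """Round robin: hand out servers in a fixed rotation."""
    return next(_rotation)

def pick_ip_hash(client_ip: str) -> str:
    """IP hash: the same client IP always maps to the same server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

def pick_least_connections(active: dict) -> str:
    """Least connections: choose the server with the fewest open connections."""
    return min(active, key=active.get)

picks = [pick_round_robin() for _ in range(4)]
print(picks)
print(pick_ip_hash("203.0.113.7"))
print(pick_least_connections({"server-a": 5, "server-b": 2, "server-c": 7}))
```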
38
Q

Can an API gateway act as a load balancer? What are the differences?

A

Yes.
An API gateway can replace what a load balancer would usually provide, with a simpler interface, but it doesn’t come cheap.
Nevertheless, API Gateway offers many features that an Application Load Balancer (ALB) lacks. For example, it handles authentication and authorization, API token issuance and management, and can even generate SDKs based on the API structure. It also integrates with the IAM (Identity and Access Management) service, simplifying access control for the underlying resources.

39
Q

What are the benefits of NoSQL?

A
  • Flexible data models
  • Horizontal scaling
  • Fast queries
  • Easy for developers
40
Q

When to use SQL instead of NoSQL?

A
  • you are working with complex queries and reports
  • you have a high-transaction application
  • you need ACID compliance
  • you don’t anticipate a lot of changes or growth
41
Q

When to use NoSQL instead of SQL?

A
  • you are not concerned about strict consistency, and 100% data integrity is not your top goal
  • you have a lot of data, many different data types, and your data will only grow over time
  • your data needs to scale up and down; NoSQL provides much greater flexibility