Distributed Systems Flashcards

1
Q

What is a distributed system?

A

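A collection of independent components, typically running on separate networked machines, that coordinate to appear to their users as one cohesive unit. A well-designed distributed system aims to be fault tolerant and horizontally scalable.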

2
Q

What are the main advantages of distributed systems?

A
  1. Horizontal scalability, 2. High efficiency for given infra costs, 3. High availability
3
Q

What are the main disadvantages of distributed systems?

A
  1. Increased complexity, 2. Requires expertise from multiple domains, 3. Data duplication, 4. Difficult data migrations, 5. Increased networking costs, 6. More difficult to secure, 7. Challenging deployments and troubleshooting
4
Q

What is reliability in distributed systems?

A

The ability of a system to perform its required functions under stated conditions for a specific period of time. It’s a measure of continuity of correct service.

5
Q

What is availability in distributed systems?

A

The proportion of time for which a system can perform its function as seen from a client’s perspective. It is usually expressed as a percentage of total time, for example 99.9% uptime.
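As a rough illustration of the percentage above, the sketch below computes availability from observed uptime and downtime, and the downtime budget implied by a target. The function names are hypothetical, and a 365-day year is assumed.

```python
def availability_percent(uptime_seconds: float, downtime_seconds: float) -> float:
    """Availability as a percentage of the total observed time."""
    total = uptime_seconds + downtime_seconds
    if total <= 0:
        raise ValueError("observation window must be positive")
    return 100.0 * uptime_seconds / total


def allowed_downtime_minutes_per_year(availability_pct: float) -> float:
    """Downtime budget implied by an availability target (365-day year assumed)."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1 - availability_pct / 100.0)
```

Under these assumptions, a 99.9% target leaves a budget of roughly 525.6 minutes (under nine hours) of downtime per year.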

6
Q

What is scalability in distributed systems?

A

The ability of a system to meet increased load by adding a proportional amount of resources, without negatively impacting performance.

7
Q

What is fault tolerance in distributed systems?

A

The ability of a system to keep operating correctly when a component fails, typically by detecting the fault and switching to a redundant copy of the component with negligible downtime.

8
Q

What is consistency in distributed systems?

A

The guarantee that every client sees the same, up-to-date data, as if the system held a single copy, irrespective of how widely that data is distributed or replicated.

9
Q

What does the CAP theorem state?

A

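Of consistency, availability, and partition tolerance, a distributed system can guarantee at most two at once. In practice, when the network is partitioned, the system must choose between staying consistent and staying available.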

10
Q

What is the PACELC theorem?

A

In case of network partitioning (P), choose between availability (A) and consistency (C); Else (E), choose between latency (L) and consistency (C).

11
Q

What are the ACID properties?

A

Atomicity, Consistency, Isolation, Durability

12
Q

What is atomicity in ACID properties?

A

A transaction must be treated as an atomic unit; either all of its operations are executed or none.
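The all-or-nothing behaviour can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the `CHECK` constraint, and the `transfer` helper are assumptions made up for this illustration; the point is only that a failure partway through leaves no partial update behind.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Credit dst and debit src in one transaction; roll back both on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint when src lacks the funds.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

If the debit fails, the credit that already ran is rolled back as well, so the balances are unchanged.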

13
Q

What is consistency in ACID properties?

A

Every transaction must take the database from one valid state to another, preserving all defined constraints and invariants.

14
Q

What is isolation in ACID properties?

A

Each transaction executes as if it were the only transaction in the system, so concurrent transactions do not interfere with one another.

15
Q

What is durability in ACID properties?

A

Once a transaction has committed, its changes must persist even if the system fails or restarts.

16
Q

What is a dirty read in concurrency control?

A

When one activity reads a change made by another activity that has not yet been committed and may later be rolled back.

17
Q

What is a non-repeatable read in concurrency control?

A

When one activity reads data and another activity modifies or deletes that data before the first activity is done, so that re-reading it returns a different result.

18
Q

What is a phantom read in concurrency control?

A

When one activity retrieves a set of data, and another activity inserts new data that would have met the first activity’s search criteria.

19
Q

What is pessimistic locking?

A

An approach where an entity is locked in the database for the entire time that it is in application memory.
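The idea can be sketched with an in-process analogue: a record guarded by an exclusive lock that is held for the entire read-modify-write. This is only an illustration using Python threads; in a real database the lock would be taken on the row itself, and many relational databases expose this through `SELECT ... FOR UPDATE`. The `LockedAccount` class and `run_deposits` helper are hypothetical.

```python
import threading


class LockedAccount:
    """Hypothetical in-memory record guarded by an exclusive lock."""

    def __init__(self, balance: int = 0):
        self._lock = threading.Lock()
        self.balance = balance

    def deposit(self, amount: int) -> None:
        # The lock is held for the whole read-modify-write, so no other
        # thread can touch the record while this one is working on it.
        with self._lock:
            current = self.balance
            self.balance = current + amount


def run_deposits(workers: int = 8, per_worker: int = 1000) -> int:
    """Run concurrent deposits of 1 and return the final balance."""
    account = LockedAccount()

    def work() -> None:
        for _ in range(per_worker):
            account.deposit(1)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance
```

Because collisions are prevented up front, no update is lost, at the cost of making every other writer wait while the lock is held.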

20
Q

What is optimistic locking?

A

An approach that detects collisions when they occur and then resolves them, rather than trying to prevent them.
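One common way to detect such collisions is a version number that is checked at write time. The sketch below is a minimal in-memory illustration, not any particular database's API; `VersionedStore`, `Row`, and `StaleVersionError` are hypothetical names.

```python
from dataclasses import dataclass


class StaleVersionError(Exception):
    """Raised when a write is based on an outdated read."""


@dataclass
class Row:
    value: str
    version: int = 0


class VersionedStore:
    """Hypothetical key-value store with per-row version checks."""

    def __init__(self):
        self._rows = {}

    def read(self, key: str):
        row = self._rows.setdefault(key, Row(value=""))
        return row.value, row.version

    def write(self, key: str, new_value: str, expected_version: int) -> None:
        row = self._rows.setdefault(key, Row(value=""))
        # Collision detection: someone else wrote since our read.
        if row.version != expected_version:
            raise StaleVersionError(
                f"expected version {expected_version}, found {row.version}"
            )
        row.value = new_value
        row.version += 1
```

A caller that receives `StaleVersionError` resolves the collision by re-reading the row and retrying, merging, or giving up, depending on the application.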

21
Q

What are the two main types of database storage engines?

A
  1. B-Tree Based Engine, 2. Log Structured Merge (LSM) Tree Based Engine
22
Q

What are the main data replication strategies?

A
  1. Log-Based Data Replication, 2. Full Table Data Replication, 3. Key-Based Incremental Data Replication
23
Q

What is algorithmic sharding?

A

Computing a hash of the key of a record and taking that hash modulo n, where n is the number of nodes, to decide which node stores the record.
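A minimal sketch of the hash-then-modulo step, assuming string keys and SHA-256 as the hash (any stable hash would do; Python's built-in `hash` is avoided because it is randomized per process for strings). The helper names are hypothetical.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Return the node index for a key: hash(key) mod n."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose node changes when n goes from old_n to new_n."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)
```

The second helper illustrates the main drawback of this scheme: changing n remaps most keys. Going from 4 to 5 nodes, for instance, a key keeps its node only when its hash gives the same remainder under both moduli, which for a uniform hash happens about one time in five.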

24
Q

What is consistent hash sharding?

A

Uses consistent hashing to distribute data across nodes, with many more shards (virtual nodes) than physical nodes, so that adding or removing a node relocates only a small fraction of the keys.
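The sketch below shows one way to build such a ring, assuming SHA-256 for ring positions and 100 virtual nodes per physical node; both choices, and the `HashRing` class itself, are illustrative assumptions rather than a reference implementation.

```python
import bisect
import hashlib


def _position(value: str) -> int:
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes: int = 100):
        self._vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owners = []     # owners[i] is the node that owns positions[i]
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node is placed on the ring many times.
        for i in range(self._vnodes):
            pos = _position(f"{node}#{i}")
            idx = bisect.bisect(self._positions, pos)
            self._positions.insert(idx, pos)
            self._owners.insert(idx, node)

    def node_for(self, key: str) -> str:
        if not self._positions:
            raise ValueError("ring is empty")
        # First virtual node clockwise from the key, wrapping around.
        idx = bisect.bisect(self._positions, _position(key)) % len(self._positions)
        return self._owners[idx]
```

When a node is added, the only keys that change owner are the ones the new node takes over; every other key stays where it was.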

25
Q

What are the fallacies of distributed computing?

A
  1. The network is reliable, 2. Latency is zero, 3. Bandwidth is infinite, 4. The network is secure, 5. Topology doesn’t change, 6. There is one administrator, 7. Transport cost is zero, 8. The network is homogeneous
26
Q

Name some common concerns in microservices architecture.

A

Configuration Management, Service Discovery, Load Balancing, API Gateway, Security, Centralized Logging and Metrics, Distributed Tracing, Resilience and Fault Tolerance, Autoscaling and Self-Healing

27
Q

What are some commonly used building blocks in a distributed system?

A

Load Balancer, Microservices, Caching Layers, Unique ID Generation Service, Scalable Databases, Schema Registry, Authentication Service, Service Discovery, API Gateway, Message Queues, Reverse Proxies, CDNs, Object Stores, Batch Jobs, Logging and Monitoring Dashboards

28
Q

What are some key design rules for building distributed systems?

A
  1. Design for extra scale, 2. Avoid bleeding edge tech, 3. Optimize for important features, 4. Use generally available components, 5. Use caches extensively, 6. Use queues for transient data, 7. Avoid transactions, 8. Minimize IO, 9. Aim for idempotence, 10. Know your scale numbers
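Rule 9 (idempotence) is often implemented with a client-supplied idempotency key, so that a retried request has the same effect as the original. The sketch below is a minimal in-memory illustration; `PaymentService` and its fields are hypothetical.

```python
class PaymentService:
    """Hypothetical service that deduplicates requests by idempotency key."""

    def __init__(self):
        self._results = {}      # idempotency key -> stored receipt
        self.charges_made = 0   # number of charges actually performed

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated key returns the stored receipt without charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges_made += 1
        receipt = {"charge_id": self.charges_made, "amount_cents": amount_cents}
        self._results[idempotency_key] = receipt
        return receipt
```

With this in place, a client that times out can safely retry with the same key: the second call returns the first receipt and no second charge is made.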