L7 - Consensus Flashcards
What does consensus mean in a distributed system?
In the traditional formulation, several nodes come to an agreement on a single value.
What is total order broadcast?
A broadcast protocol in which all nodes deliver the same messages in the same order; it is the basis for state machine replication.
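A minimal sketch of why a shared delivery order matters for state machine replication. The `Replica` class and the `("set", key, value)` message shape are assumptions made for illustration only.

```python
class Replica:
    """Toy deterministic state machine: a key-value map."""

    def __init__(self):
        self.state = {}

    def deliver(self, message):
        # Each message is a hypothetical ("set", key, value) tuple.
        op, key, value = message
        if op == "set":
            self.state[key] = value


# Total order broadcast delivers the same messages in the same order everywhere.
ordered_messages = [("set", "x", 1), ("set", "x", 2), ("set", "y", 3)]

a, b = Replica(), Replica()
for m in ordered_messages:
    a.deliver(m)
    b.deliver(m)

# A replica that saw the two writes to x in the opposite order would diverge.
c = Replica()
for m in [ordered_messages[1], ordered_messages[0], ordered_messages[2]]:
    c.deliver(m)
```

Replicas `a` and `b` end in the same state; `c` ends with a different value for `x`.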
Why is manual leader failover problematic?
It requires human intervention, which is slow and inefficient during unplanned outages.
What are the four key properties of a consensus algorithm?
Uniform Agreement – No two nodes decide differently.
Integrity – No node decides twice.
Validity – A decided value must be proposed by some node.
Termination – Every non-crashed node eventually decides on a value.
Name some common consensus algorithms.
Paxos (Lamport, 1998) – Single-value consensus.
Multi-Paxos – Generalization for total order broadcast.
Raft (Ongaro & Ousterhout, 2014) – FIFO-total order broadcast.
Zab & Viewstamped Replication – Other total order broadcast algorithms.
Why can’t a deterministic consensus algorithm guarantee termination in an asynchronous system?
The FLP result (Fischer, Lynch, Paterson) shows that no deterministic algorithm can guarantee termination in a fully asynchronous crash-stop system, even if only one node may crash.
How do Raft and Multi-Paxos handle leader election?
Use a failure detector (timeouts) to suspect crashes.
Elect a new leader when needed.
Prevent multiple leaders via term-based voting.
What is the “split-brain” problem in leader election?
A situation where multiple nodes incorrectly believe they are the leader.
What are the three states of nodes in Raft?
Follower – Passive, expects heartbeats.
Candidate – Actively requests votes to become leader.
Leader – Sends AppendEntries RPCs and maintains log replication.
What happens if a leader’s term is outdated?
The leader steps down, updates its term, and becomes a follower.
How does Raft ensure log consistency?
Each log entry includes an index, term, and command.
AppendEntries RPCs include the previous log index/term to detect mismatches.
Followers reject entries that don’t match their logs.
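The follower-side check can be sketched as follows. This is a simplified illustration, not a full Raft implementation: the `Entry` type, the `append_entries` function name, and the 1-based indexing convention are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    term: int
    command: str


def append_entries(log, prev_index, prev_term, entries):
    """Follower-side consistency check.

    Indices are 1-based; prev_index == 0 means "append at the start of the log".
    Returns False if the follower's log does not contain the previous entry.
    """
    if prev_index > 0:
        if len(log) < prev_index or log[prev_index - 1].term != prev_term:
            return False  # mismatch: reject so the leader can retry earlier
    for offset, entry in enumerate(entries):
        pos = prev_index + offset  # 0-based position of this entry
        if pos < len(log) and log[pos].term != entry.term:
            del log[pos:]  # conflict: drop this entry and everything after it
        if pos >= len(log):
            log.append(entry)
    return True
```

For example, a follower holding terms `[1, 1, 2]` accepts a request with previous index 2 and previous term 1, and replaces its conflicting third entry with the leader's.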
How does Raft prevent conflicting leaders?
Each node votes only once per term.
A majority vote is required for election.
If no candidate wins, a new election starts with an incremented term.
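A rough sketch of the one-vote-per-term rule and the majority check. The `Voter` class and `wins_election` helper are hypothetical names, and the sketch deliberately omits the log up-to-date check that Raft also applies before granting a vote.

```python
class Voter:
    def __init__(self):
        self.current_term = 0
        self.voted_for = None  # candidate voted for in current_term, if any

    def request_vote(self, term, candidate_id):
        if term < self.current_term:
            return False  # request from an outdated term
        if term > self.current_term:
            # Newer term: adopt it and forget the vote cast in the old term.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False  # already voted for someone else in this term


def wins_election(votes, cluster_size):
    # A strict majority is required, so two candidates cannot both win a term.
    return votes > cluster_size // 2
```

Because each node grants at most one vote per term and a strict majority is required, at most one candidate can win any given term.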
What ensures that logs across nodes remain identical?
A leader’s log is always the authoritative source.
Entries must be replicated on a majority of nodes before commitment.
Leaders never overwrite existing committed entries.
What is the purpose of heartbeats in Raft?
Leaders send heartbeats (empty AppendEntries RPCs) to maintain authority.
Followers start a new election if no heartbeat is received within the timeout.
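The follower's timeout logic might look roughly like this. `FollowerTimer` is a hypothetical helper, and times are passed in explicitly (in milliseconds) so the behaviour is easy to follow.

```python
class FollowerTimer:
    def __init__(self, election_timeout_ms, now_ms):
        self.election_timeout_ms = election_timeout_ms
        self.last_heartbeat_ms = now_ms

    def on_heartbeat(self, now_ms):
        # Any valid AppendEntries RPC (including an empty one) resets the timer.
        self.last_heartbeat_ms = now_ms

    def should_start_election(self, now_ms):
        return now_ms - self.last_heartbeat_ms >= self.election_timeout_ms
```

With a 300 ms timeout, a heartbeat at 250 ms keeps the follower passive at 500 ms, but with no further heartbeat it would start an election at 600 ms.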
What happens when a new leader is elected?
The new leader synchronizes logs by overwriting conflicting entries on followers.
Any uncommitted entries from the previous leader may be discarded.
How does Raft neutralize old leaders?
Every RPC includes a term number.
If a receiver sees an outdated term in an RPC, it rejects the request and replies with its current term, and the sender steps down.
This prevents old leaders from making changes.
How do clients interact with a Raft cluster?
Send requests to the leader.
If the leader is unknown, contact any node (which redirects to the leader).
The leader commits and executes the command before responding.
How does Raft ensure exactly-once semantics?
Clients embed a unique ID in each command, preventing duplicate execution.
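One way to sketch the deduplication idea: the state machine remembers the result for each command ID and returns the cached result on a retry. The `DedupStateMachine` class and its counter-style command are assumptions for illustration.

```python
class DedupStateMachine:
    def __init__(self):
        self.counter = 0
        self.results = {}  # command ID -> result of the first execution

    def apply(self, command_id, amount):
        if command_id in self.results:
            # Retry of an already executed command: do not execute it again.
            return self.results[command_id]
        self.counter += amount
        self.results[command_id] = self.counter
        return self.counter
```

If a client retries command `"c1"` after a lost response, the counter is incremented only once.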
Why can’t a system switch configurations immediately?
It could create conflicting majorities between old and new configurations.
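A small worked example of the problem, using made-up node names. If some nodes still use a three-node configuration while others already use a five-node one, two disjoint groups can each form a majority.

```python
def is_majority(voters, config):
    # Only voters that are members of the configuration count.
    return len(voters & config) > len(config) // 2


old_config = {"A", "B", "C"}
new_config = {"A", "B", "C", "D", "E"}

# Suppose A and B still use the old configuration,
# while C, D, and E have already switched to the new one.
old_side = {"A", "B"}
new_side = {"C", "D", "E"}

old_side_wins = is_majority(old_side, old_config)  # 2 of 3
new_side_wins = is_majority(new_side, new_config)  # 3 of 5
```

Both groups reach a majority under their own configuration without sharing a single node, so they could make conflicting decisions.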
How does Raft handle configuration changes?
Uses a 2-phase joint consensus approach.
Both old and new configurations must approve changes before applying them.
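During the joint phase, agreement can be sketched as requiring a majority in each configuration separately. The function names and node names below are hypothetical.

```python
def majority(acks, config):
    return len(acks & config) > len(config) // 2


def joint_quorum(acks, old_config, new_config):
    # A decision needs a majority of the old AND a majority of the new configuration.
    return majority(acks, old_config) and majority(acks, new_config)


old_config = {"A", "B", "C"}
new_config = {"A", "B", "C", "D", "E"}
```

Under this rule, acknowledgements from `{"C", "D", "E"}` are not enough, because they cover only one node of the old configuration.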
What is ZooKeeper and what functions does it provide?
A distributed key-value store, often described as a configuration and membership service.
It is designed to hold a small amount of data (ideally small enough to fit in memory),
which is replicated across all nodes using a fault-tolerant total order broadcast algorithm.
How does ZooKeeper ensure linearizable atomic operations?
By using a consensus protocol that provides strict ordering.
How does ZooKeeper detect failures?
Clients maintain long-lived sessions with heartbeat messages. If heartbeats stop, the session is terminated.
Any locks held by this client are then released.
What is a fencing token in ZooKeeper?
A monotonically increasing number assigned when acquiring a lock to prevent stale processes from interfering.
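A minimal sketch of how a protected resource might use the token. `FencedStorage` is a hypothetical resource and not part of any ZooKeeper API; it simply rejects writes that carry a token lower than one it has already seen.

```python
class FencedStorage:
    def __init__(self):
        self.highest_token = 0
        self.data = None

    def write(self, token, value):
        if token < self.highest_token:
            return False  # stale lock holder: reject the write
        self.highest_token = token
        self.data = value
        return True
```

A process that paused while holding token 33 cannot overwrite data written later by the holder of token 34.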
What is the nextIndex in Raft?
It is the index of the next log entry that the leader will send to a follower.
How does Raft repair log inconsistencies?
If a follower rejects an AppendEntries RPC due to a log mismatch, the leader decrements nextIndex and retries until consistency is restored.
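The retry loop can be sketched as below. Logs are reduced to lists of term numbers for brevity, indices are 1-based, and both function names are assumptions.

```python
def find_next_index(leader_terms, follower_terms):
    """Walk nextIndex backwards until the entry just before it matches."""
    next_index = len(leader_terms) + 1  # optimistic starting point
    while next_index > 1:
        prev_index = next_index - 1
        if (prev_index <= len(follower_terms)
                and follower_terms[prev_index - 1] == leader_terms[prev_index - 1]):
            break  # follower would accept AppendEntries at this point
        next_index -= 1  # rejected: back up one entry and retry
    return next_index


def repair(leader_terms, follower_terms):
    """Follower log after the leader resends everything from nextIndex onward."""
    next_index = find_next_index(leader_terms, follower_terms)
    return follower_terms[:next_index - 1] + leader_terms[next_index - 1:]
```

For a leader with terms `[1, 1, 2, 3]` and a follower with `[1, 1, 2, 2, 2]`, the logs first agree at index 3, so the follower's entries from index 4 onward are replaced.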
Why does Raft track nextIndex separately for each follower?
Because different followers may have different log inconsistencies that need to be resolved.
When can a leader commit an entry in Raft?
Only when the entry is stored on a majority of nodes and at least one entry from the leader’s current term is also stored on a majority.
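One common way to express this check, following the Raft paper's formulation: the leader advances its commit index to an entry only if that entry is on a majority and belongs to the current term, and earlier entries then count as committed along with it. The function and parameter names are assumptions.

```python
def can_commit(index, log_terms, match_index, current_term, cluster_size):
    """index is 1-based; match_index maps follower -> highest replicated index."""
    # The leader always stores its own entries, so it counts as one replica.
    replicas = 1 + sum(1 for m in match_index.values() if m >= index)
    on_majority = replicas > cluster_size // 2
    from_current_term = log_terms[index - 1] == current_term
    return on_majority and from_current_term
```

In a five-node cluster with leader log terms `[1, 2, 4]` and current term 4, an entry from term 2 that sits on three nodes is still not committed directly; it becomes committed once the term-4 entry at index 3 reaches a majority.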
Why must a leader commit an entry from its own term before committing older entries?
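Because an entry from an older term can be stored on a majority of nodes and still be overwritten by a later leader. Once an entry from the leader’s current term is committed, all preceding entries are committed along with it.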
What happens if a leader crashes before an entry is fully committed?
The next leader may overwrite uncommitted entries to maintain consistency.
Why do consensus algorithms perform worse than asynchronous replication?
Because every write must be acknowledged by a majority of nodes before it is committed, which adds at least one network round trip to each write.
What is the minimum number of nodes required for consensus to tolerate 1 and 2 failures?
3 nodes to tolerate 1 failure.
5 nodes to tolerate 2 failures.
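The arithmetic behind these numbers, as a short sketch with hypothetical helper names: tolerating f failures requires 2f + 1 nodes, so that a majority survives.

```python
def nodes_required(failures):
    return 2 * failures + 1


def quorum_size(nodes):
    return nodes // 2 + 1


def tolerated_failures(nodes):
    return (nodes - 1) // 2
```

Note that a fourth node does not help: a four-node cluster needs a quorum of three and still tolerates only one failure.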
Why does Raft struggle with highly variable network delays?
Because leader election relies on timeouts: if delays fluctuate, a healthy leader can be wrongly suspected of having crashed, which triggers unnecessary elections and causes instability.
How does ZooKeeper detect node failures?
By maintaining long-lived client sessions where heartbeats are exchanged periodically.
What happens if a client stops sending heartbeats to ZooKeeper?
The session is declared dead, and any locks or leases held by the client are automatically released.
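A rough model of session expiry, written as a toy server-side tracker. `SessionTracker` and its methods are hypothetical and do not correspond to ZooKeeper's client API; in ZooKeeper itself the equivalent effect comes from ephemeral znodes being deleted when a session ends.

```python
class SessionTracker:
    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.last_seen = {}  # client -> time of last heartbeat
        self.locks = {}      # lock name -> client holding it

    def heartbeat(self, client, now):
        self.last_seen[client] = now

    def acquire(self, lock, client):
        if lock in self.locks:
            return False
        self.locks[lock] = client
        return True

    def expire_sessions(self, now):
        dead = {c for c, t in self.last_seen.items()
                if now - t > self.session_timeout}
        for c in dead:
            del self.last_seen[c]
        # Release every lock held by a client whose session expired.
        self.locks = {l: c for l, c in self.locks.items() if c not in dead}
        return dead
```

Once a client's session expires, another client can acquire the lock it was holding.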
How does ZooKeeper ensure time-ordered operations?
By assigning a monotonically increasing transaction ID to each operation, preventing race conditions.
Why do old leaders need to be neutralized in Raft?
Because after a network partition, an old leader may reconnect and try to append outdated log entries.
How does Raft detect and reject outdated leaders?
Every RPC message includes a term number:
1. If the sender’s term is older than the receiver’s, the RPC is rejected.
2. If the receiver’s term is older than the sender’s, the receiver updates its term and, if it was a leader or candidate, steps down to follower.
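The term comparison alone can be sketched as a small pure function. The name `on_rpc` and the tuple it returns are assumptions, and the sketch covers only the term check, not the rest of RPC handling.

```python
def on_rpc(receiver_term, receiver_role, rpc_term):
    """Return (new_term, new_role, passes_term_check) for an incoming RPC."""
    if rpc_term < receiver_term:
        # Sender is behind: reject, keeping the receiver's term and role.
        return receiver_term, receiver_role, False
    if rpc_term > receiver_term:
        # Receiver is behind: adopt the newer term and become a follower.
        return rpc_term, "follower", True
    return receiver_term, receiver_role, True
```

An old leader at term 3 that receives any RPC with term 5 ends up as a follower at term 5.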
How does Raft prevent outdated leaders from committing new log entries?
The election process updates the terms of a majority of nodes, ensuring that a deposed leader cannot gain quorum.
How does a Raft leader handle client requests?
The leader appends the command to its log, replicates it to followers, and applies it once a majority acknowledges it.
Which two RPCs does Raft use?
RequestVote RPC – Used by candidates to request votes during leader elections.
AppendEntries RPC – Used by the leader to replicate log entries and send heartbeats.
How does Raft prevent log inconsistencies after a leader crash?
The new leader overwrites conflicting entries in followers’ logs using nextIndex tracking and AppendEntries RPC retries.