Chapter 8: Paxos Flashcards
1
Q
Broader context
A
- Durably persists association of tables, tablets, storage units is compromised of a highly available & reliable service
- ZooKeeper (Chubby)
- Usually five distributed nodes
2
Q
Consensus problems
A
- Desire all processes to agree on a value after one or more processes proposed a value (here, value is update to mapping)
- Also known as problems of agreement
- Challenges
- Reach consensus, even in the presence of failures
- Tolerate crash failures
3
Q
Consensus problem examples
A
- Two armies decide consistently to attack or to retreat critical sector
- Mutual exclusion: Processes agree on who can enter CS
- Leader election: Processes agree on who is elected
- Totally ordered multicast: Processes agree on the order of messages delivered
- ATM and bank’s servers agree what should happen to bank account balance when withdrawing money
- Flight computers decide to “abort” (reboot) or “proceed”
- Transaction managers agree to commit or abort
- Reactor safety systems agree on position of control rods
4
Q
Paxos
A
- Family of protocols for solving consensus among unreliable processes
- Published in 1989 (initially), 1998 (ACM TOCS)
- Fictional legislative consensus system for the island of Paxos
- Consensus: Agreeing on one result among group of participants
- Participants and communication among participants may fail
5
Q
Paxos family
A
- Basic Paxos, Multi-Paxos, Cheap Paxos, Fast Paxos, Byzantine Paxos etc.
- Protocols with spectrum of trade-offs between
- Number of processes
- Number of message delays before learning the agreed value
- Activity level of individual participants
- Number of messages sent
- Types of failures tolerated
6
Q
The FLP result
A
- Fischer, Lynch and Paterson, 1985
- In asynchronous systems, with only one process crashing, there is no guarantee to reach consensus (No also that guumtas consensus )
- Does not mean that consensus can never be reached
- Just under the model’s assumptions, no algorithm can always reach consensus in bounded time
7
Q
Paxos in light of FLP result
A
- No deterministic fault-tolerant consensus protocol can guarantee progress in an asynchronous system
- Paxos guarantees safety, but not progress
- “Conditions that could prevent progress are difficult to provoke.”
- Paxos attempts to make progress even during periods when some bounded number of replicas are unresponsive
8
Q
Use of Paxos
A
- Used in one form or another inside
- Google’s Chubby
- Yahoo’s (Apache) ZooKeeper
- Used where durability is required
- Replicate state
- Replicate files, etc.
- Agree on value (command to execute)
9
Q
Assumptions
A
-
Process
- Operate at arbitrary speed
- May experience failures
- Processes with stable storage may re-join the protocol after failures (i.e., following a crash- recovery failure model)
- Do not collude, lie, or otherwise attempt to subvert the protocol (i.e., no Byzantine failures, cf. Byzantine Paxos)
-
Network
- Processes can send messages to any other process
- Messages are sent asynchronously (i.e., may take arbitrarily long)
- Messages may be lost, reordered, or duplicated
- Messages are delivered without corruption (i.e., no Byzantine failures, cf. Byzantine Paxos)
-
Number
- In general, Paxos can make progress using 2F+1 processes
- Despite the simultaneous failure of any F processes (e.g., 5 processes, resilient against 2 failing) of processes
10
Q
Roles
A
- Express protocol in terms of roles
- Single process may play one or more roles at the same time
- No effect on protocol correctness
- Roles are commonly coalesced to improve latency, number of messages exchanged, etc.
- Roles:
- Client (Application)
- Acceptor (Voters)
- Proposer
- Learner
- Leader (one of the Proposer)
11
Q
Client
A
- A client is typically a system or a system component that requires reliable, system-wide storage of information
- Issues request/response to Paxos-based (distributed) system, e.g., a write request, the value V to be stored
- Examples:
- Metadata (e.g., configuration of BigTable deployment in Chubby)
- Locking information in distributed ME
- Commands issued to replicated server
12
Q
Proposal number & agreed value
A
- Attempt to define an agreed value V via proposals
- Proposals may or may not be accepted by Acceptors
- Proposals are uniquely numbered for a given Proposer
- Value V represents the information that is to be agreed on (replicated and persisted)
13
Q
Proposer
A
- Advocates a client request, attempting to convince the Acceptors to agree on it
- Acts as a coordinator to move the protocol forward when conflicts occur
- Messages sent by Proposer (more details later):
- prepare (P# (N))
- acceptReq (P# (N), Value (V))
- where P# refers to the Proposal Number (cf. below)
14
Q
Acceptor (a.k.a. Voters)
A
- Act as the fault-tolerant“memory” of protocol
- Are collected into groups called quorums (essentially, a majority)
- Any message sent to Acceptor must be sent to a quorum of Acceptors
- Any message received from an Acceptor is ignored unless a copy is received from each Acceptor in a quorum en
- Messages sent by Acceptor (more details later):
- promise (P#, old P# (N’), old Value (V’))
- accepted (P#, V)
- where P# refers to the Proposal Number (cf. below)
15
Q
Learner
A
- Act as replication factor for the protocol
- Once a Client request has been agreed on by Acceptors, Learner may take action (i.e., execute request and send response)
- To improve availability, additional Learners may be added
- Messages sent by Learner:
- clientRes (Value (V))