Chapter 5: Coordination & Agreement Flashcards
Synchronous distributed system model
- Each message transmitted over a channel is received within a known bounded time
- Time to execute each step of a process has known lower and upper bounds
- Each process has a local clock whose drift rate from real time has a known bound
Asynchronous distributed system model
- Messages may need an arbitrary time to be transmitted
- Each step of a process may take an arbitrary time to complete
- Clock drift rates are arbitrary
Thought experiment - Some takeaways
- Asynchronous model is representative of the Internet and many practical distributed applications
- Solutions valid for asynchronous distributed systems are also valid for synchronous ones
- Many design problems cannot be solved in an asynchronous world (e.g., when vs. who leads the attack)
- Apply timeouts and timing constraints to reduce uncertainty and to bring the synchronous model into the picture
Failure models
- Process and communication channel may fail, i.e., depart from correct and expected behaviour
- Failure model offers understanding for effects of failures
- Failure types
- Omission failures (one or more responses fail)
- Arbitrary failures (wrong value)
- Timing failures (outside interval)
Process omission failures
- Fails by halting, i.e., crashes (e.g. bluescreen)
- Fail‐stop failure
- Process halts and remains halted
- Others can detect this state
- Crash failure
- Process halts and remains halted
- Others can (only) suspect this state
Communication omission failure
- Receiving process does not receive message
- Typically because of
- Buffer overflow
- Network partition
- Failure types
- Send-omission
- Receive-omission
- Channel-omission
- Process vs. communication failure: Generally, impossible to distinguish
In our thought experiment
- Could Army i detect whether Army j has been defeated? (i.e., process failed?)
- Assuming the enemy could attack
- While undefeated, send periodic messages (e.g., a heartbeat; see the sketch below)
- Assume the enemy cannot corrupt messages
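A minimal Python sketch of heartbeat-based failure suspicion under the assumptions above (messages are not corrupted); the HeartbeatMonitor class and the timeout value are illustrative, and in an asynchronous system a silent peer can only be suspected, not reliably detected, as failed.

```python
import time

SUSPECT_TIMEOUT = 3.0   # seconds of silence after which a peer is suspected (illustrative)

class HeartbeatMonitor:
    """Suspects a peer once no heartbeat has arrived within SUSPECT_TIMEOUT."""

    def __init__(self, peers):
        now = time.monotonic()
        self.last_heard = {p: now for p in peers}

    def on_heartbeat(self, peer):
        # Called whenever a heartbeat message from `peer` is received.
        self.last_heard[peer] = time.monotonic()

    def suspected(self):
        # Suspicion, not proof: the message may merely be delayed
        # (the crash vs. fail-stop distinction).
        now = time.monotonic()
        return [p for p, t in self.last_heard.items()
                if now - t > SUSPECT_TIMEOUT]

monitor = HeartbeatMonitor(["army_j"])
monitor.on_heartbeat("army_j")   # arrives while Army j is undefeated
print(monitor.suspected())       # [] now; ["army_j"] after 3 s of silence
```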
Arbitrary failures
- Encompasses prior failure classes
- Process not just stops or crashes but processes requests incorrectly, corrupts state, produces inconsistent & incorrect output (commission failures)
- Does not follow agreed protocol, messages get corrupted, messages introduced
- Result: System behaves unpredictably
- A.k.a. Byzantine failures; the process produces wrong output
Access to shared variables and RACE condition
- Imagine a globally shared variable counter in a process, accessible to multiple threads:
- counter++
- counter--
- Starting from counter = 5, the final value could be 4, 5, or 6, depending on how the threads interleave
- Race condition:
- Several threads manipulate shared data concurrently. The final value of the data depends upon which thread finishes last.
- To prevent race conditions, concurrent processes must be synchronized
- The statements: counter++; counter‐‐; must each be executed atomically.
- Atomic operation means an operation that completes in its entirety without interruption.
- This is achieved through synchronization primitives
- Shared variable accessed in critical section, protected by synchronization primitives
- Known as the critical section problem or mutual exclusion (a minimal sketch follows below)
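A minimal Python illustration of the race and of its fix with a synchronization primitive; the thread and iteration counts are arbitrary, and the lock stands in for whatever primitive the system provides.

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    # counter += 1 is not atomic: it reads, adds, and writes back,
    # so two threads can interleave and lose updates.
    global counter
    for _ in range(n):
        counter += 1

def safe_increment(n):
    # The lock makes the read-modify-write sequence atomic with
    # respect to every other thread that uses the same lock.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=safe_increment, args=(100_000,))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)   # 200000 with the lock; may be lower with unsafe_increment
```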
Distributed mutual exclusion
- In distributed systems, mutual exclusion is more complex due to lack of:
- Shared memory
- Timing issues
- Lack of a global clock
- Event ordering
- Applications
- Accessing a shared resource in distributed systems
- Communicating over a wireless channel (e.g., IEEE 802.11)
Critical section (CS) problem: No shared memory
- System with n processes
- Processes access shared resources in CS
- Coordinate access to CS via message passing
- Application‐level protocol for accessing the CS (interface sketched below)
- Enter_CS() – enter CS, block if necessary
- ResourceAccess() – access shared resource in CS
- Exit_CS() – leave CS
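A minimal interface sketch of this application-level protocol, written in Python for concreteness; the class and helper names are illustrative, and the concrete algorithms discussed later (centralized, ring-based, Ricart & Agrawala) would supply the actual entry and exit logic.

```python
from abc import ABC, abstractmethod

class MutualExclusion(ABC):
    """Application-level protocol for accessing a critical section (CS)."""

    @abstractmethod
    def enter_cs(self):
        """Enter_CS(): block until entry to the CS is granted."""

    @abstractmethod
    def exit_cs(self):
        """Exit_CS(): leave the CS, possibly letting a waiting process in."""

def resource_access(mutex: MutualExclusion, work):
    # ResourceAccess(): all use of the shared resource happens
    # between enter_cs() and exit_cs().
    mutex.enter_cs()
    try:
        work()
    finally:
        mutex.exit_cs()
```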
Assumptions for CS
- System is asynchronous
- No bound on delays, no bound on clock drift, etc.
- Processes do not fail
- Message delivery is reliable
- Any message sent is eventually delivered intact, exactly once
- Client processes are well‐behaved and spend finite time accessing resources in the CS
Mutual exclusion requirements CS
- Safety
- At most one process in the critical section at a time
- Liveness
- Requests to enter & exit CS eventually succeed
- No deadlock
- Fairness (order & starvation)
- If one request to enter CS happened‐before another one, then entry to CS is granted in that order
- Requests are ordered such that no process enters the critical section twice while another waits to enter (i.e., no starvation)
Deadlock & starvation
- Deadlock: Two or more processes become stuck indefinitely while attempting to enter and exit the CS, by virtue of their mutual dependency
- Starvation: The indefinite postponement of entry to CS for a process that has requested it.
- Can we order entry to the CS by the times processes requested it?

Performance metrics
- Bandwidth: Number of messages sent, received or both
- Synchronization delay: Time between one process exiting the critical section and the next entering
- Client delay: Delay at entry and exit (response time)
- We do not measure client access to resources protected by the critical section (assume finite)
Solution strategies
- Centralized strategy
- Divide processes into master and slaves, master dictates actions of slaves
- Distributed strategy: Each process independently decides actions, based on local knowledge of others’ state
- Token‐based: A node is allowed in the critical section (CS) if it has a token. Tokens are passed from site to site, in some (priority) order.
- Non‐token‐based: A node enters CS when an assertion becomes true. A node communicates with other nodes to obtain their states and decides whether the assertion is true or false.
Centralized strategy
- Elect a leader (details, cf. second part of this lecture)
- Meets requirements: Safety, liveness, no starvation
- Does the solution meet the ordering requirement?
- Advantages: Simple to implement
- Disadvantages
- Single point of failure
- Bottleneck, network congestion, timeouts
- Deadlock potential for multiple resources with separate servers
- Enter_CS() - Two messages: Request & Grant (exchange sketched below)
- Delays the requesting process by a round trip of messages
- Exit_CS() - One message: Release
- Assuming asynchronous message passing, incurs no delay
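A minimal sketch of the coordinator in the centralized scheme, assuming direct method calls stand in for the Request, Grant, and Release messages; a real system would exchange them over the network, and the class and method names are illustrative.

```python
from collections import deque

class LockServer:
    """Central coordinator: grants the CS to one client at a time."""

    def __init__(self):
        self.holder = None
        self.waiting = deque()   # FIFO queue of pending requests

    def request(self, client):
        # Enter_CS(): the client sends Request; the server replies Grant
        # immediately if the CS is free, otherwise it queues the client.
        if self.holder is None:
            self.holder = client
            return "GRANT"
        self.waiting.append(client)
        return None              # grant is deferred until a Release arrives

    def release(self):
        # Exit_CS(): the client sends Release; the server grants the CS
        # to the next queued client, if any.
        if self.waiting:
            self.holder = self.waiting.popleft()
            return ("GRANT", self.holder)
        self.holder = None
        return None

server = LockServer()
print(server.request("P1"))   # 'GRANT': the CS was free
print(server.request("P2"))   # None: queued behind P1
print(server.release())       # ('GRANT', 'P2'): P2 enters next
```

The FIFO queue avoids starvation, but requests are served in arrival order at the server rather than in happened-before order, which is why the ordering requirement is questionable.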

Distributed strategy
- In a distributed algorithm, the same decision must be made on each node, independent of the other nodes in the system.
- Selected algorithms
- Ring‐based algorithm
- Logical clocks‐based algorithm (Lamport, 1978)
- Ricart & Agrawala, 1981
- Maekawa, 1985
- Many more
Ring‐based algorithm
- Logical ring of processes
- Each Pi knows its successor, P(i+1) mod N
- Logical topology a priori unrelated to physical topology
Analysis:
- Safe: A node enters the CS only if it holds the token (see the token‐passing sketch below)
- Live: Since finite work is done by each node (can't re‐enter), the token eventually gets to each node
- Fair: Ordering is based on ring topology
- Performance
- Constantly consumes network bandwidth, even when no process seeks entry (except while a process is inside the CS)
- Synchronization delay: Between 1 and N messages
- Client delay: Between 0 and N messages for entry; 1 message for exit
- Problems
- Lost token
- Duplicated token
- Failed node
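A minimal single-process simulation of the token ring, assuming a function call stands in for forwarding the token to the successor; the class and function names are illustrative.

```python
class RingNode:
    """A node in a logical ring; it may enter the CS only while holding the token."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.wants_cs = False

    def on_token(self):
        # Safety: only the current token holder may enter the CS.
        if self.wants_cs:
            print(f"node {self.node_id} enters the critical section")
            self.wants_cs = False   # finite work, then the token moves on

def circulate_token(nodes, rounds):
    # The token visits P0, P1, ..., P(N-1), P0, ... in ring order,
    # so every requesting node is eventually served (liveness).
    n = len(nodes)
    for step in range(rounds * n):
        nodes[step % n].on_token()

nodes = [RingNode(i) for i in range(4)]
nodes[2].wants_cs = True         # one node requests the CS
circulate_token(nodes, rounds=2)
```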

Ricart & Agrawala, 1981, algorithm
- Basic idea
- Processes wanting to enter the CS multicast a request to all processes
- Enter the CS once all processes have granted the request
- Use Lamport timestamps (logical clock values) to order requests as <T, Pi>, where T is the timestamp and Pi the process identifier; process identities break ties
- Each process is in one of three states
- Released ‐ Outside the CS
- Wanted ‐ Wanting to enter CS
- Held ‐ Inside CS
- If a process requests to enter CS and all other processes are in the Released state, entry is granted by each process
- If a process, Pi, requests to enter CS and another process, Pk, is inside the CS (Held state), then Pk will not reply, until it is finished with the CS

Ricart & Agrawala algorithm
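A minimal sketch of one Ricart & Agrawala participant as a message-driven state machine, following the description above; the `send(dest, message)` transport hook and the message formats are assumptions, and every process is presumed to run the same code.

```python
RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"

class RAProcess:
    """One Ricart & Agrawala participant; `send` is an assumed transport hook."""

    def __init__(self, pid, peers, send):
        self.pid = pid
        self.peers = peers          # ids of all other processes
        self.send = send            # send(dest, message): assumed transport
        self.state = RELEASED
        self.clock = 0              # Lamport clock
        self.request_ts = None      # (timestamp, pid) of our own request
        self.replies = set()
        self.deferred = []          # requests to answer after exiting the CS

    def request_cs(self):
        self.state = WANTED
        self.clock += 1
        self.request_ts = (self.clock, self.pid)
        self.replies.clear()
        for p in self.peers:        # multicast REQUEST to all peers
            self.send(p, ("REQUEST", self.request_ts))

    def on_request(self, sender, ts):
        self.clock = max(self.clock, ts[0]) + 1
        # Defer the reply if we hold the CS, or if we also want it and our
        # own (timestamp, pid) pair is smaller, i.e., our request is earlier.
        if self.state == HELD or (self.state == WANTED and self.request_ts < ts):
            self.deferred.append(sender)
        else:
            self.send(sender, ("REPLY", self.pid))

    def on_reply(self, sender):
        self.replies.add(sender)
        if self.state == WANTED and self.replies == set(self.peers):
            self.state = HELD       # entry granted: all N-1 replies received

    def release_cs(self):
        self.state = RELEASED
        for p in self.deferred:     # now answer all deferred requests
            self.send(p, ("REPLY", self.pid))
        self.deferred.clear()
```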

Fault‐tolerance aspects of mutual exclusion algorithms
- None of the algorithms tolerates message loss
- Ring‐based algorithm cannot tolerate crash of single process
- Centralized algorithm can tolerate crash of clients that are neither in the CS, nor have requested entry
- The R & A algorithm as described does not tolerate faults either
Leader election
- Problem: A group of processes, P1, …, Pn, must agree on some unique Pk to be the “leader”
- Often, the leader then coordinates another activity
- An election runs when the leader (a.k.a. coordinator) has failed
- Any process that hasn’t heard from the leader in some predefined time interval may call for an election
- False alarm is a possibility (new election initiated, while current leader still alive)
- Several processes may initiate elections concurrently
- Algorithm should allow for process crash during election
Process identifier
- Elected leader must be unique
- The one with the largest identifier
- The identifier could be any “useful value”, i.e., unique & totally ordered
- E.g., based on OS process identifiers, IP address, port
- E.g., based on least computational load
- <1/load, i>, with load > 0; i is the process ID, used to break ties (see the sketch below)
- Each process, Pi, has a variable elected_i that holds the value of the leader, or is “⊥” (undefined)
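A small sketch of such a load-based identifier using Python tuple comparison; the `identifier` and `elect` helpers and the sample loads are illustrative.

```python
UNDEFINED = None                 # stands for the "⊥" value of elected_i

def identifier(load, pid):
    # Totally ordered identifier <1/load, pid>: a lower load gives a larger
    # identifier, and pid breaks ties between equally loaded processes
    # (load > 0 is assumed).
    return (1.0 / load, pid)

def elect(candidates):
    # candidates maps pid -> load; the unique leader is the process
    # with the largest identifier.
    return max(identifier(load, pid) for pid, load in candidates.items())[1]

elected = elect({1: 0.8, 2: 0.5, 3: 0.5})
print(elected)                   # 3: lowest load, pid breaks the tie with 2
```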



