Distributed OS Basics Flashcards
What is a distributed system?
a system that consists of several computers that do not share a memory or clock and communicate with each other by exchanging messages over a communication network
5 motivations for developing and using a distributed system
Improved price/performance
resource sharing
enhanced performance
improved reliability and availability
modular expandability
Benefit of improved price/performance
equivalent computing power may be obtained by a network of workstations at a much lower cost than a traditional time-sharing system
Benefit of resource sharing
requests for services may be satisfied using hardware/software resources on other computers on the communication network
Benefit of enhanced performance
concurrent execution of tasks and load distributing can lead to improved response time
Benefit of improved reliability and availability
fault tolerance can be achieved through the replication of data and services
Benefit of modular expandability
new hardware and software resources can be added without replacing the existing resources
5 design issues with distributed systems
global knowledge
naming
scalability
compatibility
process synchronization
Issues with global knowledge
the global state of the system is hard to acquire due to the unavailability of a global memory and a global clock along with the unpredictability of message delays
aka its impossible to know the state of the entire system at a given point and thus nothing can assume or be based on an up to date global state being available
Issues with Naming
the directory of all the named objects (services, files, printers, etc) in the system must be maintained to allow proper access.
How is one node in the system supposed to know the name of an object on another node?
does one node hold the records for all named objects?
does every node maintain their own record of all objects in the system?
are the naming records for the system partitioned in some way across all the nodes such that a single node could logically determine where to find an object based on some properties of that object?
Issues with Scalability
any mechanisms or approaches adopted in a system must result in badly degraded performance when the system grows
Issues with compatibility
the interoperability among the resources in a system must be an integral part of the design of a distributed system
Issues with process synchronization
especially difficult in distributed systems due to the lack of shared memory and a global clock
none of the previously discussed methods work for distributed systems
2 inherent limitations of a distributed system
lack of common memory
lack of system-wide clock
Limitations caused by lacking system-wide clock
without global time it is difficult to talk about a temporal ordering of events
Limitations caused by a lack of shared memory
up-to-date information about the state of the system is not available to every process by a simple memory lookup. The state information must be collected through communication
Happened Before Relation
a (arrow) b means that event a happened before the event b
properties of Happened before relation
if a and b are events in the same process then a happened before b
if a is the sending of a message to node i and b is the event receiving the message then a happened before b
if a arrow b and b arrow c then a arrow c is also true
Causal dependency of events
if a arrow b or b arrow a then a and b are causally related.
a arrow b means a casually affects b
if neither are true the events are said to concurrent
Lamport’s logical clocks - timestamp
a number Ci(a) associated with an event ‘a’ in process i
The timestamp of an event will be of the form enk. Where n is the node and k is the value of logical clock at that node
How do Lamport’s logical clocks relate to happened before relationship
‘a’ arrow ‘b’ is true if either of the following is true:
- for ‘a’ and ‘b’ in process i, if Ci(a) < Ci(b)
- if ‘a’ in process i is the sending of a message to ‘b’ in process j and Ci(a) < Cj(b)
Rules of implementing Lamport’s logical clocks
The clock Ci is incremented after each successive event in Pi by a value of d (typically we increment by 1)
If ‘a’ in process i is the sending of a message ‘m’ to process j then ‘m’ is assigned a timestamp called tm = Ci(a).
When process j receives the message it assigns the value Cj according to the first rule described above. Then the final value of Cj = max(current Cj value, tm + d).
Limitation of Lamport’s logical clocks
for any two arbitrary events in ‘a’ in process i and b in process j
Ci(a) < Cj(b) DOES NOT necessarily imply ‘a’ arrow ‘b’
we have no way of knowing the ordering of those events
Vector Clocks
essentially a vector of integer values, ‘C’, maintained by each node. The ith node has its own logical time at Ci[i].
All other elements of vector Ci represent the ith node’s best guess as to the logical time of other nodes.
With that, the following must hold Ci[j] less than or equal to Cj[j]
2 rules of implementing vector clocks
Clock Ci is incremented after successive events in process Pi by a value of d (typically 1)
This second rule is implemented just like lamport’s logical clock but extended for an entire array.
Let ‘a’ in process i, be the sending of a message ‘m’ to process j. When process j receives the message its increments Cj[j] according to the rule above.
Then for each entry Cj[k] (k from 1 to n), Cj[k] = max (Cj[k], Ci[k]) (find the max value at each entry and update Cj as needed)
Definition of causal order of messaging
Send(m1) arrow Send(m2) implies Receive(m1) arrow Receive(m2)
Cut event
is the recording of a single node’s local state at a given time
Definition cut
The set C of cut events ci for all nodes in the distributed system
A cut represents the global state of the system
Consistent cut
A cut C is inconsistent iff for every pair of events ei and ej where the following expression is TRUE
(ei arrow ej) AND (ej arrow cj) AND NOT(ei arrow ci)
AKA every message received before the cut event at Site J was sent from Site i before the cut event at Site i