Distributed Systems Flashcards
What is a stub? And how do RMI and RPC differ?
A stub supports communication between two distributed components. It acts as a local proxy, so the developer can invoke a procedure or method on the server by calling it on the local stub just like any other method. The stub marshals the arguments (or the return data) into a machine-independent format and sends them, so that the stub on the other end can unmarshal them and invoke the procedure on the server. RPC calls remote procedures, whereas RMI invokes methods on remote objects and can pass object references as arguments.
The client stub also starts a timer for error handling, for example to detect a lost request or reply.
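A minimal sketch of the proxy and marshalling idea, assuming JSON as the machine-independent format and a plain function call standing in for the network. The names `Calculator`, `ClientStub` and `ServerStub` are hypothetical, and the timer is not modelled.

```python
import json


class Calculator:
    """Server-side implementation (hypothetical example service)."""

    def add(self, a, b):
        return a + b


class ServerStub:
    """Unmarshals a request, invokes the real method, marshals the reply."""

    def __init__(self, target):
        self.target = target

    def handle(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self.target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Local proxy: to the caller it looks like the remote object."""

    def __init__(self, transport):
        self.transport = transport  # callable taking bytes, returning bytes

    def add(self, a, b):
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = self.transport(request)  # stands in for the network hop
        return json.loads(reply.decode("utf-8"))["result"]


server_stub = ServerStub(Calculator())
client_stub = ClientStub(transport=server_stub.handle)
print(client_stub.add(2, 3))  # 5
```

The caller only ever sees `client_stub.add(2, 3)`; the marshalling and the transport are hidden behind the stub.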
What are the components of RPC?
• Server: a remote machine that provides clients with an interface through which they can call its procedures remotely.
• Stub compiler: uses an IDL (interface definition language) description to generate the stubs for the client and the server.
• Linking: the client developer implements client functions that call methods on the stub, and the server developer implements the server functions specified by the stub.
• Operation: the stubs manage all remote communication between the client and the server.
What is an object broker in the context of Java RMI?
In Java RMI, the object broker is the RMI registry.
The registry is a process that runs on a host machine. It allows remote objects to be distributed and retrieved: servers register their objects by name, and clients look an object up by name and receive a stub for it. Clients can also request a list of all registered remote objects.
Despite requiring the location of a physical server, why does the remote object broker in Java’s RMI still respect location transparency?
The location of the physical server is supplied as a string (a name); the protocol then handles the actual mapping to addresses, so location transparency is respected.
What are the differences between RMI and MOM?
Look in notes
What are the modes of operation for JMS (Java message service)?
There are two modes of operation: point-to-point and publish/subscribe.
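A toy illustration of the two modes, not the JMS API: the class names `PointToPointQueue` and `Topic` are hypothetical. In point-to-point each message is consumed by one receiver; in publish/subscribe each message goes to every subscriber.

```python
from collections import deque


class PointToPointQueue:
    """Each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Each published message is delivered to every current subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:
            callback(message)


queue = PointToPointQueue()
queue.send("order-1")
first = queue.receive()   # "order-1"
second = queue.receive()  # None: the message was already consumed

topic = Topic()
inbox_a, inbox_b = [], []
topic.subscribe(inbox_a.append)
topic.subscribe(inbox_b.append)
topic.publish("price-update")  # both inboxes receive a copy
```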
What are the attributes of web services?
• Communication can usually pass through firewalls, because the protocols run over HTTP, which firewalls typically allow.
• Because communication uses web-based protocols, which are machine independent, components work in a heterogeneous environment, so web services support interoperability.
• XML based: messages are machine-readable XML documents.
What is REST?
REST is representational state transfer. It is an architectural style.
It treats the web as resource-centric.
In a REST app, the URLs identify resources.
REST services commonly use JSON rather than XML; a JSON object is a set of key-value pairs.
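A small sketch of the resource-centric idea, assuming an in-memory store and the usual HTTP verbs (the verbs are not mentioned above). The `handle` function and the `/users/1` path are hypothetical; no real web framework is involved.

```python
import json

# Hypothetical in-memory resource store keyed by URL path.
resources = {"/users/1": {"id": 1, "name": "Ada"}}


def handle(method, path, body=None):
    """Return (status_code, json_text) for a request on a resource URL."""
    if method == "GET":
        if path in resources:
            return 200, json.dumps(resources[path])
        return 404, json.dumps({"error": "not found"})
    if method == "PUT":
        resources[path] = json.loads(body)  # create or replace the resource
        return 200, json.dumps(resources[path])
    if method == "DELETE":
        if resources.pop(path, None) is not None:
            return 204, ""
        return 404, json.dumps({"error": "not found"})
    return 405, json.dumps({"error": "method not allowed"})


status, text = handle("GET", "/users/1")
print(status, json.loads(text)["name"])  # 200 Ada
```

Each URL names one resource, and its representation is a JSON object of key-value pairs.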
Give 4 examples for why replication is used in a distributed system.
• Increased performance: users can access data in a region closer to them, which reduces network communication overhead.
• Increased availability: data or service availability is maintained automatically despite server failures.
• Fault tolerance: guarantees strictly correct behaviour despite a certain number of faults.
• Allows systems to be updated whilst remaining online.
• Load distribution: at busy periods load can be spread across many servers so that a single server isn’t overloaded.
What are the types of replication?
• Computation replication, where multiple instances of the same computing task are executed.
• Data replication, where the same data is stored on multiple devices.
What is needed for replication to be good?
• Replication transparency - hide from the client that there are physical replicas, so the system appears to be one logical service.
• Data consistency - the same request receives the same result regardless of which replica processes it.
Describe the workflow of the system model.
A client accesses a front-end server, which coordinates a set of replica servers and acts as a proxy between the client and an available replica server. The client invokes a method on the stub of the front end, which in turn invokes a remote method on an available replica server. Once the replica server has processed the request, it can propagate any updates it has applied to the other replica servers in order to maintain consistency. The result of the request is sent back to the front end, which forwards the response to the client.
Define a fault-tolerant service.
A service is fault tolerant if it provides correct service according to its specification despite up to f process failures.
(A failure in a distributed system means not only a server going down but also a server returning results inconsistent with those expected.)
How many faults can a passive replication system tolerate vs an active replication system?
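Passive can tolerate f crash faults if there are f+1 servers.
Active can tolerate f faults (including Byzantine faults) if there are 2f+1 servers. This is because the front end waits until it has received f+1 identical responses before replying to the client; with at most f faulty servers, at least one of those f+1 responses comes from a correct server.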
Describe how passive replication works.
There is a primary replica manager (RM), which handles requests from the FE/client. Each client message has a unique id, so an RM can check whether it has already performed the operation.
The primary RM routinely sends updates to the backup RMs; when a backup receives an update it sends an acknowledgement back to the primary. If the primary RM fails, the front end selects one of the backup RMs as the new primary, and this is relayed to all the other RMs.
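A rough sketch of this flow, under stated assumptions: updates are pushed synchronously on every write, the acknowledgement is returned to the primary, and failover simply promotes the first backup. The class and method names are hypothetical.

```python
class ReplicaManager:
    def __init__(self, name):
        self.name = name
        self.state = {}      # replicated key -> value data
        self.completed = {}  # request id -> response already produced

    def apply_update(self, request_id, key, value, response):
        """Backup side: install the primary's update and acknowledge it."""
        self.state[key] = value
        self.completed[request_id] = response
        return "ack"

    def execute(self, request_id, key, value, backups):
        """Primary side: perform a write once, then push it to every backup."""
        if request_id in self.completed:  # duplicate request: do not redo it
            return self.completed[request_id]
        response = f"{key}={value} (by {self.name})"
        self.state[key] = value
        self.completed[request_id] = response
        acks = [b.apply_update(request_id, key, value, response) for b in backups]
        assert all(ack == "ack" for ack in acks)
        return response


class FrontEnd:
    def __init__(self, replicas):
        self.primary, *self.backups = replicas

    def fail_over(self):
        """Promote the first backup after the primary has failed."""
        self.primary, *self.backups = self.backups

    def write(self, request_id, key, value):
        return self.primary.execute(request_id, key, value, self.backups)


fe = FrontEnd([ReplicaManager("rm1"), ReplicaManager("rm2"), ReplicaManager("rm3")])
r1 = fe.write("req-1", "x", 10)
fe.fail_over()                   # rm1 is assumed to have crashed
r2 = fe.write("req-1", "x", 10)  # the client retries with the same id
print(r1 == r2)                  # True: the new primary recognises the id
```

Because the request id travels with the update, the promoted backup can answer a retried request without performing the operation a second time.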
Describe how active replication works.
The FE broadcasts the request to every RM, and each RM carries out the operation. Each RM returns its result to the FE, which uses the majority (most common) response.
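A minimal sketch of the front end's voting step, assuming 2f+1 replicas modelled as plain functions and a front end that accepts a value once f+1 replicas agree on it. The names `front_end_vote` and `active_request` are hypothetical.

```python
from collections import Counter


def front_end_vote(responses, f):
    """Return the first value reported by at least f + 1 replicas, else None."""
    counts = Counter()
    for value in responses:
        counts[value] += 1
        if counts[value] >= f + 1:
            return value
    return None


def active_request(replicas, request, f):
    """Send the request to every replica and vote on the replies."""
    assert len(replicas) >= 2 * f + 1, "need 2f + 1 replicas to mask f faults"
    responses = [replica(request) for replica in replicas]
    return front_end_vote(responses, f)


def correct(x):
    return x * 2


def faulty(x):
    return -1  # a replica that returns a wrong answer


result = active_request([correct, faulty, correct], 21, f=1)
print(result)  # 42
```

With f = 1 and three replicas, the single faulty reply is outvoted by the two matching correct replies.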
What are the benefits of each type of replication?
look in notes
What is a Byzantine fault in the context of a distributed system? Give two types of fault.
A Byzantine fault is one where the faulty component appears different to different observers. In this context the difference can be between the data on the replica and the data the client expects, or it can concern the replica’s availability. An omission failure is one where a request (or reply) fails to be sent. A commission failure is one where a request is processed incorrectly, which can corrupt internal state.
Explain why a distributed system cannot easily maintain globally unique timestamps. How to overcome this?
Maintaining globally unique timestamps requires a single, centralised measurement of time for the whole system. Because of propagation delay, packet loss and latency, distributing the value of this global time to other servers leaves their clocks out of step, so duplicate timestamps can easily be generated. Use NTP to keep the servers’ clocks closely synchronised.
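Synchronised clocks can still produce equal readings on two servers. One common complement, which is an assumption of this sketch and not something stated above, is to pair each clock reading with the node id and never reuse a reading locally. `TimestampSource` is a hypothetical name, and the clock is passed in as a function so the example is deterministic.

```python
class TimestampSource:
    """Issues (time, node_id) pairs that are unique across nodes with distinct ids."""

    def __init__(self, node_id, clock):
        self.node_id = node_id
        self.clock = clock  # callable returning an integer clock reading
        self.last = -1

    def next(self):
        # Never repeat or go backwards locally, even if the clock stalls.
        self.last = max(self.clock(), self.last + 1)
        return (self.last, self.node_id)


def stalled_clock():
    return 1000  # both nodes read exactly the same time


node_a = TimestampSource("a", stalled_clock)
node_b = TimestampSource("b", stalled_clock)
stamps = [node_a.next(), node_a.next(), node_b.next()]
print(stamps)  # [(1000, 'a'), (1001, 'a'), (1000, 'b')]
```

The pairs compare as tuples, so they also give a total order, although that order only approximates real time.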
Explain the difference between a deadlock and a phantom deadlock.
A deadlock occurs when transaction A locks object X and transaction B locks object Y, and each then needs the other’s object. If A tries to access Y after B has locked it, A waits for B to unlock Y. If B tries to access X after A has locked it, B waits for A to unlock X.
As objects are not unlocked until the locking transaction has finished, and the transactions are waiting for each other, neither A nor B can ever complete. This situation is a deadlock.
A phantom deadlock is a deadlock that is “detected” but does not really exist. A global deadlock detector needs time to receive the local wait-for graphs from which it constructs the global one, so the global wait-for graph may no longer be valid by the time it is constructed.
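A small sketch of both cases, assuming a wait-for graph stored as a dict from a waiting transaction to the transactions it waits for. `find_cycle` and `merge` are hypothetical helpers. The phantom case merges a stale local graph with a fresh one.

```python
def find_cycle(wait_for):
    """Return a list of transactions forming a cycle, or None if there is none."""

    def visit(node, path):
        if node in path:
            return path[path.index(node):]
        for holder in wait_for.get(node, ()):
            cycle = visit(holder, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


def merge(*local_graphs):
    """Build the detector's global graph from the local graphs it received."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


# Real deadlock: A waits for B and B waits for A.
real = find_cycle({"A": ["B"], "B": ["A"]})

# Phantom deadlock: server 1 reported "A waits for B" earlier, but that edge
# has since disappeared; server 2 now reports "B waits for A".
stale_from_server_1 = {"A": ["B"]}
fresh_from_server_2 = {"B": ["A"]}
detected = find_cycle(merge(stale_from_server_1, fresh_from_server_2))
actual = find_cycle(merge({}, fresh_from_server_2))
print(real, detected, actual)  # ['A', 'B'] ['A', 'B'] None
```

The detector reports a cycle from its merged view even though the true graph at that moment has none.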
What are the two naive approaches for detecting/resolving deadlocks? Describe why each is bad.
Timeouts - a lock is released a fixed time after it was acquired. Choosing this time is hard, because a transaction holding a lock on a shared object may simply be taking a long time or doing heavy computation, with no deadlock at all.
Describe how edge chasing works. Why is it the best deadlock detection approach?
Edge chasing works by processes exchanging probe messages. It does not rely on a centralised server to construct a global wait-for graph, so network latency or lost messages while assembling that graph cannot cause a phantom deadlock.
When a process is locked out of a shared object, it sends a probe message to the process that holds the lock on the object. The probe message consists of:
• the id of the process that is blocked
• the id of the sender of the probe
• the id of the recipient of the probe
When a process receives a probe, it checks whether it is itself waiting for a resource:
• If it is waiting for a resource, it forwards the probe to the processes holding the locks on that resource, putting its own process id as the sender.
• If it is not waiting for a resource, it is still carrying out its transaction and will release its objects when it is done, so the probe goes no further.
If a process receives a probe whose blocked process id is its own, i.e. a probe that it initiated, it knows there is a cycle in the system, and hence a deadlock.
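The probe-forwarding rules can be sketched as a single-process simulation, in which a queue stands in for the messages the processes would really exchange. The function name `detect_deadlock` and the `waits_for` mapping are hypothetical, and a `forwarded` set is added as an assumption so that a probe is not forwarded twice by the same process.

```python
from collections import deque


def detect_deadlock(initiator, waits_for):
    """Chase edges from a blocked process; True if a probe returns to it.

    waits_for maps a process id to the ids of the processes holding the
    locks it needs. A probe is (blocked_id, sender_id, recipient_id).
    """
    probes = deque(
        (initiator, initiator, holder) for holder in waits_for.get(initiator, ())
    )
    forwarded = set()  # processes that already forwarded this probe
    while probes:
        blocked, sender, recipient = probes.popleft()
        if recipient == blocked:
            return True  # the probe came back to its initiator: a cycle
        if recipient in forwarded:
            continue
        forwarded.add(recipient)
        # A recipient that is not waiting has no holders, so the probe stops.
        for holder in waits_for.get(recipient, ()):
            probes.append((blocked, recipient, holder))
    return False


cycle = {"A": ["B"], "B": ["C"], "C": ["A"]}
chain = {"A": ["B"], "B": ["C"], "C": []}
print(detect_deadlock("A", cycle))  # True
print(detect_deadlock("A", chain))  # False
```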
What is the definition of linearizability?
Operations are linearizable if they can be interleaved in a way that is consistent with the specification of a single correct copy of the shared objects. This interleaving also has to be consistent with the real times at which the operations happened in the execution.
I.e. can you pick an instant for each operation, treating it as happening instantaneously, such that executing the operations in that order produces the results the system actually produced? Each instant must lie between the start and finish times of the operation it corresponds to.
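A brute-force sketch of this check for the simplest case, assuming a single read/write register and a history given as hypothetical `(start, end, kind, value)` tuples. It tries every ordering, keeps those in which an operation that finished before another started comes first, and tests whether each read returns the latest written value.

```python
from itertools import permutations


def respects_real_time(order):
    """If b finished before a started, b must not be placed after a."""
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            if b[1] < a[0]:
                return False
    return True


def is_legal_register(order, initial):
    """Replay the order on a single copy of the register."""
    current = initial
    for _start, _end, kind, value in order:
        if kind == "write":
            current = value
        elif value != current:  # a read must return the current value
            return False
    return True


def is_linearizable(history, initial=None):
    return any(
        respects_real_time(order) and is_legal_register(order, initial)
        for order in permutations(history)
    )


# The read overlaps the write, so it may take effect after the write.
overlapping = [(0, 4, "write", 1), (1, 3, "read", 1)]
# The write finished before the read started, yet the read saw the old value.
stale_read = [(0, 1, "write", 1), (2, 3, "read", None)]
print(is_linearizable(overlapping), is_linearizable(stale_read))  # True False
```

Trying all permutations is only practical for tiny histories; it is meant to illustrate the definition, not to be an efficient checker.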
What are the 3 consistency models? Explain them.
Strict consistency - The strongest type of consistency, where every query is up to date, e.g. a read of x always returns the latest version of x. The issue is that it needs an absolute global time, so it is only practical within a single machine.
Sequential consistency - Weaker than linearizability: every linearizable execution is sequentially consistent, but not the reverse. The operations can be interleaved, but all processors must see the same order of object values, and each process’s operations must appear in the interleaving in the order in which that process executed them. The interleaving does not have to match real time.
Causal consistency - Updates that are causally related (e.g. they come from the same process) must be seen in the same order by all processes. Updates that are not causally related (e.g. they come from different processes and neither depends on the other) may be seen in different orders.