Distributed Systems Flashcards
What is a distributed system?
A collection of independent, dynamic components (hardware, software components and web services) that work together and appear to users as a single coherent system. The components share their resources by communicating with each other.
What is middleware?
Software technology that enables distributed system components to work together and communicate as if they were not distributed. It hides low-level networking details from DS developers, and provides abstractions and infrastructure for constructing distributed applications.
What is a remote procedure call?
An approach that keeps the traditional paradigm of splitting a system into procedures, but masks remote function calls so that they look local. It is usually implemented with message passing: the programmer calls a remote function without coding the network communication. RPC is traditionally used from C.
How does RPC work?
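The caller passes its arguments to the client-side stub, which marshals them, generates a request ID, starts a timer, and sends a message to the server-side stub. The server-side stub unmarshals the arguments and feeds them to the remote function. The result travels back along the same path in reverse.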
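The stub steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: JSON stands in for the wire format, the "network" is a direct in-process function call, the timer is left out, and the names `ClientStub`, `server_stub` and `PROCEDURES` are hypothetical.

```python
import json


def add(a, b):
    """The remote procedure, which lives on the server."""
    return a + b


# Server-side dispatch table (hypothetical): procedure name -> function.
PROCEDURES = {"add": add}


def server_stub(message: bytes) -> bytes:
    """Unmarshal the request, call the procedure, marshal the reply."""
    request = json.loads(message.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"id": request["id"], "result": result}).encode("utf-8")


class ClientStub:
    """Makes a remote call look like a local one to the caller."""

    def __init__(self, transport):
        # transport: any callable that takes request bytes and returns reply bytes.
        # Here it is the server stub itself; a real system would use a socket.
        self._transport = transport
        self._next_id = 0

    def call(self, proc, *args):
        self._next_id += 1  # generate a request ID
        message = json.dumps(
            {"id": self._next_id, "proc": proc, "args": list(args)}
        ).encode("utf-8")  # marshal the arguments
        reply = json.loads(self._transport(message).decode("utf-8"))
        if reply["id"] != self._next_id:
            raise RuntimeError("reply does not match the request")
        return reply["result"]


stub = ClientStub(transport=server_stub)
print(stub.call("add", 2, 3))  # prints 5
```

The caller only ever sees `stub.call(...)`; marshalling and message passing stay inside the two stubs.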
Limitations of RPC
Synchronous request/reply interaction means that the client may be blocked for a long time if the server is overloaded, and slow clients may delay servers. The client needs the server's host information, so location transparency is not provided. RPC is also not object-oriented.
Components of RPC
Server: defines the service interface using an interface definition language, which specifies names, parameters, and types for all client-callable procedures
Stub compiler: reads the IDL declarations and produces two stub functions, one server-side and the other client-side
Linking: the server programmer implements the service’s functions and links them with the server-side stub; the client programmer implements the client program and links it with the client-side stub
What is Object Oriented Middleware?
Remote objects are visible through remote interfaces. RMI masks remote objects as local ones using proxy objects, and a skeleton object directs incoming calls from clients to the appropriate object. The object request broker identifies and discovers remote objects.
Remote Method Invocation
RMI originated in Java, where it allows Java objects to communicate with each other in order to realise a distributed system. It lets us distribute our objects across several machines and invoke methods on objects located at remote sites.
What is a registry?
A registry is a running process on a host machine, which maintains names of remote objects and helps with looking up those objects. Servers can register their objects, clients can find server objects by name and obtain stubs for them.
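As a rough illustration of the register/look-up roles, here is a minimal in-memory name registry in Python. It is a sketch, not the Java RMI registry: a real registry runs as a separate process and hands out stubs over the network, and the names `Registry`, `bind`, `lookup` and `GreeterStub` are chosen here for illustration only.

```python
class Registry:
    """Maps names to stubs for remote objects (in memory only)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name, stub):
        """Called by a server to register an object under a name."""
        if name in self._bindings:
            raise KeyError(f"name already bound: {name}")
        self._bindings[name] = stub

    def lookup(self, name):
        """Called by a client to obtain the stub registered under a name."""
        if name not in self._bindings:
            raise KeyError(f"name not bound: {name}")
        return self._bindings[name]


class GreeterStub:
    """Stand-in for a stub; a real one would forward calls to the server."""

    def greet(self, who):
        return f"hello, {who}"


registry = Registry()
registry.bind("greeter", GreeterStub())    # server side
greeter = registry.lookup("greeter")       # client side
print(greeter.greet("client"))             # prints hello, client
```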
What is message-oriented middleware?
Communication is done using messages, which are stored in message queues. Message servers decouple client and server.
Interaction is asynchronous, so client and server are only loosely coupled compared with RMI and CORBA. Messages are queued and may be processed, filtered or transformed by the message server. The queues are also held in persistent storage, and may be processed by intermediate message servers.
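The decoupling can be sketched with a toy message server in Python. Assumptions: queues live in memory (persistence is not modelled), everything runs in one process, and `MessageServer`, `send` and `receive` are hypothetical names.

```python
from collections import deque


class MessageServer:
    """Holds named queues and may transform messages on the way in."""

    def __init__(self, transform=None):
        self._queues = {}
        self._transform = transform  # optional function applied to each message

    def send(self, queue_name, message):
        """Producer side: enqueue and return at once, without waiting for a consumer."""
        if self._transform is not None:
            message = self._transform(message)
        self._queues.setdefault(queue_name, deque()).append(message)

    def receive(self, queue_name):
        """Consumer side: take the oldest message, or None if the queue is empty."""
        queue = self._queues.get(queue_name)
        return queue.popleft() if queue else None


server = MessageServer(transform=str.upper)
server.send("orders", "order 1")   # the client carries on immediately
server.send("orders", "order 2")
print(server.receive("orders"))    # prints ORDER 1
print(server.receive("orders"))    # prints ORDER 2
print(server.receive("orders"))    # prints None
```

The sender never calls the receiver directly, which is what makes the coupling loose: either side can be slow or absent without blocking the other.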
Properties of object-oriented middleware (OOM)
Follows the object-oriented programming model; uses synchronous request/reply interaction; provides location transparency (the ability to access objects without knowing their location)
What are web services?
Web services provide a service interface that lets users communicate with their servers over the internet. The service interface describes the operations the web service offers. Web services use protocols based on HTTP that are designed to work over the public internet, which lets them traverse firewalls and work in a heterogeneous environment. They count as a middleware technology.
What is WSDL?
Web Services Description Language: an interface description for web services, which also gives details of the communication method and the service URL
What is replication and what are the two different types?
Replication provides multiple copies of the same data or functionalities (services) in a distributed system. It improves system capabilities in terms of performance, availability and load distribution.
Computation (service) replication: multiple instances of the same functional process are executed, but may run on different hardware and be implemented by different algorithms.
Data replication: the same piece of information is stored on multiple devices.
How are incoming requests received when we have replication?
The front end (FE) receives the request and forwards it to the replica servers (Rs)
The Rs accept the request and decide its ordering relative to other requests
They process the request
They reach consensus on the effect of the requests
They reply to the front end, optionally processing the response beforehand
Fault-tolerant services
Provide a correct service despite up to N process failures
Each replica is assumed to behave according to the specification of the distributed system as long as it has not crashed
When is a service based on replication correct?
If it keeps responding despite failures (failure transparency), and if clients can’t tell the difference between the service they obtain from an implementation with replicated data and one provided by a single correct replica manager.
How are incoming requests received when we have passive replication?
Request: an FE issues the request, which contains a unique identifier, to the primary R
The primary processes requests atomically, in the order in which it receives them
It checks the unique ID, and resends the stored response if it has already executed that request
The primary executes the request and stores the response
If the request is an update, the primary sends the updated state, the response, and the unique identifier to all the backups, which then send an acknowledgement
Response: the primary responds to the FE
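The passive replication steps above can be sketched in Python as follows. This is a single-process illustration under assumptions: the service is a simple counter store, acknowledgements are return values rather than network messages, failures and primary election are not modelled, and `Primary`, `Backup` and `handle` are hypothetical names.

```python
class Backup:
    """Stores whatever state and responses the primary sends it."""

    def __init__(self):
        self.state = {}
        self.responses = {}

    def apply_update(self, request_id, state, response):
        self.state = dict(state)               # copy the updated state
        self.responses[request_id] = response  # remember the response
        return True                            # acknowledgement


class Primary:
    def __init__(self, backups):
        self.state = {}
        self.responses = {}  # unique request ID -> stored response
        self.backups = backups

    def handle(self, request_id, op, key):
        # Duplicate request: resend the stored response, do not re-execute.
        if request_id in self.responses:
            return self.responses[request_id]
        # Execute the request.
        if op == "incr":
            self.state[key] = self.state.get(key, 0) + 1
            response, is_update = self.state[key], True
        elif op == "get":
            response, is_update = self.state.get(key), False
        else:
            raise ValueError(f"unknown operation: {op}")
        self.responses[request_id] = response
        # For updates, push state, response and ID to every backup and await acks.
        if is_update:
            acks = [b.apply_update(request_id, self.state, response) for b in self.backups]
            if not all(acks):
                raise RuntimeError("a backup did not acknowledge the update")
        return response  # response to the front end


primary = Primary([Backup(), Backup()])
print(primary.handle("r1", "incr", "x"))  # prints 1
print(primary.handle("r1", "incr", "x"))  # prints 1 (duplicate ID, not re-executed)
print(primary.handle("r2", "get", "x"))   # prints 1
```

Because the backups hold both the state and the stored responses, one of them could take over as primary and still answer a retried request correctly.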
Advantages of passive replication?
This type of system can survive up to n replica crashes when it comprises n+1 replicas. It requires very little front-end functionality: the front end only needs to look up the new primary replica when the current one is not available.
What is active replication?
A scheme in which the Rs are all state machines that play the same role and are organised as a group. They all start in the same state and perform the same tasks in the same order, so that their state remains identical (synchronisation).
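A minimal Python sketch of the state-machine idea: every replica starts in the same state and applies the same requests in the same order, so all replicas end up identical. Assumptions: a single loop stands in for a totally ordered multicast to the group, no replica fails, and the bank-balance service and the names `Replica` and `multicast` are hypothetical.

```python
class Replica:
    """A deterministic state machine: same inputs in the same order give the same state."""

    def __init__(self):
        self.balance = 0  # every replica starts in the same state

    def apply(self, op, amount):
        if op == "deposit":
            self.balance += amount
        elif op == "withdraw":
            self.balance -= amount
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.balance


def multicast(replicas, requests):
    """Deliver each request to every replica, in one agreed order."""
    replies = []
    for op, amount in requests:
        answers = [replica.apply(op, amount) for replica in replicas]
        if len(set(answers)) != 1:
            raise RuntimeError("replicas diverged")
        replies.append(answers[0])  # any single reply is enough for the front end
    return replies


replicas = [Replica() for _ in range(3)]
print(multicast(replicas, [("deposit", 100), ("withdraw", 30)]))  # prints [100, 70]
print([r.balance for r in replicas])                              # prints [70, 70, 70]
```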