Week 1 Flashcards
What is a distributed system?
A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and which communicate through an unreliable communication medium.
What is the idea behind a distributed system?
Present a single-system image so the distributed system “looks like” a single computer rather than a collection of separate computers.
- Hide internal organization
- Provide a uniform interface
What are the pros of a distributed system?
Easily expandable: adding new computers is hidden from the user
Availability: failure in one component can be covered by other components
What is middleware?
Middleware is a software layer that sits between applications and the operating system.
It allows independent computers to work together closely.
What are some advantages of middleware?
Hides the intricacies of distributed applications
Hides the heterogeneity of hardware, OS and protocols
Provides uniform, high-level interfaces used to make applications interoperable, reusable and portable
Provides a set of common services that minimizes duplication of effort and enhances collaboration between applications
What does middleware do?
It provides protocols that allow a program running on one kind of computer, using one kind of operating system, to call a program running on another computer with a different operating system.
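Example (a minimal sketch, not a definitive implementation): Python's standard-library XML-RPC modules act as a small piece of middleware here. The function name add, the helpers start_server and remote_add, and the use of 127.0.0.1 are assumptions made for illustration; in a real deployment the two sides would run on different machines.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Hypothetical service function exposed to remote callers."""
    return a + b


def start_server():
    """Start an XML-RPC server in a background thread and return it."""
    # Port 0 asks the OS for any free port; logRequests=False keeps output quiet.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_add(port, a, b):
    """Call add() on the server; the proxy hides marshalling and transport."""
    with ServerProxy(f"http://127.0.0.1:{port}") as proxy:
        return proxy.add(a, b)
```

The caller writes proxy.add(a, b) as if add were a local function; the middleware turns the call into a network request and the reply back into a return value.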
What are the four goals of a distributed system?
Resource Accessibility
Transparency
Openness
Scalability
What is resource accessibility?
Resource sharing introduces security problems
Support user access to remote resources (printers, data files, web pages, CPU cycles) and the fair sharing of resources
Performance enhancement - due to multiple processors
What is Transparency?
A distributed system that appears to its users & applications to be a single computer system is said to be transparent
Users & apps should be able to access remote resources in the same way they access local resources.
Transparency has several dimensions
What is Openness?
An open distributed system is one that is able to interact with other open distributed systems even if the underlying environments are different. This is accomplished via:
Well defined interfaces
Should be able to support application portability
Systems should be able to interoperate
What is scalability?
2 dimensions to scale:
1) With size
2) Geographical distribution
A scalable system still performs well as it scales along either of the two dimensions.
What are the advantages of Openness?
- Interoperability: The ability of two different systems or applications to work together:
1) A process that needs a service should be able to talk to any process that provides that service
2) Multiple implementations of the same service may be provided, as long as the interface is maintained.
- Portability: an application designed to run on one distributed system can run on another distributed system that implements the same interface
- Extensibility: easy to add new components and features
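Example (a sketch under stated assumptions): two implementations of one interface that a client can use interchangeably. NameService, DictNameService, SuffixNameService, resolve_all and the example.internal suffix are all hypothetical names chosen for illustration.

```python
from typing import Protocol


class NameService(Protocol):
    """Hypothetical interface: any implementation must provide lookup()."""

    def lookup(self, name: str) -> str: ...


class DictNameService:
    """Implementation 1: resolves names from an in-memory table."""

    def __init__(self, table):
        self._table = dict(table)

    def lookup(self, name: str) -> str:
        return self._table[name]


class SuffixNameService:
    """Implementation 2: derives the address from the name itself."""

    def lookup(self, name: str) -> str:
        return f"{name}.example.internal"


def resolve_all(service: NameService, names):
    """Client code depends only on the interface, not on the implementation."""
    return [service.lookup(n) for n in names]
```

resolve_all works unchanged with either service, which is the point of keeping the interface fixed: implementations can be swapped or added without touching the client.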
What is Caching?
Idea: Normally creates a replica of something closer to the user
Replication is often more permanent
User decides to cache, server system decides to replicate
Why is Caching good?
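Keeps a copy of the data close to the user, which reduces access latency and the load on the server and network.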
What are some cons of Caching?
- Having multiple copies leads to inconsistencies: modifying one copy makes that copy different to the rest
- Always keeping copies consistent requires global synchronization on each modification.
Global synchronization precludes large-scale solutions
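Example (a minimal sketch): a client-side cache that goes stale when the server copy changes. Server, CachingClient and invalidate are hypothetical names; a real system would also need some way to decide when to invalidate, which is where the synchronization cost comes from.

```python
class Server:
    """Holds the authoritative copy of the data."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data[key]

    def write(self, key, value):
        self.data[key] = value


class CachingClient:
    """Keeps a local copy of each value after the first read."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, key):
        # Only contact the server on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.server.read(key)
        return self.cache[key]

    def invalidate(self, key):
        # Drop the local copy so the next read fetches a fresh value.
        self.cache.pop(key, None)
```

If the server writes a new value after the client has cached the old one, the client keeps returning the old value until invalidate is called. Keeping every cached copy consistent would require notifying all clients on each write.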
What Impacts scalability?
Scalability is negatively affected when the system is based on:
- Centralized server: one for all users
- Centralized data: a single database for all users
- Centralized algorithms: one site collects all information, processes it, and distributes the results to all sites
What is decentralization?
- NO machine has complete information about the system state
- Machines make decisions based only on local information
- Failure of a single machine doesn’t ruin the algorithm
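Example (a sketch, assuming nodes arranged in a ring and simulated in synchronous rounds): each node computes the global average using only its own value and its two neighbours' values. gossip_round and run_gossip are hypothetical names, and the simulation does not model failures or asynchrony.

```python
def gossip_round(values):
    """One round: every node averages its value with its two ring neighbours."""
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3
        for i in range(n)
    ]


def run_gossip(values, rounds):
    """Repeat the local averaging step for a fixed number of rounds."""
    for _ in range(rounds):
        values = gossip_round(values)
    return values
```

No node ever sees the whole list, yet for a ring of at least three nodes the values drift toward the global mean, because each round preserves the sum while smoothing out differences between neighbours.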
How can we achieve Decentralization?
A distributed system must avoid centralising:
- Components (e.g., avoid having a single server)
- Tables (avoid having a single centralised directory of names)
- Algorithms (avoid algorithms based on complete information)
What are the three main types of distributed systems?
Distributed Computing Systems:
- Clusters
- Grids
- Clouds
What are Clusters?
A collection of similar processors running the same operating system, connected by a fast LAN.
Parallel computing capabilities using inexpensive PC hardware.
EG: High Performance Clusters (HPC)
- CERN
- Run large parallel programs
Scientific, military, weather modelling
What are Grids?
Grid computing is the use of widely distributed computer resources to reach a common goal.
Similar to clusters, but processors are more loosely coupled, tend to be heterogeneous (hardware, software, networks, security policies) and are not all in a central location.
Can handle workloads similar to those run on supercomputers, but grid computers connect over a network, whereas a supercomputer's CPUs connect through a high-speed internal network.
What is Cloud Computing?
Cloud computing is a type of internet-based computing in which an application does not access resources directly; instead, it draws on a large pool of shared resources. It is a modern computing paradigm, based on network technology, designed for remotely provisioning scalable and measured IT resources.
How is cloud computing different from Grid computing?
Cloud is:
- Commercial
- Single domain
- Simple user to provider model, pay per use
- Provides a scalable standard environment for network-centric application development, testing and deployment.