Introduction - Week 1 Flashcards
What is a system?
A complex whole; a set of connected parts; an organised assembly of resources and procedures (collection of…) united and regulated by interaction or interdependence to accomplish a set of specific functions
Definitions of a distributed system
- A collection of independent computers that appears to its users as a single coherent system.
- A system in which hardware and software components of networked computers communicate and coordinate their activity only by passing messages
- A computing platform built with many computers that:
  - operate concurrently (have independent clocks)
  - are physically distributed (have their own failure modes)
  - are linked by a network
  - communicate through messages
Big data
Large data sets that are computationally analysed to find patterns, trends, etc.
Cloud computing
Use / rent compute resources on demand
Fog / edge computing
Edge devices do some data processing before sending it on to the cloud or central servers
compute-intensive
Massively parallel applications where thousands of computers are needed at the same time (to solve science and engineering problems)
Modern distributed systems need
- Performance (what about waiting a few minutes for a Google answer?)
- To handle failures - provide some fault tolerance (what about machines that keep failing?)
- To manage resources (how do I choose the right amount of cloud resources so that I don’t overpay? How do I use these resources efficiently?)
- To manage significant data requirements efficiently (data-intensive applications)
Fallacies of Distributed Computing
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- The network is secure
- Topology doesn’t change
- There is one administrator
- Transport cost is zero
- The network is homogeneous
All eight of these common assumptions are false, and each proves to be big trouble and a painful learning experience in the long run
Fallacy: The network is reliable
Hardware may fail
- Power failures
- Switches have a mean time between failures (e.g. a router between you and the server will eventually fail)
Implications:
Hardware - weigh the risk of failures versus the required investment to build redundancy
Software - Need reliable messaging: be prepared to retry messages, acknowledge messages, tolerate reordering (do not depend on message order), verify message integrity, and so on.
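A minimal Python sketch of the retry-and-acknowledge pattern from the card above. The `FlakyChannel` class and drop counts are invented for illustration; a real system would sit on sockets or a message queue:

```python
import hashlib


class FlakyChannel:
    """Simulated unreliable link: the first `drops` sends are lost (no ack)."""

    def __init__(self, drops: int):
        self.drops = drops
        self.delivered = []

    def send(self, payload: bytes) -> bool:
        if self.drops > 0:
            self.drops -= 1
            return False              # message lost in transit, no acknowledgement
        self.delivered.append(payload)
        return True                   # receiver acknowledged


def send_reliably(channel: FlakyChannel, payload: bytes, max_attempts: int = 5) -> bool:
    """Retry until acknowledged; frame the message with a checksum so the
    receiver can verify its integrity."""
    digest = hashlib.sha256(payload).hexdigest().encode()
    framed = digest + b"|" + payload
    for _ in range(max_attempts):
        if channel.send(framed):
            return True
    return False                      # give up after max_attempts
```

Note the cap on attempts: unbounded retries against a dead link just turn one failure into a hang.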
Fallacy: Latency is zero
A ping from the USA to Europe and back will take at least ~30ms, because the speed of light is 300,000km/s: the round trip is roughly 10,000km, so the signal alone needs over 30ms even in a vacuum (and light in fibre travels closer to 200,000km/s)
Implications:
You may think it is all OK if you deploy your application on a LAN, but you should still strive to make as few calls over the network as possible, and to transfer as much data as possible in each of those calls. Computing over a high-latency network means you have to “bulk up”.
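The “bulk up” advice can be shown with back-of-the-envelope arithmetic; the 30ms figure is the transatlantic round-trip time from the card, and the call counts are illustrative:

```python
PER_CALL_LATENCY_MS = 30.0   # assumed round-trip time for one network call


def total_latency_ms(num_calls: int) -> float:
    """Every round trip pays the full latency, however small its payload."""
    return num_calls * PER_CALL_LATENCY_MS


# Fetching 100 records one by one vs. in a single bulk call:
chatty_ms = total_latency_ms(100)   # 100 round trips -> 3000.0 ms of pure waiting
bulky_ms = total_latency_ms(1)      # same data, one round trip -> 30.0 ms
```

The data transferred is identical; the chatty version is 100x slower purely because each call pays the round-trip cost again.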
Latency
How much time it takes for data to move from one place to another (not how much data - that is bandwidth); measured in units of time
Fallacy: Bandwidth is infinite
Constantly grows, but so does the amount of information we are trying to squeeze through it (VoIP, videos, verbose formats like XML, …)
Effective bandwidth may also be lowered by packet loss (usually small on a LAN), so we may want to use larger packet sizes.
Implications:
- Compression; try to model/simulate the production environment to get an estimate for your needs (performance modelling)
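A short sketch of the compression idea, using Python’s standard `zlib`; the XML-like payload is invented to mimic the verbose formats the card mentions:

```python
import zlib


def compress_for_transfer(payload: bytes) -> bytes:
    # Trade some CPU for bandwidth; verbose formats such as XML compress well.
    return zlib.compress(payload)


payload = b"<reading sensor='t1' value='20.5'/>" * 1000  # repetitive XML-like data
wire = compress_for_transfer(payload)
# The receiver restores the original with zlib.decompress(wire).
```

Highly repetitive data like this shrinks dramatically; already-compressed data (video, images) will not, which is one reason to measure against a realistic model of the production workload.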
Bandwidth
How much data you can transfer over a period of time (may be measured in bits/second)
Fallacy: The network is secure
Assume this is false
Implications:
- Need to build security into applications from day 1
- As a result of security considerations, you might not be able to access networked resources, different user accounts may have different privileges, and so on…
Fallacy: Topology doesn’t change
The topology doesn’t change so long as we stay in the lab
In the wild, servers may be added and removed often, and clients (laptops, wireless and ad hoc networks) come and go: the topology changes constantly.
Implications:
- Do not rely on specific endpoints or routes.
- Abstract the physical nature of your network: The most obvious example is DNS names as opposed to IP addresses.
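The DNS-over-IP advice in one small sketch. The service name below is hypothetical; the point is that the code resolves a name at connection time rather than baking in an address:

```python
import socket

# Hypothetical service name - in a real deployment this would come from
# configuration, never a hardcoded IP address.
SERVICE_HOST = "db.internal.example.com"


def resolve_target(host: str) -> str:
    """Resolve the name each time we connect, so the machine behind it
    can move without any change to application code."""
    return socket.gethostbyname(host)
```

If the database is migrated to a new machine, only the DNS record changes; every client that resolves `SERVICE_HOST` picks up the new address automatically.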