Introduction - Week 1 Flashcards

1
Q

What is a system?

A

A complex whole; A set of connected parts; An organised assembly of resources and procedures (collection of…) united and regulated by interaction or interdependence to accomplish a set of specific functions

2
Q

Definitions of a distributed system

A
  • A collection of independent computers that appears to its users as a single coherent system.
  • A system in which hardware and software components of networked computers communicate and coordinate their activity only by passing messages
  • A computing platform built with many computers that:
    • Operate concurrently; have independent clocks
    • Are physically distributed (and have their own failure modes)
    • Are linked by a network
    • Communicate through messages
3
Q

Big data

A

Large data sets that are computationally analysed to find patterns, trends, and so on.

4
Q

Cloud computing

A

Using or renting compute resources on demand.

5
Q

Fog / edge computing

A

Edge devices do some data processing before sending the data on to the cloud or to central servers.

6
Q

Compute-intensive

A

Massively parallel applications where 1000s of computers are needed at the same time (to solve science and engineering problems).

7
Q

Modern distributed systems need

A
  • Performance (would you wait a few minutes for a Google answer?)
  • To handle failures, i.e. provide some fault tolerance (what about machines that keep failing?)
  • To manage resources (how do I choose the right amount of cloud resources so that I don’t overpay? How do I use these resources efficiently?)
  • To manage significant data requirements efficiently (data-intensive applications)
8
Q

Fallacies of Distributed Computing

A
  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. Topology doesn’t change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous

All 8 of these common assumptions are false. Relying on them causes big trouble and painful learning experiences in the long run.

9
Q

Fallacy: The network is reliable

A

Hardware may fail
- Power failures
- Switches have a finite mean time between failures (e.g. a router between you and the server may fail)

Implications:
Hardware - weigh the risk of failures versus the required investment to build redundancy
Software - Need reliable messaging; be prepared to retry messages; acknowledge messages; reorder messages (do not depend on message order); verify message integrity, and so on.
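
The software implications above (retry, acknowledge, tolerate duplicates, verify integrity) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: `LossyChannel`, `Receiver` and `send_reliably` are hypothetical names, the network is simulated in memory, and reordering is not handled.

```python
import random
import zlib


class LossyChannel:
    """Hypothetical unreliable link that silently drops a fraction of messages."""

    def __init__(self, drop_rate, seed=0):
        self.drop_rate = drop_rate
        self.rng = random.Random(seed)

    def deliver(self, packet):
        # None models a message that never arrives.
        return None if self.rng.random() < self.drop_rate else packet


class Receiver:
    def __init__(self):
        self.seen_ids = set()
        self.messages = []

    def handle(self, packet):
        msg_id, payload, checksum = packet
        if zlib.crc32(payload) != checksum:
            return None  # integrity check failed: no ack, so the sender retries
        if msg_id not in self.seen_ids:  # a retry may deliver the same message twice
            self.seen_ids.add(msg_id)
            self.messages.append(payload)
        return msg_id  # the acknowledgement is just the message id


def send_reliably(channel, receiver, msg_id, payload, max_attempts=200):
    """Resend until an acknowledgement comes back; return the attempts used."""
    packet = (msg_id, payload, zlib.crc32(payload))
    for attempt in range(1, max_attempts + 1):
        arrived = channel.deliver(packet)
        if arrived is None:
            continue  # request lost in transit
        ack = receiver.handle(arrived)
        if ack is not None and channel.deliver(ack) is not None:
            return attempt  # acknowledgement made it back
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


channel = LossyChannel(drop_rate=0.5, seed=1)
receiver = Receiver()
for i, text in enumerate([b"debit 10", b"credit 5"]):
    send_reliably(channel, receiver, i, text)
print(receiver.messages)
```

Because the receiver remembers message ids, a retry caused by a lost acknowledgement does not apply the same message twice.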

10
Q

Fallacy: Latency is zero

A

A ping from the USA to Europe and back takes at least 30ms, because no signal can travel faster than the speed of light (300,000km/s).

Implications:
You may think all is well if you deploy your application on a LAN, but you should still strive to make as few calls over the network as possible and to transfer as much data as possible in each call. Computing over a high-latency network means you have to “bulk up”.
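
The 30ms figure and the advice to bulk up can be checked with a little arithmetic. A minimal sketch, assuming a one-way path of 4,500 km (an illustrative distance, not from the card) and hypothetical function names:

```python
SPEED_OF_LIGHT_KM_PER_S = 300_000  # figure quoted in the card


def min_round_trip_ms(one_way_km):
    """Lower bound on a round trip: nothing travels faster than light."""
    return 2 * one_way_km * 1000 / SPEED_OF_LIGHT_KM_PER_S


def total_latency_ms(num_calls, round_trip_ms):
    """Time spent purely waiting on the network for sequential calls."""
    return num_calls * round_trip_ms


rtt = min_round_trip_ms(4_500)        # assumed one-way distance in km
chatty = total_latency_ms(100, rtt)   # 100 small calls, one item each
bulked = total_latency_ms(1, rtt)     # 1 call carrying all 100 items
print(rtt, chatty, bulked)
```

Under these assumptions the round trip cannot beat 30 ms, so 100 sequential calls wait at least 3 seconds on the network, where a single bulk call waits 30 ms.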

11
Q

Latency

A

How long it takes for data to move from one place to another (not the same thing as bandwidth). It is measured in units of time.

12
Q

Fallacy: Bandwidth is infinite

A

Bandwidth grows constantly, but so does the amount of information we are trying to squeeze through it (VoIP, videos, verbose formats like XML, …)

Bandwidth may be lowered by packet loss (usually small in a LAN); we may want to use larger packet sizes.

Implications:
- Use compression
- Try to model/simulate the production environment to get an estimate of your needs (performance modelling)
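
As a rough illustration of why compression helps with verbose formats, the sketch below compresses a repetitive XML payload with Python's standard `zlib` module. The payload and the `compression_ratio` helper are made up for this example; real savings depend on the data.

```python
import zlib


def build_payload(num_records):
    """Build a deliberately verbose XML document (hypothetical sensor data)."""
    records = "".join(
        f"<reading><sensor>s{i % 3}</sensor><value>{i}</value></reading>"
        for i in range(num_records)
    )
    return f"<readings>{records}</readings>".encode("utf-8")


def compression_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)


payload = build_payload(200)
compressed = zlib.compress(payload)
assert zlib.decompress(compressed) == payload  # compression is lossless
print(len(payload), len(compressed), round(compression_ratio(payload), 2))
```

The repeated tag names are exactly what makes XML verbose, and also what makes it compress well.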

13
Q

Bandwidth

A

How much data you can transfer over a period of time (may be measured in bits/second)
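
Bandwidth and latency combine in a simple first-order estimate of transfer time: latency plus size divided by bandwidth. A rough sketch follows; the function name and the example numbers are illustrative assumptions, and the model ignores protocol overhead and packet loss.

```python
def transfer_time_s(size_bytes, bandwidth_bits_per_s, latency_s):
    """First-order estimate: wait for the latency, then stream the bits."""
    return latency_s + (size_bytes * 8) / bandwidth_bits_per_s


# Assumed link: 8 Mbit/s with 30 ms latency.
large = transfer_time_s(1_000_000, 8_000_000, 0.030)  # dominated by bandwidth
small = transfer_time_s(100, 8_000_000, 0.030)        # dominated by latency
print(large, small)
```

For the small message almost all of the time is latency, which is why more bandwidth alone does not make chatty protocols fast.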

14
Q

Fallacy: The network is secure

A

Always assume this is false: the network is not secure.

Implications:
- Need to build security into applications from day 1
- As a result of security considerations, you might not be able to access networked resources, different user accounts may have different privileges, and so on…

15
Q

Fallacy: Topology doesn’t change

A

The topology doesn’t change only so long as we stay in the lab.

In the wild, servers may be added and removed often, and clients (laptops, wireless and ad hoc networks) come and go: the topology is changing constantly.

Implications:
- Do not rely on specific endpoints or routes.
- Abstract the physical nature of your network: The most obvious example is DNS names as opposed to IP addresses.
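
The idea of depending on a name rather than a fixed address can be sketched with a toy lookup table. `NameRegistry` is a hypothetical in-memory stand-in for a naming service such as DNS, and the host name and addresses are made-up examples.

```python
class NameRegistry:
    """Hypothetical in-memory stand-in for a naming service such as DNS."""

    def __init__(self):
        self._records = {}

    def register(self, name, address):
        self._records[name] = address

    def resolve(self, name):
        return self._records[name]


def call_service(registry, name):
    # Resolve at call time so that topology changes are picked up.
    address = registry.resolve(name)
    return f"connecting to {name} at {address}"


registry = NameRegistry()
registry.register("db.example.com", "192.0.2.10")
first = call_service(registry, "db.example.com")

# The server is moved; only the registry entry changes, not the client code.
registry.register("db.example.com", "192.0.2.20")
second = call_service(registry, "db.example.com")
print(first, second, sep="\n")
```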

16
Q

Fallacy: There is one administrator

A
  • Unless we are talking about a small LAN, the network will have several administrators with different degrees of expertise.
  • This can make it difficult to locate problems (is it their problem or ours?)
  • Coordination of upgrades: will the new version of MySQL work as before with Ruby on Rails?

Don’t underestimate the human ‘social’ factor!

17
Q

Fallacy: Transport cost is zero

A

Going from the application layer to the transport layer (the second-highest layer in the five-layer TCP/IP reference model) is not free:
- Information needs to be serialised (marshalling) to get data onto the wire.

The cost (in terms of money) of setting up and running the network is not zero either. Have we leased the necessary bandwidth?
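
To make the marshalling cost concrete, the sketch below serialises a small record to bytes and back, and times many repetitions. JSON is used only for brevity (the card does not name a format), and `marshal` and `unmarshal` are hypothetical helper names.

```python
import json
import time


def marshal(obj):
    """Serialise a Python object to bytes ready to go onto the wire."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def unmarshal(data):
    """Rebuild the object from the bytes received."""
    return json.loads(data.decode("utf-8"))


order = {"id": 42, "items": ["disk", "cable"], "total": 19.5}
wire = marshal(order)
assert unmarshal(wire) == order  # the round trip preserves the data

start = time.perf_counter()
for _ in range(10_000):
    unmarshal(marshal(order))
elapsed = time.perf_counter() - start
print(f"{len(wire)} bytes on the wire, {elapsed:.3f}s for 10,000 round trips")
```

The elapsed time is small per message but never zero, and it grows with message size and call volume.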
18
Q

Fallacy: The network is homogeneous

A

Even a home network may connect a Linux PC and a Windows PC. A homogeneous network today is the exception, not the rule!

Implications:
- Interoperability will be needed
- Use standard technologies (not proprietary protocols), such as XML

19
Q

XML

A

A W3C-recommended general-purpose markup language designed to share data across different information systems.

Drawback: it is slow, since it is a verbose format.

20
Q

Markup language

A

Combines text and extra information about the text
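
A small XML example shows the split between the text itself and the extra information about it. The document below is made up for illustration, and `describe` is a hypothetical helper built on Python's standard `xml.etree.ElementTree` parser.

```python
import xml.etree.ElementTree as ET

# The tags and the attribute are markup; "Alice" and "Server 3 is down" are the text.
DOC = '<note priority="high"><to>Alice</to><body>Server 3 is down</body></note>'


def describe(xml_text):
    """Separate the markup (tag, attributes) from the text it annotates."""
    root = ET.fromstring(xml_text)
    return {
        "tag": root.tag,
        "attributes": dict(root.attrib),
        "text": {child.tag: child.text for child in root},
    }


info = describe(DOC)
print(info)
```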