Introduction - Week 1 Flashcards
What is a system?
A complex whole; a set of connected parts; an organised assembly of resources and procedures (collection of…) united and regulated by interaction or interdependence to accomplish a set of specific functions
Definitions of a distributed system
- A collection of independent computers that appears to its users as a single coherent system.
- A system in which hardware and software components of networked computers communicate and coordinate their activity only by passing messages
- A computing platform built with many computers that:
  - operate concurrently (have independent clocks)
  - are physically distributed (have their own failure modes)
  - are linked by a network
  - communicate through messages
Big data
Large data sets that are computationally analysed to find patterns, trends, etc.
Cloud computing
Use / rent compute resources on demand
Fog / edge computing
Edge devices do some data processing before sending it on to the cloud or central servers
compute-intensive
Massively parallel applications where thousands of computers are needed at the same time (to solve science and engineering problems)
Modern distributed systems need
- Performance (what about waiting a few minutes for a Google answer?)
- To handle failures - provide some fault tolerance (what about machines that keep failing?)
- To manage resources (how do I choose the right amount of cloud resources so that I don’t overpay? How do I use these resources efficiently?)
- To manage significant data requirements efficiently (data-intensive applications)
Fallacies of Distributed Computing
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- The network is secure
- Topology doesn’t change
- There is one administrator
- Transport cost is zero
- The network is homogeneous
All eight of these common assumptions are false, and each proves to be big trouble and a painful learning experience in the long run
Fallacy: The network is reliable
Hardware may fail
- Power failures
- Switches have a mean time between failures (e.g. a router between you and the server will eventually fail)
Implications:
Hardware - weigh the risk of failures versus the required investment to build redundancy
Software - Need reliable messaging: be prepared to retry messages, acknowledge messages, tolerate reordering (do not depend on message order), verify message integrity, and so on.
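A minimal Python sketch of the retry-and-acknowledge pattern from the card above. The `FlakyChannel` class and drop counts are invented for illustration; a real system would sit on sockets or a message queue:

```python
import hashlib


class FlakyChannel:
    """Simulated unreliable link: the first `drops` sends are lost (no ack)."""

    def __init__(self, drops: int):
        self.drops = drops
        self.delivered = []

    def send(self, payload: bytes) -> bool:
        if self.drops > 0:
            self.drops -= 1
            return False              # message lost in transit, no acknowledgement
        self.delivered.append(payload)
        return True                   # receiver acknowledged


def send_reliably(channel: FlakyChannel, payload: bytes, max_attempts: int = 5) -> bool:
    """Retry until acknowledged; frame the message with a checksum so the
    receiver can verify its integrity."""
    digest = hashlib.sha256(payload).hexdigest().encode()
    framed = digest + b"|" + payload
    for _ in range(max_attempts):
        if channel.send(framed):
            return True
    return False                      # give up after max_attempts
```

Note the cap on attempts: unbounded retries against a dead link just turn one failure into a hang.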
Fallacy: Latency is zero
A ping from the USA to Europe and back will take at least ~30ms, because the speed of light is 300,000km/s: the round trip is roughly 10,000km, so the signal alone needs over 30ms even in a vacuum (and light in fibre travels closer to 200,000km/s)
Implications:
You may think it is all OK if you deploy your application on a LAN, but you should still strive to make as few calls over the network as possible, and to transfer as much data as possible in each of those calls. Computing over a high-latency network means you have to “bulk up”.
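The “bulk up” advice can be shown with back-of-the-envelope arithmetic; the 30ms figure is the transatlantic round-trip time from the card, and the call counts are illustrative:

```python
PER_CALL_LATENCY_MS = 30.0   # assumed round-trip time for one network call


def total_latency_ms(num_calls: int) -> float:
    """Every round trip pays the full latency, however small its payload."""
    return num_calls * PER_CALL_LATENCY_MS


# Fetching 100 records one by one vs. in a single bulk call:
chatty_ms = total_latency_ms(100)   # 100 round trips -> 3000.0 ms of pure waiting
bulky_ms = total_latency_ms(1)      # same data, one round trip -> 30.0 ms
```

The data transferred is identical; the chatty version is 100x slower purely because each call pays the round-trip cost again.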
Latency
How much time it takes for data to move from one place to another (not how much data - that is bandwidth); measured in units of time
Fallacy: Bandwidth is infinite
Constantly grows, but so does the amount of information we are trying to squeeze through it (VoIP, videos, verbose formats like XML, …)
Effective bandwidth may also be lowered by packet loss (usually small on a LAN), so we may want to use larger packet sizes.
Implications:
- Compression; try to model/simulate the production environment to get an estimate for your needs (performance modelling)
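A short sketch of the compression idea, using Python’s standard `zlib`; the XML-like payload is invented to mimic the verbose formats the card mentions:

```python
import zlib


def compress_for_transfer(payload: bytes) -> bytes:
    # Trade some CPU for bandwidth; verbose formats such as XML compress well.
    return zlib.compress(payload)


payload = b"<reading sensor='t1' value='20.5'/>" * 1000  # repetitive XML-like data
wire = compress_for_transfer(payload)
# The receiver restores the original with zlib.decompress(wire).
```

Highly repetitive data like this shrinks dramatically; already-compressed data (video, images) will not, which is one reason to measure against a realistic model of the production workload.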
Bandwidth
How much data you can transfer over a period of time (may be measured in bits/second)
Fallacy: The network is secure
Assume this is false
Implications:
- Need to build security into applications from day 1
- As a result of security considerations, you might not be able to access networked resources, different user accounts may have different privileges, and so on…
Fallacy: Topology doesn’t change
The topology doesn’t change so long as we stay in the lab
In the wild, servers may be added and removed often, and clients (laptops, wireless and ad hoc networks) come and go: the topology changes constantly.
Implications:
- Do not rely on specific endpoints or routes.
- Abstract the physical nature of your network: The most obvious example is DNS names as opposed to IP addresses.
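The DNS-over-IP advice in one small sketch. The service name below is hypothetical; the point is that the code resolves a name at connection time rather than baking in an address:

```python
import socket

# Hypothetical service name - in a real deployment this would come from
# configuration, never a hardcoded IP address.
SERVICE_HOST = "db.internal.example.com"


def resolve_target(host: str) -> str:
    """Resolve the name each time we connect, so the machine behind it
    can move without any change to application code."""
    return socket.gethostbyname(host)
```

If the database is migrated to a new machine, only the DNS record changes; every client that resolves `SERVICE_HOST` picks up the new address automatically.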