Distributioned Systems & Networks Flashcards

1
Q

What are the 5 network layers in Tanenbaums books model

A

Application
Transport
Network
Link
Physical

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What headers does the Transport layer add

A

TCP or UDP headers

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What headers does the Network layer add

A

IP Headers

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What headers does the link layer add

A

It depends on the medium as it will add whatever headers are required for the physical medium

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is the purpose of the link layer, from a high level

A

To shield the upper layers from the specific connection type, and then to transmit bits in the form of frames

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What addressing form is used at the link layer

A

MAC (Medium Access Control Address)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What IEEE standard defines transmission for WIFI

A

IEEE802.11

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

There are a large number of physical medium types, name some

A

Important ones:
+ Coaxial Cable
+ Twisted pair
+ Power line
+ Fibre optic
+ Wireless
Minor ones:
+ Laser
+ Sound
+ Ultrasonic
+ Pulses
+ Radar

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the function of the link layer

A

The link layer:
+ Transmits frames over physical media, encapsulating IP Datagrams into link layer frames
+ Receives frames and parses the IP datagrams up the stack
+ Detects and handles transmission errors
There are many standards which have been defined and have evolved over the years

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

When being encapsulated by the link later, what do packets from higher levels become and get referred to as

A

Packets get encapsulated into FRAMES as PAYLOADS, they are prepended with headers and appended with trailers

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What level of encapsulation differs based on the type of the physical layer

A

Data frames vary depending on the physical layer, for example, an ethernet frame will have a different form than a WiFi or a Fibre frame

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is the purpose of flow control

A

Flow control regulates the flow of data to avoid swamping slow receivers from fast senders

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are two major methods of flow control at the link layer

A

+ Messages sent to the sender saying more data can be sent
+ Rate based with an agreed speed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are the three general link layer models, giving an example for each

A

+ Connectionless, no acknowledgements - i.e. wired ethernet
+ Acknowledged, connectionless service - i.e. Wifi (IEEE802.11)
Acknowledged, connection-oriented - i.e. satellite

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What does it mean for a link layer model to be called Connectionless, no acknowledgement

A

Connectionless means that no signalling path is established in advance.

No acknowledgement means frames are sent and they may or may not be received by the destination.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What does it mean for a link layer model to be called an Acknowledged Connectionless service

A

This is used to allow frames to be sent without first setting up a connection, and then allows the acknowledgement of these frames

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What does it mean for a link layer model to be called an acknowledged connection-oriented service

A

This is used to allow a connection to be established between two machines before frames are sent, these frames can then be acknowledged when they are received

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What does ARQ Stand for?

A

Automatic repeat reQuest.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is stop and wait ARQ

A

Stop and wait ARQ (Automatic Repeat reQuest) is a link layer ACK handling strategies which sends a frame and waits for an ACK before sending the next frame. If it does not receive an ACK then the frame is re-transmitted

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

At the link layer, when is an ACK not sent?

A

An ACK is not sent if the frame is lost or damaged (checksum doesn’t compute)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

At the link layer, what is Go-Back-N ARQ

A

Go-Back-N ARQ is a method of handling acks which will send multiple frames (up to the window size) before it receives the first ACK, it uses sequence numbers on the frames to ensure it gets frames in the correct order.

If a frame is received out of order it gets discarded, and an ack is then sent for the last correct, in order frame and the sender re transmits from that point

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is selective repeat ARQ at the link layer

A

Selective repeat ARQ is similar to Go-Back-N ARQ, however it only re-transmits lost frames meaning that it is acceptable for frames to be received out of order and buffered

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is the job of error detection and correction at the link layer

A

To detect errors and provide a line “free of errors” to the network layer

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is parity bit

A

Parity bit is an error detection method where a bit marks “is the number of 1’s even or odd”, but this does not reveal all errors

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
What is CRC
CRC, or Cyclic Redundancy Check, is a error detection method which holds a checksum field in the frame, this is calculated on both ends and is compared This can be used on the link layer and other layers - for example, IPv4 packets hold such a checksum
26
What is forward error correction
Forward error correction includes error-correcting codes in frames which can be used to detect AND fix errors in the transmitted data
27
Does Ethernet use ACKs, why
No, because it would be unnecessary overhead when ethernet is already reliable
28
What does the process of framing refer to at the link layer
Framing the process is adding indicators of where the frame starts and ends to the link layer data bits
29
What is one approach to framing at the link layer, explain
Using FLAGS + Using a predefined flag, append the FLAG byte value to mark the start and end of the data + If the FLAG occurs in the actual data, use an escape byte + When receiving ignore the first escape byte
30
What does the MAC protocol manage
Access to and from the physical medium, this is typically specific to the type of physical layer
31
What does CSMA/CD Stand for and what is its purpose
CSMA/CD Stands for Carrier Sence Multiple Access with Collision Detection It is used when using a single shared media in order to ensure that only one one sender is transmitting at any time
32
How does CSMA/CD work
Sender listens to see if the media is busy, if it is wait When the channel is free transmit, while you transmit, listen and stop if a collision occurs Back off before retransmitting if a collision is detected
33
How does exponetial back of in CSMA/CD work
When a collision is detected for the first time, back off for a random amount of time between MIN and x. When you transmit again, if another collision is detected increase x (i.e. double it) and wait this random amount of time again
34
Why doesn't WiFi use CSMA/CD
WiFi doesn't use CSMA/CD because WiFi devices generally can't send and listen at the same time. There is also the "hidden node" problem in which two devices can see the access point but not each other
35
What does WIFI use instead of CSMA/CD, and what does it stand for
CSMA/CA Carrier Sence Multiple Access with Collision Avoidance
36
How does CSMA/CA work with RTS/CTS
When a frame needs to be transmitted wait until the channel is idle. When it's free transmit a Request to Send, if a Clear to Send is received back transmit the data, otherwise wait a random back off time
37
What TCP/IP layer is called the network layer by the OSI model
The Internet layer
38
What OSI Layer is called the internet layer by the TCP/IP model
The network layer
39
What is the layer above the link layer
The network/Internet layer
40
What does the Internet layer provide
Unique addressing and next hop routing
41
What header does the internet layer add to a packet
The IP Header
42
Name the three key properties of the internet protocol
Packet switched - Its connectionless Unreliable - Packets are sent on a best effort basis Routed - Routers use a routing table to determine the path
43
What process does store and forward packet switching refer to
Store and forward packet switching refers to the process of sending a packet to the nearest router, which should then parse it onto a router from its routing table. This should result in packets eventually arriving at the end process.
44
What do Quality of Service methods do
They prioritise certain traffic
45
What does fragmentation refer to
Fragmentation is the process of spliting Protocol Data Units into smaller Protocol Data units for transmission so they are below the MTU value
46
What Does MTU stand for and what is it
MTU is the "Maximum Transmission Unit" and is the maximum size packet which the physical connection is able to transmit. Protocol Data Units which are larger than the MTU must be split into smaller packets
47
What are the 6 main protocols which operate at the internet layer
Internet protocol: + IPv4 + IPv6 Control and diagnostic protocols: + ICMP + ICMPv6 Encryption and Security + IPSEC Establishing of IPv4 multicast groups + IGMP
48
How can we simplify an IPv6 address
Omit the leading 0s Replace a single set of repeated 0 blocks with ::
49
Simplify 2001:0630:00d0:f500:0000:0000:0000:0064
2001:630:d0:f500::64
50
How does fragmentation differ between IPv4 and IPv6
IPv4 allows packets to be fragmented at any routing hop, while IPv6 only allows packets to be fragmented by the sending host This means, if needed, IPv4 packets can be re-assembled at intermediate routers It also means that IPv6 must use Path MTU discovery before sending
51
What does the following tell you about the network, and what is the /48 called: 2001:650:d0::/48
That the first 48 bits are common to the network, /48 is called the prefix length
52
What does the following tell you about the network, what is the /16 called, and how can it also be represented 152.78.0.0/16
The first 16 bits are common to the network, this is called the subnet mask This can also be represented ass 255.255.0.0
53
What was the original three classes of IPv4 address allocation, and why was it inefficient
Class A: /8 prefix with 16 million addresses Class B: /16 prefix with 65000 addresses Class C: /24 prefix with 256 addresses This was inefficient as, say you needed 258 addresses, you would get a /16 allocation with 65000 addresses consuming large ammouts of address space
54
What does CIDR stand for and what did it allow
CIDR stands for Classless Inter Domain Routing and allows for variable length prefixes to be used (in place of the 3 original allocation classes). This helped reduce IPv4 address consumption
55
What does subnetting allow
Subnetting allows us to limit the propagation of ethernet broadcast traffic across a network and put hosts into segments This allows for larger IP allocations to be logically divided by, for example, buildings
56
For a /24 IPv4 subnet, how many total addresses are there and how many are available
256 Total addresses 253 are usable as .0 or .255 are reserved, and one (often .1 or .254) is used for the router
57
Given the IPv4 allocation of 152.78.70.0/23 and the need to make one subnet for 200 devices and two others for 100 devices each what is a possible allocation scheme
200 devices require a /24 subnet mask (max), giving the subnet 254 addresses 100 devices require a /25 subnet mask (max), giving the subnet 126 address So one example is 152.78.70.0 /24 (152.78.70.1 -> 152.78.70.254) 152.78.71.0 /25 (152.78.71.1 -> 152.78.71.127) 152.78.71.128/25 (152.78.71.129 -> 152.78.71.254)
58
What is the smallest IPv6 prefix for hosts
/64 as there is no real need to go bigger or smaller
59
When is a router needed in a network
A router is needed any time there is a change in the address space
60
What is RFC 1918
RFC 1918 is the name of the memo which defines the private address space. These are IPv4 addresses which are for internal use within networks and are not globally routable.
61
What are the 3 address spaces defined in RFC 1918
10.0.0.0/8 with 16 million addresses 172.16.0.0/12 with 1 million addresses 192.168.0.0/16 with 65000 addresses
62
What does NAT stand for and what is its purpose
NAT stands for Network address Translation, although it is commonly also used to refer to Network Address and Port Translation (NAPT) NAT allows one global IPv4 address to be shared between multiple hosts, for example, a home network will have 1 IPv4 from the ISP and then use RFC1918 internally
63
What does CGNAT stand for and what does it allow
Carrier Grade NAT allows sharing global addresses between customers who get private addresses from a special range Customers then NATs that address to RFC1918 This should be done with the 100.64.0.0/10 block, but some abuse RFC1918
64
What does routing describe
Routing describes how packets should move between different subnets
65
At what layer is routing considered
The Internet/ Network layer
66
What two possible places can a host send a packet too
Directly to a destination if its on the same local subnet A router
67
How would you signify that the first 64 bits of a IPv6 address identifies the subnet
/64
68
How would you show a subnet mask of 255.255.255.0 in CIDR notation
/24
69
Where is the information to build a routing table taken from
DHCP or IPv6 RA
70
What does a routing table include
Destination IP prefixes and the interface or next hop to use The local subnet which the host is connected to To default route
71
What routing in the routing table will be picked first
Thr route with the longest prefix will always be picked first, if two routes have the same prefix then that with the lowest metric is picked first
72
What does prefix aggregation allow for
Prefix aggregation allows subnet prefixes to be aggregated with those of adjacent subnets
73
What do routing protocols allow for
Routing protocols allow for routers to create their own routing tables
74
What is an Autonomous System (AS)
An AS is a large network or group of networks with a unified routing policy, these make up the internet
75
What is an ASN
An ASN is an Autonomous System Number and is assigned by a Regional Internet Registry, each AS needs its own ASN
76
What are the three general categories of Autonomous Systems
+ Multihomed + Transit + Single-homed/ stub
77
Where are Interior gateway protocols used
Interior Gateway Protocols are used within an Autonomous System, such as within a corporate network
78
Where are Exterior gateway protocols used
Exterior gateway protocols are used between autonomous systems
79
What are two types of Interior Gateway Protocols
+ Distance Vector - Talk only to neighbouring routers. + Link state - Talk to all routers on the network.
80
How do Distance Vector Interior Gateway Protocols work
Each router talks only to directly neighbouring routers They then exchange the best route information for any known prefixes with direct neighbours
81
How to Link State Interior Gateway Protocols work
Each router talks to all other routers to establish full knowledge of the routers and topology in a site Routers flood information describing their connected neighbours around the entire site network
82
RIP Is an example of a routing protocol. What are 2 of its limitations
Metrics are simple hop count values limited to 15 Updates are not acknowledged Updates are only sent every 30 seconds Routers don't have knowledge of the network topology Authentication is MD5 which is broken
83
What are the 3 steps for link state routing
1. Discover neighbours and determine the cost metric 2. Flood messages with this information to all routers 3. Use received messages to build topology, computing shortest paths for prefixes served by any router These messages are sent periodically or when a change in connectivity is detected
84
What is the advantage of Link State Routing over Distance Vector
Link state converges faster, allowing changes of topologies to be detected in seconds Link state is better at avoiding loops as every node knows everything
85
How does inter site routing between AS's work
AS's advertise their network prefixes to neighbouring networks AS's can also offer to transit to other AS's
86
BGP is a Distance Vector like exterior routing protocol, what additional information prevents loops
As the path is sent when routes are advertised we can detect and prevent loops
87
What are the three downsides of BGP
+ BGP Relies on trues + BGP Is too slow and takes a lot of effort to update + Routers have limited BGP Routing table sizes
88
Where does a host on a LAN or Subnet send packets (To IP's not in its subnet)
A Default Router
89
What does an enterprise site network use to connect subnets
A Routing Protocol
90
What does UDP stand for
User Datagram Protocol
91
What does TCP stand for
Transmission Control Protocol
92
At what network layer do UDP and TCP function
The Transport Layer
93
What Transport Layer protocol supports Multicast
UDP
94
How is a TCP connection established
A three-way handshake is used: + SYN is sent by the client with random sequence numbers + SYN-ACK is sent back by the server + ACK is sent by the client meaning a connection is established
95
What protocol is used by TCP to control the sending rate, how does it work
Sliding window protocol: + The receiver has a limited incoming buffer size + The sender should not send data unless the receiver indicates it has space to receive + Otherwise the packet would need to be resent later
96
What Transport layer protocol should be used for web streaming ( not live)
TCP - this allows for buffering ahead of the video
97
What does ICMP stand for, and what is it used for
Internet Control Message Protocol is used in both IPv4 and IPv6 for information and error messages For IPv6 only it is also used for router advertisement and neighbour discovery
98
What is multicast
Multicast is One to Many communication, packets are only sent to hosts who are interested in them This is required for IPv6 and an add-on for IPv4
99
What is ARP?
Address Resolution Protocol is used to map an IPv4 address on the local subnet to a MAC address The host looking for a MAC address broadcasts an ARP "who has request" and the target sends a unicast reply to the requestor
100
What does DHCP stand for and do
Dynamic Host Configuration Protocol automates the process of Address configuration for IPv4
101
What is NDP
Neighbour Discovery Protocol Maps IPv6 addresses on the local subnet to Mac addresses It uses ICMP and multicast
102
What are the steps for DHCP
When a host connects to a network it broadcasts DHCP DISCOVER The DHCP server reserves an address and replies with a DHCP offer The client then needs to DHCP REQUEST the address The server sends a DHCP ACK containing the lease duration and config
103
What does SLAAC stand for and allow
StateLess Address AutoConfiguration allows a host to autoconfigure basic network settings without a DHCPv6 Server. The RA specifies whether this should be used or not
104
How does a host using SLAAC build its address
A 64 bit prefix determined from a router assignment A 64 bit generated host segment
105
How was the host segment for SLAAC originally generated, and what was the drawback of this
Originally the hosts MAC address was padded and used This was a privacy nightmare however as hosts could be tracked across subnets
106
What is the IPv6 equivalent for DHCP
DHCPv6, this uses DHCP Unique Identifier instead of MAC addresses, using Solicity, Advertize and Reply This can work at the same time as SLAAC
107
What is the job of an ethernet switch
To decide whether to forward the frame, and if so which port to forwards it to. The switches lean the ethernet MAC addresses of hosts seen on each switch interface or port
108
Given a datagram from the network layer, how does an ethernet link layer determine the destination's MAC address
A Link layer broadcast message is sent asking "who has this IP address", this is seen by all hosts on the same ethernet LAN and the host with the target IP responds.
109
What is a Broadcast Domain and what is its purpose
A broadcast domain is a method of restricting the number of global broadcasts which to stopfrom them flooding the network We can do this by splitting the domain up with routers which will not broadcast link-layer broadcasts.
110
What is the purpose of spanning tree algorithms in switches
Spanning tree algorithms allow physical loops to be made without breaking the network, determining least cost paths to the root and finding best paths
111
What is the purpose of a Virtual LAN
Virtual LAN's can use identifiers in the Ethernet frame to create multiple VLANs in one trunked uplink This avoids the need to physically re-cable and can be used to control broadcasts to certain areas
112
What does DNS stand for, do and at what layer does it live
Domain Name Systems allow machines to look up the IP address of hostnames. It lives on the application layer
113
womp
womp
114
What was telnet and what was it replaced with
Telnet was simple unencrypted terminal emulation protocol which was replaces by SSH and SCP
115
What is SMTP
Simple Mail Transfer Protocol is a protocol used to send email where a TCP connection is made to a mail server and
116
What is IMAP
Internet Message Access Protocol is another email protocol which keeps a TCP connection open with the server
117
What is HTTP and what is it used for
HTTP is text based protocol which uses text base messages
118
What is QUIC
Quick UDP Internet Connections, this is a UDP protocol which allows TCP connections to be made over UDP
119
What is CoAP
Constrained Application Protocol provides a HTTP like protocol for simpler devices with minimal overhead It uses binary GET/PUT etc commands making messages small and uses simple subscription methods
120
What is MQTT
MQTT is a hierarchical protocol where messages get published to brokers and then get shared with any clients who are subscribed to the data streams All data is raw to avoid overhead
121
What are the two most common DNS Record Types, and what do these store
AAAA - IPv6 records A - IPv4 records These are used to store host name to IP conversions
122
What are the two modes for DNS servers
Iterative - This means that the server will respond with a referral to another server Recursive - This means that the server will respond from the local cache or resolve the query before responding
123
What would a DNS server query if it could not resolve a hostname
One of the root name servers
124
What does any cast allow
Any cast allows clients to reach the nearest instance of a service, this means the same IP can be used at multiple points, and routers will learn of the nearest instance
125
How is quartz used to measure time
As quartz resonates at a precise frequency we can count the number of oscillations to measure time
126
What is the difference between physical clocks and logical clocks
Pysical clocks count the number of seconds which have parsed while logical clocks count the number of events
127
What is Christian's algorithms purpose
To synchronize time with a time server
128
How does Christians Algorithm work
1) The client sends a request to the clock server for the time 2) The clock server responds by returning the clock server time 4) The client process receives the response and uses it to calculate the synchronized client clock time The new client time is the server time plus half the round trip time
129
What are the steps for Berkeley's Algorithm
1) Use a leader election process to chose a co-ordinator node 2) The co-ordinator requests the time from each node 3) The co-ordinator should use cristians algorithm to fetch the time from each node 4) The coordinator should calculate the average time difference and add it to the current time of the co-ordinator's clock 5) Broadcast the co-ordinators current time over the network
130
What steps can be taken to improve Berkeley's algorithm
1) Ignore significant outliers when calculating the average time difference 2) I second leader should be pre-chosen incase the coordinator fails or corrupts 3) Broadcast the relative inverse time difference instead of the synchronized time
131
What is stratum 0 in the Network Time Protocol
Stratum 0 is the level assigned to the highest precision clocks
132
What are the steps for a client to work out its round trip network delay from the NTP server
1) Client sends (T_0, _, _, _) 2) Server receives and adds T_1 (T_0, T_1, _, _) 3) Server sends and adds T_2 (T_0, T_1, T_2, _) 4) Server receives and adds T_3 (T_0, T_1, T_2, T_3) The round trip network delay can then be worked out as (T_3 - T_0) - (T_2 - T_1)
133
Having calculated the average round-trip network delay a number of times, what can the client do to correct its clock
Slowly adjust the clock (slewing) Reset the clock (stepping) Panic
134
What does a -> b mean when discussing events
event a happens before event b
135
When can we say a -> b given a and b are events
When ONE of the following is true: + When a and b occured at the same node and a occurred before b in that nodes local execution order + When a is the sending of some message m and b is the receipt of the same message m + When there exists an event c such that a -> c and c -> b
136
What are events a and b considered to be when neither a -> b or b -> a
In this case a and b are concurrent, this is written as a || b
137
What do logical clocks count
The number of events which occur
138
What does it mean for a to have a causal dependency on b
If a has a causal dependency on b then a might have had some role in causing b (e_1 -> e_2) => (T(e_1) < T(e_2))
139
For the Lamport clock algorithm, when is t changed
When an event occurs or a message is sent, t is incremented by one If a message is received the t value is set to max(message_t, current_t) + 1
140
For Lamport clocks, what does L(e) refer to
L(e) refers to the value of t after an increment caused by event e
141
What can we deduce when using Lamport clocks from a->b
if a->b then L(a) < L(b)
142
What are the limitations of Lamport Clocks
With Lamport clocks and L(a) < L(b) we can't tell which happened before or whether the events are concurrent
143
What form does the timestamp of an event a take when using vector clocks
v(a)=
144
When using vector clocks, what does a node N_i increment in the vector
Node N_i increments T[i] by one when an event occurs at it
145
When using vector clocks, on receiving the message timestamp pair: (T', m) at node N_i,how do we update our nodes timestamp
For each j in 1 to n T[j] := max(T[j], T'[j]) + 1 T[i] := T[i] + 1
146
What is Mutial Exclusion
Mutual exclusion is the name given to the process of concurrency control when running multiple processes in parallel
147
What does the Critical Section refer to
The critical section is the name given to variables or resources which can be accessed by more than one code segment
148
Why is correct control of the critical section important
Because the variables in it need to remain consistent, and the value depends on the sequence of execution of instructions
149
How many processes can exist in the critical section at the same time
0 or 1
150
What are the 4 requirements for mutual exclusion
+ No Deadlocks + No Starvation - Every site which want to execute in the critical section should get the chance to + Fairness - Every site should have a fair chance to execute in the critical section + Fault Tolerance
151
What do safety properties ensure
Safety properties ensure bad things don't happen
152
What do liveliness properties ensure
Liveliness properties ensure good things do happen
153
What does a fairness property ensure
Fairness properties ensure that access to the critical section is done in a fair order (duh)
154
What does a performance propertys do
They: + Minimise the number of messages sent to each entry or exit op + Minimise client delay when entering or exiting the critical section + Minimise synchronization delay between when a process exits the critical section and the next one enters
155
What are the three general forms of mutual exclusion algorithms
Token Based Algorithms Non Token Based Algorithms Quorum based approach
156
What are the properties of token-based mutual exclusion algorithms
+ A Unique token is shared among all sites + A site can only enter the critical section if it has this token + Sequence numbers are used to order requests
157
What is the process behind the centralised mutual exclusion algorithm
With this one node is chosen as the coordinator, when a node wants to enter the critical section must request the token from the coordinator, and only get given the token once its been released by the last process to use it
158
What is the process behind the Ring Token Mutual Exclusion Algorithm
Each site in the system has a queue of tasks which it needs the critical section for, it also has a pointer to the next site in the ring The token is cycled around the ring, when held by a site the site can complete one of the tasks from its queue before handing the token to the next site in the queue
159
Is the ring token algorithm fair
No, because order is based on the position of the token and the shape of the ring
160
What is the big drawback of centralised algorithm
The server acts as a single point of failure and bottleneck
161
What are the setup requirements of Raymonds tree based algorithm
Each node should be given a parent A child node can only send requests to its parents Each node has a FIFO queue of requests If any node is forwarding privilege to other nodes and has a non-empty queue, it forwards a request message
162
What are the steps of Raymonds Tree Based Algorithm
1. If a node i (not holding the token) wants to use the token, so that it can enter the critical section, it sends a request to its parent, node j. – If node j FIFO queue is empty, node j shifts i into its FIFO queue; j then issues a request to its parent, k, that it desires the token – If node j FIFO queue is not empty, it simply shifts i into the queue 2. When node k has token and receives the request from j it sends token to j and sets j as its parent 3. When node j receives the token from k, it forwards the token to i and i is removed from the queue of j – If the queue of j is not empty after forwarding the token to i, j must issue a request to i in order to get the token back
163
What node is the root at a given time in Raymonds Tree Algorithm
Whichever node is holding the token is the root
164
165
What do Non Token based algorithms for mutual exclusion use in place of sequence numbers
Non Token based algorithms use timestamps using logical clocks
166
What does a request message in the Ricart Agrawala Algorithm include
The site ID and the timestamp
167
What do all sites in the Ricart Agrawala Algorithm have
A queue and a site ID
168
What are the steps for a site i (Ricart–Agrawala) to enter the critical section
Send a broadcast to all sites containing the site ID and timestamp All sites which receive this will add it to the queue and respond if and only if a) the receiving process is not currently interested in the critical section b) The receiving process has a lower priority based on the timestamp The requesting site waits until all sites have replied before it uses the critical section When the requesting site has exited, it can send any deferred response messages
169
What does each node have in Maekawa's algorithm, and what is the initial sate of these values
Queue defaulting to [] Boolean voted = TRUE | FALSE defaulting to FLASE state = WANTED | RELEASED | HELD defaulting to RELEASED
170
What is a request subset in Maekawa's Algorithm
The request subset is the set of processes that a process must request permission to enter the critical section from
171
For Maekawa's algorithm, what does P_i do if it wants to enter the critical section
Set state to WANTED Multicast "request" to all processes in R_i Wait until k reply messages are received Set state to HELD
172
For Maekawa's algorithm, what does P_i do if it wants to exit the critical section
Set state to RELEASE Multicast release to all processes in R_i
173
For Maekawa's algorithm, what does P_j do if it receives a request from P_i
If the state is HELD or voted is TRUE then Queue request Else Send a reply to p_1 and set voted to TRUE
174
For Maekawa's algorithm, what does P_j do if it receives a release from P_i
If the Queue is empty then set voted to false else pop the head of the queue and call it p_x Send p_x the reply Set voted to true
175
What can be done to make Maekawa's algorithm Fair
Use vector clocks instead of lamport clocks
176
What is the entry and exit performance of Maekawa's algorithm
2 sqrt(N) for entry and sqrt(N) for exit
177
What are three use cases for leader elections
1) When each process is roughly the same 2) When the cluster is performing a complex task which requires close collaboration 3) When the system executes many distributed writes to a disk and requires good consistency
178
For the Ring Leader Election Algorithm, what does a process do when it determines an election is required
It marks itself as a participant and sends a election message with its identifier to its left neighbour
179
For the Ring Leader Election Algorithm, What does a process do when it receives an election message with an identifier GREATER than its own
It forwards the message
180
For the Ring Leader Election Algorithm, What does a process do when it receives an election message with an identifier LESS than its own
IF it is currently a non-participant THEN It marks itself as a participant It substitutes its own identifier It forwards the election message ELSE It does nothing
181
For the Ring Leader Election Algorithm, What does a process do when it receives an election message with an identifier EQUAL TO its own
It declares itself a leader It marks itself as non-participant It sends an elected message with its identifier to its left neighbour
182
For the ring leader election algorithm, What does a process do when it receives an elected message
It marks itself as non-participant an notes the identifier of the leader from the message If the identifier is not its own it parses the message on If the identifier is its own it can act as leader
183
What is the best case total messages for the Ring leader election algorithm, how is this split up and when does it happen
N election messages followed by N elected messages Total messages: 2N messages This happens when the node which initiated the election has the highest identifier
184
What is the worst case total messages for the Ring Leader election Algorithm, how is this split up and when does it happen
N-1 election messages (to get to the highest ID), followed by N election messages of ID N, followed by N elected messages Total Messages: 3N-1 This happens when the node to the right of the initiator has the highest ID
185
For ring Leader election, What happens if multiple processes initiate elections concurrently
The participant flag triggers a process to stop election messages with lower identifiers If every process started an election at the same time, and identifiers in the ring are ordered, N-1 additional elections would take place causing N(N-1)/2 additional messages to be sent
186
What are the message types for the Bully Election Algorithm
Election message to announce an election Answer Message to respond to election messages Coordinator message sent by the winner of an election to announce victory
187
For Bully Election: What does a process P do if it notices the current coordinator has failed and it has the highest ID
It broadcasts a Coordinator message notifying that it is the coordinator now
188
For Bully Election: What does a process P do if it notices the current coordinator has failed and it does not have the highest ID
It sends an Election message to all processes with higher IDs than itself If P receives an answer from a process with higher ID than itself it sends no further messages and waits for the coordinator message If there is no answer after sending the election message within a period of time, then the process broadcasts a Coordinator message and becomes the coordinator
189
For Bully Election: What happens if P receives an election message from a process with a lower ID
It sends an answer message back, if an election is not already started it will start the election message at the beginning by sending election messages to higher-numbered processes
190
For Bully Election: What is the best case number of messages and when does this occur
The best case occurs when the failure of the coordinator was detected by the process with the second highest identifier, this can elect itself as the leader in one message One message
191
What is the worst case total number of messages for Bully election and when does this occur
The worst case occurs when the failure of the coordinator is detected by the process with the lowest identifier triggering all other processes to begin elections This sends N(N-1) messages (or O(N^2)), giving a turnaround time of O(N) as two messages need to be sent, (and a timeout)
192
What safety issue arises with bully elections
If a process fails during a bully election, there may end up being two coordinators
193
Can Bully Elections tolerate multiple elections
Yes
194
What are the three inherent characteristics of Distributed Systems?
Concurrency No Global Clock Independent Failures
195
What is the baseline physical model of a distributed system
Components of they system are located at computers Computers are interconnected by a network Components communicate by passing messages over the network
196
What are the elements of Distributed systems from a system-oriented perspective
Processes - which communicate through inter-process channels Nodes - which are used if process abstraction is not available Threads - which are used as lightweight execution means
197
What are the elements of distributed systems from a problem-oriented perspective
We can abstract problems for distributed systems to consider Objects which we can share around the system and model the entitys of the problem Components, of the system, which use Objects Web services which provide a means of accessing the system
198
What are the methods for inter-process communication in a distributed system
Message Passing Sockets
199
What are three methods for executing code on a distributed system ## Footnote Remote invocation methods for a distributed system
+ Request reply protocols such as HTTP + Remote Procedure Call which request the execution of a remote function and receive the response + Remote method invocation - Matched with problem-oriented communication entities these are an object-oriented version of remote procedure call
200
When mapping a service over multiple servers, what are the two methods of splitting data objects over the servers
Replication Partitioning
201
What are the pros and cons of replication of data across multiple servers
Pros: Better fault tolerance and load balancing Cons: There a need to ensure data consistency
202
What are the pros and cons or partitioning of data across multiple servers
Pros: No overhead for data consistency Cons: No real fault tolerance
203
What can a Proxy Server provide to a distributed system
A proxy server is able to provide caching of recently used data objects closer to the client
204
What are the 5 key architectural patterns for distributed systems
+ Layering + Tiering + Thin clients + Proxying + Brokerage
205
What is the purpose of the layering architectural pattern
To hide details of software application from the higher levels of a system
206
What is the purpose of tiering
Tiering organises separate layers of functionality onto separate servers, for example clients could connect to tier 2 application servers which can then connect to teir 3 database servers
207
What does consensus refer to in distributed systems
Consensus refers to the process of reaching an agreement among piers in a distributed system
208
What are the three requirements for consensus algorithms
All processes must eventually terminate All processes must agree on the same value It must be done with integrity -- If all correct processes propose the same value or action then any correct process that has decided must choose that value or action.
209
How could consensus be implemented with total order multicast
1. Each process can propose its value for consensus and multicast it to all other processes 2. Upon receiving proposals each process can compare the proposed values and attempt to converge towards a single value based on rules or conditions 3. Through repeated rounds of this proposal and communication processes are able to converge towards a consensus value which satisfies the criteria of the consensus problem
210
What are the three forms of communication model (Distributed systems)
Unicast: A message is sent from one sender to one recipient Broadcast: A message is send from one sender to all recipients in the network Multicast: A message is sent from one sender to a specific group of recipients
211
What two forms of multicast are available for distributed systems
Application Layer multicast Network-assisted multicast
212
What IP standard is best for Multicast traffic
IPv6 is best for multicast traffic as the specification was designed with it in mind, whereas IPv4 had it as an afterthought
213
Is IP Multicast reliable
No, multicast messages may not arrive in the order they were sent Messages may also fail to reach some or all of the intended recipients
214
What are the 4 forms or reliable multicast
+ FIFO multicast + Causal multicast + Total order multicast + FIFO-Total order multicast
215
What is FIFO Multicast
If m1 and m2 are broadcast by the same node and broadcast(m1)->broadcast(m2) then m1 must be delivered before m2
216
What is Causal multicast
If broadcast(m1)->broadcast(m2) then m1 must be delivered before m2
217
What is total order multicast
Total order multicast means that if m1 is delivered before m2 on one node, then m1 must be delivered before m2 on all nodes
218
What is FIFO-Total order multicast
FIFI-Total order multicast is a combination of FIFO multicast and total order multicast
219
What are heathers slides
Shit
220
What do broadcast algorithms ensure
Broadcast algorithms ensure that messages are reliably sent to all nodes, that every node receives the message and that the messages are delivered in a specific order
221
What is the idea behind the eager reliable broadcast strategie and whats its big drawback
That each time a node recives a message for the first time it rebroadcasts it to each other node via reliable links This is reliable but can cause up to O(n^2) messages for n nodes
222
Describe basic multicast
Basic multicast allows a single sender to transmit data to multiple recipients at the same time, this guarantees that if the multi caster does not crash a process will eventually deliver the message This can have open or closed groups, where a closed group means that only group members can multicast
223
What is one instance when basic multicast can fail
If the sender crashes in the middle of the send loop then only some of the processes will receive the message
224
What is the primitives Basic Multicast provides
B-multicast(g, m) - For each process p in group g send message m using send(p, m) B-receives(m) - deliver the message to the process
225
How does R-Multicast work to send message m to all processes in group g
1) Use B-Multicast(g, m) to send the message to all processes in the group including itself 2) When a message m is received for the first time each recipient must multicast m to the group 3) After multicasting, deliver m to the process
226
What does the trait of integrity for reliable multicast require
That each non faulty process delivers message m at most once
227
What does the trait of agreement for reliable multicast require
That if one correct process delivers m, all others in the group will eventually do so too
228
What does the trait of validity for reliable multicast require
That any correct process multicasting m will eventually deliver it, this ensures sender livelyness
229
Is R-Multicast considered reliable, if so why
Yes, as it has: + Integrity (as it detects duplicates) + Validity (as a correct sender will eventually R-deliver a message) + Agreement (as if a sender crashes before delivering the message either it didn't get sent to anyone or was received and will be re shared by one)
230
What does R-Multicast not guarantee
R-Multicast provides no guarantee for the order of delivery
231
How does the ordered multicast FIFO Algorithm work
Each process keeps a sequence number for each of the other processes. When a message is received, we check the message number, + if this is as expected we accept and increment + If this is higher than expected we buffer it in a queue + If it is lower than expected we reject
232
For R-Multicast, how many B-Multicast messages are sent for one R-Multicast Message
O(N^2) B-multicast messages
233
What does FIFO multicast store
S^p_g holding the number of messages p has sent to g R^q_g holding the number of the latest group g message from q A hold back queue of messages
234
How does a process p send a FIFO Ordered multicast m to g
p increments S^p_g by 1 p piggy-backs the value of this onto the message p B-multicasts this message m to g
235
What does a process p do when it receives a message m from q with a sequence number of S
GIVEN R^q_g is the sequence number of the latest group g message delivered from q IF (S == R^q_g + 1) THEN p FO-delivers m and increments R^q_g ELSE IF ( S > R^Q_g + 1) THEN p places the message in the hold back queue until the messages which should have been received are delivered
236
What form of timestamps are required for causal ordering, what do they count
Vector timestamps These count the number of multicast messages already delivered from each process
237
When sending a Causal ordered multicast, what is done to the timestamp before its sent
The processes entry in the timestamp is incremented
238
What is required for a totally ordered multicast, what value does this maintain
One dedicated sequencer, which is responsable for giving an auto-incremented sequence number for each total order multicast message This maintains a S_g value which is the current sequence number in group g
239
What does a group member need to do to send a message m to group g, what do the group members and the sequencer then need to do
SENDER: Add an id `i` to m and B-multicast to both g and the sequencer GROUP RECIPIENT of : Add to the hold back queue SEQUENCE RECIPIENT of : B-multicast(g, <"order", i, S_g>) and then increment S_g GROUP RECIPIENT of <"order", i, S>: Wait until is in the hold back queue and S=R_g, when this is the case deliver the message to the process, delete it from the holdback queue and increment counter R_g
240
If F processes may crash, how many are needed to form a consensus, and how many rounds may be needed
F+2 processes as we need at least 2 to form a consensus F+1 rounds as their may be failures during the rounds
241
For a system with F CRASH failures, how many machines are required for consensus
N >= F+2
242
For a system with F MALICIOUS FAILURES, how many machines are required for consensus
N > 3F
243
What is the idea behind Primary-Backup replication
Clients send requests to primary servers which are connected to the backup server, when a primary server wishes to execute an action it also executes it on the backup servers
244
What does C, A, and P stand for in CAP Theorem
Consistency - All replicas of the same data object always have the same state Availability - Requests are always served so long as at least one server is available Partition tolerance - The data store keeps working even if servers are partitioned
245
What does a CP System ensure
A CP system ensures that data is always consistent and there is a high tolerance for failures, this can result in actions such as concurrent read write failing
246
What does an AP system ensure
AP Systems only garuntee eventual consistency, asynchronus periods mean that read operations can return non consistent values
247
What does a CA System ensure
CA systems give up partition tolerance. This means they assume that partitions cannot occur and everything is on one partition
248
What is the goal of having loose coupling between web services, and how should this be achieved
The goal of loose coupling is to minimise the dependencies between different web services It can be achieved by focusing on interfaces rather than implementation, designing simple and generic interfaces, and by opting for asynchronous communcation
249
What are the two communication paradigms for web services
Asynchronous - good for time consuming operations where replies can be received later on Synchronous - Good for interactive, fast operations
250
What does SOAP Stand for and define
Simple Object Access Protocol defines how synchronous and asynchronous interactions over the internet should take place, how to use XML to represent the content of messages and how messages should be exchanged
251
For how long should a REST session span
One request, as REST should be stateless
252
What is a transaction in a distributed system
A transaction is a set of related, sequential operations which need to be executed atomically
253
What are the ACID Properties
Atomicity - a transaction must be all or nothing Consistency - a transaction takes the system from one consistent state to another Isolation - each transaction must be performed without interference from other transaction Durability - After a transaction has completed successfully all of its effects are saved in permanent storage
254
When are two concurrent operations conflicting
Two concurrent operations conflict if their combined effect depends on the order in which they are executed, a read and a read on the same object does not conflict, but a read and a write does
255
What does it mean for two transactions to be serially equivalent
It means that all pairs of conflicting operations in the two transactions must be executed in the same order for all the objects they both access
256
What are the three approaches to concurrency controls
Locks Optimistic concurrency control Timestamp ordering
257
Given each data item is already associated with a version number, what are the three stages for optimistic concurrency control
Read Phase: Read the version of the data item's which need to be modified Validation Phase: Check the data items accessed by the transaction to see if they have been modified by any other transactions since the transaction began, if the check fails roll back Commit Phase: If the validation phase succeeds write the changes with a new version to reflect the updates
258
What are the two main scenarios for replacing client-server with Peer-to-Peer
Replace clients AND servers with peers Replace just servers as peers
259
What does a high level of churn refer to
High level of churn refers to the independent arrival and departure of thousand or millions of peers
260
How are peers within BitTorrent networks incentivised
Peers are incentivised to upload chunks by giving preference to client peers which act as server peers, and chocking those which don't upload enough
261
How does a bit torrent client go about finding a file in the torrent
Use its hash to find the node in the DHT that acts as the tracker for the file + Use this tracker to learn which machines store the file which is being looked for
262