Chapter 2: The Application Layer Flashcards

1
Q

How does a client-server application architecture work?

A

A server is an always-on host with a permanent IP address, typically in data centres.

Clients do not communicate directly with each other; each client communicates only with the server, which can relay data between clients if the application requires it.

2
Q

What is the peer-to-peer (P2P) architecture?

A

Arbitrary end systems communicate directly with each other. These (termed peers) request a service from other peers, while providing a service in return to other peers.

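Peers are intermittently connected and may change IP addresses. Each communicating pair still has a ‘client’ process (which initiates the communication) and a ‘server’ process (which waits to be contacted).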

3
Q

What is a process?

A

A program running within a host.

4
Q

What is a socket (or API)?

A

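A software interface between the application layer and the transport layer within a host, analogous to a letterbox: a process sends messages into, and receives messages from, its socket.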

5
Q

How are processes addressed?

A

Each process is associated with a port number. To send to a process, the sender identifies it by the IP address of its host together with that port number.
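
A minimal sketch of addressing by (IP address, port), using two UDP sockets on the loopback interface. The loopback address, the OS-chosen port and the message text are illustrative assumptions, not part of the card.

```python
import socket

# Receiver: bind a UDP socket to the loopback address; port 0 asks the OS for a free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # avoid blocking forever if nothing arrives
receiver_address = receiver.getsockname()  # (IP address, port) identifies the receiving process

# Sender: addresses the message with the receiver's (IP address, port) pair.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", receiver_address)

message, sender_address = receiver.recvfrom(1024)
print(receiver_address, message)

sender.close()
receiver.close()
```

The sender never names the receiving program itself, only the address pair its socket is bound to.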

6
Q

What are the four main requirements of data transmission that should be considered when choosing a transport service protocol?

A

Data Integrity
Throughput
Timing Guarantees
Security.

7
Q

Give the things that an application protocol defines.

A

Types of message exchanged (request, response).
Message syntax (how fields are delineated).
Message semantics (meaning of information in fields).
Rules for when and how processes send and respond to messages.

8
Q

Which of the following does TCP provide? Which does UDP provide?

reliable transport
timing
minimum throughput
congestion control
flow control
security
no set-up required

A

TCP: reliable transport, congestion control, flow control

UDP: no set-up required

Neither provides timing guarantees, minimum throughput or security.

9
Q

Why is HTTP a stateless protocol?

A

The server doesn’t maintain any information about past client requests.
[If a client asks again for an item, the server doesn’t recall that it has already been sent.]

10
Q

Compare persistent and non-persistent connections.

A

In non-persistent connections, only one object can be sent over each TCP connection, which is then closed. In persistent connections, multiple objects can be sent over a single TCP connection between the client and the server.

11
Q

What are the steps in a non-persistent HTTP connection?

A
  1. The HTTP client initiates a TCP connection to the server.
  2. The server accepts the connection.
  3. The client sends the HTTP request to the server.
  4. The server sends the HTML file in an HTTP response.
  5. The server closes the TCP connection.
  6. The client receives the file and finds that it references other objects.
    [Steps 1-5 are repeated for each object].
12
Q

What is the response time for a non-persistent connection, with a base page and 10 objects?

A

1 RTT for TCP initiation
1 RTT for the HTTP request and the first few bytes of the HTTP response to return
Then there is the file transmission time.

Thus, with the objects fetched one after another (no parallel connections), we have (2 RTT + file transmission) + 10 * (2 RTT + object transmission).
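
The formula above can be checked with a small calculation. This is a sketch only: the RTT and transmission times are made-up values, and it assumes objects are fetched serially.

```python
def non_persistent_response_time(rtt, base_tx, object_tx, num_objects):
    """Time to fetch a base page plus num_objects objects, one TCP connection each, serially."""
    base_page = 2 * rtt + base_tx        # TCP set-up + request/response + transmission
    each_object = 2 * rtt + object_tx    # the same cost again for every referenced object
    return base_page + num_objects * each_object

# Hypothetical values, in seconds
total = non_persistent_response_time(rtt=0.1, base_tx=0.05, object_tx=0.02, num_objects=10)
print(round(total, 2))
```

With these numbers the 22 RTTs dominate the total, which is the motivation for persistent connections.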

13
Q

What is the response time for a persistent connection, with a base page and 10 objects?

A

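1 RTT for TCP set-up (paid once)
1 RTT for the HTTP request and response for the base page
1 RTT per further object if requests are sent one at a time (10 RTT), or about 1 RTT in total if the 10 requests are pipelined
11 * object transmission
= 12 RTT + 11 * object transmission without pipelining, or about 3 RTT + 11 * object transmission with pipelining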

14
Q

Give a standard example of an HTTP GET request and header lines for a non-persistent connection.

A

GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/5.0
Accept-language: en

15
Q

What are the HEAD and PUT HTTP methods used for?

A

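HEAD asks for the response that a GET would produce but without the object itself, so it is often used for debugging. PUT is used for uploading an object to a specific path (directory) on a web server.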

16
Q

Give a standard response HTTP message.

A

HTTP/1.1 200 OK
Connection: close
Date: Tue, 18 Aug 2015 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 18 Aug 2015 15:11:03 GMT
Content-Length: 6821
Content-Type: text/html

17
Q

What do the following sample codes represent?
200, 301, 400, 404, 505

A

200 - OK
301 - Moved Permanently
400 - Bad Request
404 - Not Found
505 - HTTP Version Not Supported
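
As a quick cross-check, the same reason phrases can be looked up with the `HTTPStatus` enum from Python's standard library. The list of codes is simply the one from this card.

```python
from http import HTTPStatus

codes = [200, 301, 400, 404, 505]

# Map each status code to its standard reason phrase.
phrases = {code: HTTPStatus(code).phrase for code in codes}

for code, phrase in phrases.items():
    print(code, "-", phrase)
```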

18
Q

What are the four components of cookies?

A
  1. Set-cookie header line in the HTTP response message
  2. Cookie header line in the next HTTP request message
  3. Cookie file kept on user’s host, managed by user’s browser
  4. Back-end database at website
19
Q

Explain the steps of setting a cookie and using it at a later time.

A
  1. The client sends an HTTP request message to the server.
  2. The server creates a cookie ID and an entry for it in its back-end database.
  3. The server sends an HTTP response with a Set-cookie header line carrying the cookie ID.
  4. The client includes that cookie ID in a Cookie header line on later requests, so the server can recognise the user.
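
The steps above can be sketched as a toy simulation. `ToyServer`, the starting ID 1678 and the paths are hypothetical names chosen for illustration; a real server would also set expiry and scope attributes.

```python
import itertools

class ToyServer:
    """Hypothetical server that recognises users by a cookie ID."""

    def __init__(self):
        self._next_id = itertools.count(1678)  # arbitrary starting ID
        self.backend = {}  # back-end database: cookie ID -> list of requested paths

    def handle(self, path, cookie=None):
        """Return the response headers for a request, setting a cookie for new users."""
        headers = {}
        if cookie is None or cookie not in self.backend:
            cookie = str(next(self._next_id))
            self.backend[cookie] = []
            headers["Set-cookie"] = cookie  # only sent when a new ID is created
        self.backend[cookie].append(path)
        return headers

server = ToyServer()
first = server.handle("/index.html")                     # no cookie yet
cookie_id = first["Set-cookie"]                          # browser stores this ID
second = server.handle("/cart.html", cookie=cookie_id)   # later request carries the ID
print(cookie_id, server.backend[cookie_id])
```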
20
Q

Give four uses of cookies.

A

Authorisation
Shopping carts
Recommendations
User session state (web email)

21
Q

Suppose a browser requests an image from a server with a proxy and cache hit. Describe the exchange.

A
  1. The browser establishes a TCP connection to the proxy server and sends an HTTP request for the object to the proxy server.
  2. The proxy server finds a copy of the object in its cache and returns it to the browser within an HTTP response message.
22
Q

Suppose a browser requests an image from a server with a proxy and cache miss. Describe the exchange.

A
  1. The browser establishes a TCP connection to the proxy server and sends an HTTP request for the object to the proxy server.
  2. The proxy server finds that it does not have the object, so it opens a TCP connection to the origin server and sends an HTTP request for the object.
  3. The origin server sends the object within an HTTP response to the proxy server.
  4. The proxy server stores a copy in its cache and returns the object to the browser in an HTTP response message.
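
The hit and miss cases can be sketched together in a few lines. The `Proxy` class and the dictionary standing in for the origin server are assumptions made for illustration; no real network traffic is involved.

```python
class Proxy:
    """Toy web cache: serves from its store on a hit, fetches from the origin on a miss."""

    def __init__(self, origin):
        self.origin = origin        # dict: URL -> object bytes (stand-in for the origin server)
        self.cache = {}
        self.origin_requests = 0    # how many times the origin had to be contacted

    def get(self, url):
        if url in self.cache:
            return self.cache[url], "hit"
        self.origin_requests += 1   # cache miss: ask the origin server
        obj = self.origin[url]
        self.cache[url] = obj       # keep a copy for later requests
        return obj, "miss"

proxy = Proxy({"/logo.png": b"image-bytes"})
first = proxy.get("/logo.png")
second = proxy.get("/logo.png")
print(first[1], second[1], proxy.origin_requests)
```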
23
Q

An institutional network is connected to the Internet with a 15 Mbps link. The average object size is 1 Mb and the average request rate from the browsers to the origin servers is 15 requests per second.

Assume the request messages are negligibly small - so don’t contribute to traffic.

What is the traffic intensity on the access link?

A

15 requests/sec * 1 Mb/request * 1/15 s/Mb = 1

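(A traffic intensity close to 1 means requests arrive about as fast as the link can carry them, so the queuing delay on the access link grows very large.)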

24
Q

An institutional network is connected to the Internet with a 15 Mbps link. The average object size is 1 Mb and the average request rate from the browsers to the origin servers is 15 requests per second.
There is also a proxy server in the institutional network, which uses a 100 Mbps LAN.

Assume the request messages are negligibly small - so don’t contribute to traffic.

Assume the Internet delay is 2 seconds.

What is the average delay for a request, if the cache hit rate is 0.4?

A

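Only cache misses (60% of requests) use the access link, so its traffic intensity falls to 0.6 * 15 * 1 / 15 = 0.6.

Then:
Cache hit time is approx 1 Mb / 100 Mbps = 0.01 s
Cache miss time is approx 2 s + 0.01 s = 2.01 s

Thus, average delay = 0.4 * 0.01 + 0.6 * 2.01 ≈ 1.21 s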
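
The arithmetic can be reproduced directly. This sketch follows the card's own approximation, which ignores queuing and transmission delay on the access link for cache misses; the variable names are illustrative.

```python
link_rate_mbps = 15.0
lan_rate_mbps = 100.0
object_size_mb = 1.0
requests_per_sec = 15.0
internet_delay_s = 2.0
hit_rate = 0.4

miss_rate = 1 - hit_rate

# Only misses cross the access link.
access_intensity = miss_rate * requests_per_sec * object_size_mb / link_rate_mbps

hit_time_s = object_size_mb / lan_rate_mbps    # served over the LAN from the proxy
miss_time_s = internet_delay_s + hit_time_s    # Internet delay plus LAN delivery

average_delay_s = hit_rate * hit_time_s + miss_rate * miss_time_s
print(round(access_intensity, 2), round(average_delay_s, 2))
```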

25
Q

How does the conditional GET work and why is it needed?

A

It is needed to stop a web cache (proxy server) from serving stale copies of objects.

  1. When the proxy server forwards an object from the origin server to the client, it stores the Last-Modified date along with the cached copy.
  2. When a browser requests that object later, the proxy issues a conditional GET: a GET request that includes an If-modified-since header line carrying the stored date.

3a. If the object has not been modified, the origin server returns status code 304 (Not Modified) with an empty entity body, and the proxy serves its cached copy.

3b. If it has been modified, the origin server returns the new object with its new Last-Modified date.
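
A minimal sketch of the decision the origin server makes for a conditional GET. The function name, the dates and the body bytes are hypothetical; real servers parse the date from the If-modified-since header text.

```python
from datetime import datetime, timezone

def origin_respond(last_modified, if_modified_since, body):
    """Return (status code, entity body) as an origin server would for a conditional GET."""
    if if_modified_since is not None and last_modified <= if_modified_since:
        return 304, b""      # Not Modified: the cached copy is still fresh
    return 200, body         # modified (or unconditional GET): send the object

cached_stamp = datetime(2015, 9, 9, 9, 23, 24, tzinfo=timezone.utc)

# Object unchanged since the proxy cached it
unchanged = origin_respond(cached_stamp, cached_stamp, b"object-bytes")

# Object modified after the proxy cached it
newer_stamp = datetime(2015, 9, 10, 8, 0, 0, tzinfo=timezone.utc)
changed = origin_respond(newer_stamp, cached_stamp, b"object-bytes")

print(unchanged[0], changed[0])
```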

26
Q

How does HTTP/2 fix the HOL issue with HTTP/1.1?

A

HTTP/1.1 has a Head of Line (HOL) blocking problem: objects are sent in order over a single persistent TCP connection, so if there is a large object, like a video, all later objects are delayed behind it.
Normally multiple parallel connections are then needed to reduce user-perceived delay.

In HTTP/2, each message is split into small frames, and the frames of different request and response messages are interleaved on the same connection, so a small object does not have to wait behind a large one.
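
The effect of interleaving can be sketched by counting frame slots. This is an illustration under simple assumptions (equal-sized frames, strict round-robin, made-up object sizes), not the HTTP/2 scheduling algorithm itself.

```python
def finish_slots_in_order(sizes, frame_size):
    """Frame slot at which each object completes when objects are sent whole, in order."""
    finished, slot = [], 0
    for size in sizes:
        slot += -(-size // frame_size)   # ceiling division: frames needed for this object
        finished.append(slot)
    return finished

def finish_slots_interleaved(sizes, frame_size):
    """Frame slot at which each object completes when frames are sent round-robin."""
    remaining = [-(-size // frame_size) for size in sizes]
    finished = [0] * len(sizes)
    slot = 0
    while any(remaining):
        for i in range(len(remaining)):
            if remaining[i]:
                slot += 1                # send one frame of object i
                remaining[i] -= 1
                if remaining[i] == 0:
                    finished[i] = slot
    return finished

sizes = [1000, 100, 100]   # hypothetical: one large object ahead of two small ones
in_order = finish_slots_in_order(sizes, frame_size=100)
interleaved = finish_slots_interleaved(sizes, frame_size=100)
print(in_order, interleaved)
```

The small objects finish after 2 and 3 slots instead of 11 and 12, while the large object is only slightly delayed.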

27
Q

Aside from framing, give 2 other benefits of HTTP/2 over HTTP/1.1.

A

Ability to prioritise messages via weightings.
Servers can send multiple responses for a single client request.

28
Q

Give a benefit of HTTP/3 over HTTP/2.

A

Security
Per-stream flow control
Low-latency connection establishment

29
Q

Give the services that DNS provides.

A

Host aliasing, mail server aliasing, load distribution, hostname to IP address translation.

30
Q

Why is it problematic to have a centralized design for DNS?

A

Single point of failure, overloads from traffic, distant central database, maintenance.

31
Q

Explain how DNS lookups work.

A
  1. The browser gets the hostname from the URL and passes it to the client side of the DNS application.
  2. The DNS client sends a query (in practice via its local DNS server) to a root DNS server, which replies with the address of a top-level domain (TLD) server.
  3. The query then works iteratively or recursively through the TLD server and then the authoritative DNS server, until a server returns the IP address for the hostname.
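
An iterative lookup can be sketched with a toy hierarchy. The server names, the referral table and the address 192.0.2.10 (from the documentation address range) are all made up for illustration; real resolvers exchange DNS messages rather than reading dictionaries.

```python
# Hypothetical hierarchy: each server maps a name suffix to a referral or an answer.
SERVERS = {
    "root": {"edu": ("referral", "tld-edu")},
    "tld-edu": {"someschool.edu": ("referral", "dns.someschool.edu")},
    "dns.someschool.edu": {"www.someschool.edu": ("answer", "192.0.2.10")},
}

def iterative_lookup(hostname):
    """Follow referrals from the root until a server answers; return (IP, servers asked)."""
    server, asked = "root", []
    while True:
        asked.append(server)
        for suffix, (kind, value) in SERVERS[server].items():
            if hostname == suffix or hostname.endswith("." + suffix):
                break
        else:
            raise LookupError(f"{server} has no record for {hostname}")
        if kind == "answer":
            return value, asked
        server = value   # referral: ask the next server down the hierarchy

ip_address, asked = iterative_lookup("www.someschool.edu")
print(ip_address, asked)
```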
32
Q

What is a local DNS server?

A

A DNS server run by each ISP (or institution) that acts like a proxy: hosts send their queries to it, and it forwards them into the DNS hierarchy on their behalf.

33
Q

What is the format of a DNS resource record for:
type A?
type NS?
type CNAME?
type MX?

A

Type A: (hostname, IP address, A, TTL)

Type NS: (domain, hostname of authoritative name server, NS, TTL)

Type CNAME: (alias hostname, canonical hostname, CNAME, TTL)

Type MX: (name, canonical name of mail server, MX, TTL)

34
Q

What is the format for a DNS message?

A

12 bytes for the header section:
- 2 bytes for the query ID
- 2 bytes of flags, including:
  - 1 bit for identifying query (0) or reply (1)
  - 1 bit for identifying if the reply is from an authoritative server
  - 1 bit for setting if the client desires the recursive approach for finding records
  - 1 bit for if recursion is available
- Four 2-byte number-of fields giving the counts of questions, answers, authority records and additional records

The header is followed by four variable-length sections:
- Questions: name/type fields for a query
- Answers: RRs in response to a query
- Authority: records for authoritative servers
- Additional: additional helpful info
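
The 12-byte header can be sketched with `struct` from the standard library. The function name, default values and the query ID are illustrative assumptions; the bit positions follow the standard DNS header layout (reply flag in the top bit of the flags field), and the remaining flag bits are left as zero.

```python
import struct

def build_dns_header(query_id, is_reply=False, authoritative=False,
                     recursion_desired=True, recursion_available=False,
                     questions=1, answers=0, authority=0, additional=0):
    """Pack a 12-byte DNS header: ID, flags, then the four 2-byte count fields."""
    flags = ((is_reply << 15)              # query (0) or reply (1)
             | (authoritative << 10)       # reply comes from an authoritative server
             | (recursion_desired << 8)    # client asks for recursion
             | (recursion_available << 7)) # server supports recursion
    # "!" = network byte order, "H" = unsigned 16-bit field
    return struct.pack("!HHHHHH", query_id, flags, questions, answers, authority, additional)

header = build_dns_header(0x1A2B)
print(len(header), header.hex())
```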

35
Q

How are DNS records inserted in the first place?

A

The domain name is registered at a registrar, a commercial entity that verifies the uniqueness of the domain name and enters it into the DNS database (for a fee).

You provide the names and IP addresses of your primary and secondary authoritative DNS servers, and the registrar has NS and A records for them inserted into the TLD servers.

36
Q

Define distribution time.

A

The time taken to get a copy of a file to all of the peers that require it.

37
Q

What is the distribution time for N peers needing a file of size F, if the server upload rate is Us and the download rate of peer i is d_i?

Assume this is a client-server architecture.

A

The server must upload N copies of the file, which takes at least N*F/Us seconds.
The peer with the lowest download rate needs at least F/d_min seconds.

Then, we have Dcs = max{NF/Us, F/d_min}

38
Q

What is the distribution time for N peers needing a file of size F, if the server upload rate is Us, the download rate of peer i is d_i and the upload rate of peer i is u_i?

Assume this is a P2P architecture.

A

The server must upload at least one copy of the file, so F/U_s is a lower bound.

The peer with the lowest download rate needs at least F/d_min.

The total upload capacity is the upload rate of the server plus the upload rates of all the peers. NF bits must be delivered in total, so the distribution time is at least NF/(U_s + (u_1 + u_2 + … + u_N)).

The minimum distribution time is the max of these three bounds: D_P2P = max{F/U_s, F/d_min, NF/(U_s + (u_1 + u_2 + … + u_N))}
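
Both lower bounds, client-server and P2P, can be compared numerically. This sketch assumes every peer has the same upload rate and uses made-up values for F, the server rate and the peer rates.

```python
def client_server_time(n, f, u_s, d_min):
    """Lower bound on distribution time when the server uploads every copy."""
    return max(n * f / u_s, f / d_min)

def p2p_time(n, f, u_s, d_min, peer_upload):
    """Lower bound when peers also upload; assumes all n peers upload at peer_upload."""
    total_upload = u_s + n * peer_upload
    return max(f / u_s, f / d_min, n * f / total_upload)

# Hypothetical values: file size 1, server upload 10, slowest download 10, peer upload 1
f, u_s, d_min, u = 1.0, 10.0, 10.0, 1.0

for n in (10, 30, 1000):
    print(n, client_server_time(n, f, u_s, d_min), round(p2p_time(n, f, u_s, d_min, u), 3))
```

With these numbers the client-server bound keeps growing with N, while the P2P bound stays below F/u = 1.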

39
Q

As the number of peers involved increases, what happens to the minimum distribution time of:
a) client-server
b) P2P

A

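a) Grows linearly with the number of peers, without bound.
b) Grows more slowly than linearly and levels off, because each new peer adds upload capacity as well as demand.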

40
Q

In BitTorrent, how does a new peer receive a file?

A

The tracker chooses a random subset of the participating peers and sends their IP addresses to the new peer.

The new peer tries to set up TCP connections to as many of these as possible; each peer it connects to is considered a neighbouring peer, though the set of neighbours will fluctuate over time.

At any point, each peer has a different subset of chunks (each chunk is typically 256 kB). Periodically, the new peer will ask all neighbouring peers for the list of chunks they have.

The new peer then issues requests for the chunks it is missing until the file is complete, requesting the rarest first.
This means the rarest chunks get quickly redistributed, aiming to equalize the numbers of copies of each chunk in the torrent.
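
Rarest-first ordering can be sketched as a simple count over the neighbours' chunk lists. The peer names and chunk numbers are hypothetical, and ties are broken by chunk number purely to make the output deterministic.

```python
from collections import Counter

def rarest_first(my_chunks, neighbour_chunks):
    """Order the chunks we are missing by how few neighbours hold them."""
    counts = Counter(chunk for chunks in neighbour_chunks.values() for chunk in chunks)
    missing = [chunk for chunk in counts if chunk not in my_chunks]
    return sorted(missing, key=lambda chunk: (counts[chunk], chunk))

neighbours = {
    "peer_a": {0, 1, 2},
    "peer_b": {1, 2},
    "peer_c": {2, 3},
}
order = rarest_first({1}, neighbours)
print(order)
```

Chunks 0 and 3 are each held by one neighbour, so they are requested before chunk 2, which all three neighbours hold.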

41
Q

How does a peer decide how to send chunks in a torrent?

A

A peer gives priority to the four neighbours that are currently supplying it with chunks at the highest rate, and sends chunks to these four peers (termed unchoked).
Every 10 seconds, this is recalculated.

Additionally, every 30 seconds, a random peer is sent chunks (termed optimistically unchoked). If the two peers are satisfied with the resulting exchange, they end up in each other’s top four and keep trading.
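
One round of this selection can be sketched as follows. The rates and peer names are made up, and the timing (10-second and 30-second intervals) is left out; the function only shows who would be chosen at a given instant.

```python
import random

def choose_unchoked(download_rates, rng, top_n=4):
    """Return (the top_n fastest senders, one random other peer to optimistically unchoke)."""
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = ranked[:top_n]
    others = ranked[top_n:]
    optimistic = rng.choice(others) if others else None
    return unchoked, optimistic

# Hypothetical rates at which each neighbour is currently sending us data
rates = {"a": 50, "b": 10, "c": 80, "d": 30, "e": 60, "f": 5}
unchoked, optimistic = choose_unchoked(rates, random.Random(0))
print(unchoked, optimistic)
```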

42
Q

How are videos stored in bits?

A

A sequence of images displayed at a constant rate (e.g. 24 frames per second). Each image is an array of pixels.
Redundancy is removed spatially (within an image) and temporally (between consecutive images, by storing only what changes).

On encoding, we can have CBR (constant bit rate), where the video encoding rate is fixed, or VBR (variable bit rate), where the encoding rate changes as the amount of spatial and temporal redundancy changes.

43
Q

Why was DASH set up for video transfer?

A

To let each client receive an encoding that fits its available bandwidth, which differs between clients and varies over time.

44
Q

How does DASH (Dynamic Adaptive Streaming over HTTP) work?

A

Several encodings of a video are stored, each at a different bit rate. The client requests the video in chunks: when its bandwidth is high it selects chunks from a high-rate version, and when its bandwidth is low it selects chunks from a low-rate version.

The manifest file lists the versions that exist, giving a URL and a bit rate for each.
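
The per-chunk choice can be sketched as picking the highest bit rate that fits the measured bandwidth. The manifest contents, URLs and bandwidth figures are hypothetical, and real players also take buffer level into account.

```python
def pick_version(manifest, measured_bandwidth_kbps):
    """Pick the highest-rate version that fits the bandwidth, else the lowest-rate one."""
    affordable = [rate for rate in manifest if rate <= measured_bandwidth_kbps]
    rate = max(affordable) if affordable else min(manifest)
    return manifest[rate]

# Hypothetical manifest: bit rate in kbps -> URL prefix of that version
manifest = {300: "/video/300k/", 1500: "/video/1500k/", 4000: "/video/4000k/"}

# Bandwidth measured before each of three successive chunk requests
choices = [pick_version(manifest, bandwidth) for bandwidth in (5000, 2000, 100)]
print(choices)
```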

45
Q

What does a CDN (Content Distribution Network) allow for?

A

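Wide-scale streaming of media: copies of the content are stored on servers in many locations, and each user is directed to a server that can serve them well.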

46
Q

What is the Enter Deep philosophy for CDNs?

A

Deploy server clusters in access ISPs over the world to get close to end users. Maintaining and managing clusters becomes challenging.

47
Q

What is the Bring Home philosophy for CDNs?

A

Large clusters are built at a few IXPs. There is lower maintenance, but higher delay and lower throughput to end users.

48
Q

How does a client access a video at a CDN?

A
  1. The user clicks on the link to the video (say http://video.netcinema.com/6Y7B23V) and a DNS query is sent for video.netcinema.com.
  2. The local DNS relays this to an authoritative DNS server for NetCinema, which notices the ‘video’ in the hostname.
  3. The authoritative DNS server ‘hands over’ the query to a CDN by returning a hostname in KingCDN’s domain, say ‘a1105.kingcdn.com’.
  4. The local DNS then sends a query for a1105.kingcdn.com into KingCDN’s DNS infrastructure, which returns the IP address of a content server.
  5. The LDNS returns this to the client.
  6. A TCP connection is set up and a GET request issued - DASH may be used - and the video downloaded.
49
Q

How is the best cluster to direct to decided?

A

It could be the geographically closest cluster, but it is better to perform real-time measurements of delay and loss via probes to LDNSs around the world (though some LDNSs aren’t configured to respond to them).

50
Q

How does Netflix employ CDNs?

A

When a user selects a movie to play, the Netflix software running in the Amazon cloud decides which of its CDN servers is the best to deliver the content.
It sends the IP address to the client alongside a manifest file.
The CDN server and client then directly interact using DASH to get chunks of the movie.

[No DNS redirect is needed]