8 - Content Distribution Flashcards
An Internet-wide tool that enables websites and network operators to deliver data quickly and efficiently.
Content Distribution
HTTP (Hypertext Transfer Protocol)
An application-layer protocol for transferring web content. It is layered over a byte-stream protocol (almost always TCP).
Contents of HTTP request
* Request line: Method -> GET, POST, HEAD
URL -> /index.html
Version number
* Headers: Referer -> the URL that caused the page to be requested (the header name is spelled "Referer" in HTTP)
User-Agent -> the client software being used to fetch the page
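The request layout above can be sketched in Python. This is a minimal illustration, not a full HTTP client: the host `example.com`, the client name `demo-client/0.1`, and the helper names `build_request` and `parse_request` are assumptions made for the example. The `Host` header is not on the card, but HTTP/1.1 requires it.

```python
def build_request(method, path, host, headers=None, version="HTTP/1.1"):
    """Assemble a raw HTTP request: request line, headers, blank line."""
    lines = [f"{method} {path} {version}", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each line ends with CRLF; an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_request(raw):
    """Split a raw request into (method, path, version, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        # Split on the first colon only, since values such as URLs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


# Hypothetical values, chosen only for illustration.
raw = build_request(
    "GET", "/index.html", "example.com",
    {"Referer": "http://example.com/home.html", "User-Agent": "demo-client/0.1"},
)
method, path, version, headers = parse_request(raw)
print(method, path, version)
print(headers["user-agent"])
```

Header names are lower-cased on parsing because HTTP header names are case-insensitive.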
Which HTTP header field indicates the client software making the request?
The User-Agent field.
HTTP Response includes…
* Status line: HTTP version
Response code
* Headers: Location: used for redirection
Server: indicates the server software
Allow: HTTP methods allowed (GET, HEAD, etc.)
Content-Encoding: how the content is encoded (e.g. compressed or not)
Content-Length: length of the content in bytes
Expires: how long the content can be cached
Last-Modified: when the page was last modified
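As a rough sketch of how these response fields appear on the wire, the snippet below parses a hand-written response. The response text, the server name `demo-server/0.1`, and the helper `parse_response` are made up for illustration; real responses carry more headers.

```python
# A hand-written response, used only as sample input.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Server: demo-server/0.1\r\n"
    "Content-Length: 5\r\n"
    "Last-Modified: Mon, 01 Jan 2024 00:00:00 GMT\r\n"
    "Expires: Tue, 02 Jan 2024 00:00:00 GMT\r\n"
    "\r\n"
    "hello"
)


def parse_response(raw):
    """Split a raw response into (version, code, reason, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # Status line: HTTP version, numeric response code, reason phrase.
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


version, code, reason, headers, body = parse_response(RAW_RESPONSE)
print(version, code, reason)
print(headers["server"], headers["content-length"])
```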
Response codes
1xx - informational
2xx - success
3xx - redirection
301 - moved permanently
304 - not modified
4xx - client errors
404 - not found
5xx - server errors
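The class of a response code is given by its first digit, which a short sketch can make concrete. `HTTPStatus` comes from Python's standard `http` module; the `CLASSES` table and the `describe` helper are names invented for this example.

```python
from http import HTTPStatus

# First digit of the code -> class of response.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return a readable description such as '404 Not Found (client error)'."""
    cls = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one of the registered statuses known to the module.
        phrase = "unregistered code"
    return f"{code} {phrase} ({cls})"


for code in (200, 301, 304, 404, 500):
    print(describe(code))
```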
Early HTTP
Simple to implement, but it needed a new TCP connection for every request. Solution: persistent connections.
Multiple requests/responses are multiplexed on a single TCP connection
Persistent Connections
A client sends a request as soon as it encounters a referenced object
Pipelining
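Persistent connections with pipelining are the
default behavior in HTTP/1.1 (strictly, persistent connections are the default; pipelining is supported but optional for clients)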
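A back-of-the-envelope model helps show why persistent connections and pipelining matter. The sketch below assumes a deliberately simplified cost model that is not part of these cards: opening a TCP connection costs one round trip (RTT), each request/response costs one RTT, and transmission time and TCP slow start are ignored. The function names are hypothetical.

```python
def rtts_nonpersistent(n_objects):
    """New TCP connection per request: 2 RTTs each for the page and every object."""
    return 2 * (1 + n_objects)


def rtts_persistent(n_objects):
    """One connection setup, then one RTT per request, sent one at a time."""
    return 1 + (1 + n_objects)


def rtts_pipelined(n_objects):
    """One setup, one RTT for the page, one RTT for all objects requested together."""
    return 1 + 1 + (1 if n_objects > 0 else 0)


n = 10  # a page that references 10 objects
print(rtts_nonpersistent(n), rtts_persistent(n), rtts_pipelined(n))
```

Under these assumptions a page with 10 referenced objects needs 22, 12, and 3 round trips respectively.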
Clients can cache documents in …
1) the browser (local)
2) the network (e.g. content distribution networks)
Clients can be directed to a cache in multiple ways
1) Browser configuration
2) Server directed
Benefits of HTTP caching
1) Reduced transit costs for local ISP
2) Improved performance for local clients
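One way to picture how a cache uses Expires, Last-Modified, and the 304 response is the toy model below. It is a sketch under stated assumptions: times are plain integers rather than HTTP dates, and the `Origin` and `Cache` classes are hypothetical stand-ins for a real server and a real cache.

```python
class Origin:
    """Hypothetical origin server holding a single document."""

    def __init__(self, body, last_modified, ttl):
        self.body, self.last_modified, self.ttl = body, last_modified, ttl

    def fetch(self, now, if_modified_since=None):
        expires = now + self.ttl
        # Conditional request: answer 304 with no body if nothing changed.
        if if_modified_since is not None and self.last_modified <= if_modified_since:
            return 304, None, self.last_modified, expires
        return 200, self.body, self.last_modified, expires


class Cache:
    """Hypothetical cache in front of the origin, storing one entry."""

    def __init__(self, origin):
        self.origin = origin
        self.entry = None  # (body, last_modified, expires)

    def get(self, now):
        # Fresh copy: serve locally without contacting the origin.
        if self.entry and now < self.entry[2]:
            return self.entry[0], "hit"
        since = self.entry[1] if self.entry else None
        code, body, last_modified, expires = self.origin.fetch(now, since)
        if code == 304:
            body, outcome = self.entry[0], "revalidated"
        else:
            outcome = "miss"
        self.entry = (body, last_modified, expires)
        return body, outcome


origin = Origin("v1", last_modified=100, ttl=60)
cache = Cache(origin)
results = [cache.get(now)[1] for now in (200, 230, 300)]
print(results)
```

Only the first request transfers the full document; the second is served locally and the third costs a small 304 exchange, which is where the transit savings and lower latency come from.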
What is a CDN (Content Distribution Network)?
An overlay network of web caches designed to deliver content to a client from the optimal location, made up of geographically disparate servers.
Who owns CDNs?
* Content providers
* Networks/ISPs
Main goal of a CDN?
To replicate content on many servers
CDN problem: Server Selection
Which server should a client be directed to? Typically the one with the lowest latency.
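Selecting by lowest latency can be sketched in a few lines. The server names and latency figures are invented for illustration, and `pick_server` is a hypothetical helper; a real CDN would also weigh load and availability.

```python
def pick_server(latency_ms):
    """Return the name of the server with the lowest measured latency."""
    return min(latency_ms, key=latency_ms.get)


# Hypothetical measurements from one client to three replicas, in milliseconds.
measurements = {"atlanta": 12.0, "frankfurt": 110.5, "tokyo": 170.2}
print(pick_server(measurements))
```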
CDN problem: Content routing
How to route?
1) Routing (e.g. anycast): coarse
2) Application-based (e.g. HTTP redirect): adds delay
3) Naming-based (e.g. DNS): the best way, flexible and fast
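Naming-based routing can be pictured as a DNS server that answers the same hostname with a different replica address depending on who is asking. The sketch below is an assumption-laden toy: the hostname `cdn.example.com`, the address prefixes (taken from ranges reserved for documentation), and the `resolve` helper are all hypothetical, and real CDNs use far richer mapping data.

```python
import ipaddress

# Hypothetical mapping: hostname -> list of (client prefix, nearby replica address).
ZONE = {
    "cdn.example.com": [
        (ipaddress.ip_network("192.0.2.0/25"), "198.51.100.10"),
        (ipaddress.ip_network("192.0.2.128/25"), "198.51.100.20"),
    ],
}
DEFAULT_REPLICA = "203.0.113.5"


def resolve(hostname, resolver_ip):
    """Answer a lookup with the replica mapped to the asking resolver's prefix."""
    addr = ipaddress.ip_address(resolver_ip)
    for prefix, replica in ZONE.get(hostname, []):
        if addr in prefix:
            return replica
    # No matching prefix: fall back to a default replica.
    return DEFAULT_REPLICA


print(resolve("cdn.example.com", "192.0.2.7"))
print(resolve("cdn.example.com", "192.0.2.200"))
```

Because the decision happens at name resolution time, the client needs no extra round trip, unlike an HTTP redirect.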
CDNs peer with ISPs because
1) Peering directly with the ISP where a customer is located gives better throughput (lower latency)
2) Increased reliability and redundancy
3) Burstiness: spreading bursty traffic across multiple peering links lowers transit costs
ISPs peer with CDNs because
1) Good performance for customers when the content is close (e.g. Georgia Tech has a Google cache node in its own network, which gives its customers low latency when connecting to Google and keeps them happy)
2) Lower transit costs (hosting a local cache node reduces cost)
Why do ISPs want to peer with CDNs?
1) Lower transit cost
2) Provide better performance
Peer-to-peer content distribution network
BitTorrent (file sharing, large file distribution): fetch the content from peers instead of from one place.
Clients swap pieces of the file.
BitTorrent steps for publishing
1) A peer creates a “torrent”, which contains information about the tracker and the pieces of the file
2) “Seeders” hold the initial complete copy of the file
3) “Leechers” are clients that hold incomplete copies of the file
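The first publishing step can be sketched as building a small metadata record for a file. This is an illustration, not the real file format: actual torrent files are bencoded and store SHA-1 hashes of each piece, while this sketch uses a plain dictionary. The tracker URL and the helpers `make_torrent` and `verify_piece` are hypothetical.

```python
import hashlib


def make_torrent(data, tracker_url, piece_length):
    """Split data into fixed-size pieces and record a checksum for each one."""
    pieces = [data[i:i + piece_length] for i in range(0, len(data), piece_length)]
    return {
        "announce": tracker_url,  # where peers go to find each other
        "piece_length": piece_length,
        "length": len(data),
        "piece_hashes": [hashlib.sha1(p).hexdigest() for p in pieces],
    }


def verify_piece(torrent, index, piece):
    """Check a piece received from a peer against the published checksum."""
    return hashlib.sha1(piece).hexdigest() == torrent["piece_hashes"][index]


data = b"abcdefghij"
torrent = make_torrent(data, "http://tracker.example/announce", piece_length=4)
print(len(torrent["piece_hashes"]))
print(verify_piece(torrent, 2, b"ij"))
```

Per-piece checksums let a leecher accept pieces from any peer, since each piece can be checked independently before it is passed on.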
BitTorrent solved this problem:
“Freeloading” (peers that download without uploading). The solution is called “choking” or “tit-for-tat”.
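A simplified view of choking is that a client uploads only to the peers that have recently uploaded the most to it. The sketch below assumes exactly that rule and nothing more: peer names and rates are invented, `choose_unchoked` is a hypothetical helper, and refinements used by real clients (such as periodically trying a random peer) are left out.

```python
def choose_unchoked(upload_rate_to_us, slots=4):
    """Pick the peers to upload to: those that upload to us the fastest."""
    ranked = sorted(upload_rate_to_us, key=upload_rate_to_us.get, reverse=True)
    return set(ranked[:slots])


# Hypothetical recent upload rates from each peer to this client (KB/s).
rates = {"peer_a": 50, "peer_b": 0, "peer_c": 120, "peer_d": 30}
unchoked = choose_unchoked(rates, slots=2)
print(sorted(unchoked))
```

Here `peer_b`, which uploads nothing, stays choked, which is how the rule discourages freeloading.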