Chapter 12: P2P Applications Flashcards
1
Q
Spotify.com, 2004
A
- Commercially deployed system, KTH start-up (Sweden)
- Peer-assisted on-demand music streaming
- Legal and licensed content only
- Large catalogue of music (over 15 million tracks)
- Available in U.S. & 7 European countries, over 10 million users, 1.6 million subscribers
- Fast (median playback latency of 265 ms)
- Proprietary client software for desktop & phone (the phone client does not use p2p)
- Business model: free, ad-funded tier or monthly subscription (no ads, premium content, higher-quality streaming)
2
Q
Overview of Spotify Protocol
A
- Proprietary protocol
- Designed for on-demand music streaming (share to peers)
- Only Spotify can add tracks
- 96–320 kbps audio streams (most are Ogg Vorbis q5, 160 kbps)
- Relatively simple and straightforward design

3
Q
Why a peer-to-peer protocol? Spotify
A
- Improve scalability of service
- Decrease load on servers and network resources
- Explicit design goal:
- Use of peer-to-peer should not decrease overall performance (i.e., playback latency & stutter)
4
Q
Spotify P2P overlay structure
A
- Nodes have fixed maximum degree (60)
- Neighbour eviction by heuristic evaluation of utility
- Utility function is multi-objective
- Limited overlay routing (search requests)
- Looks for and connects to new peers when streaming new track
- Overlay becomes (weakly) clustered by interest
- Client only downloads data user needs
5
Q
Finding peers Spotify
A
- Server-side tracker (cf. BitTorrent)
- Only remembers 20 peers per track
- Returns 10 (online) peers to client on query
- Clients broadcast query in small (2 hops) neighbourhood in overlay (cf. Gnutella)
- Client uses both mechanisms for every track
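The two lookup mechanisms above can be sketched as follows. This is a minimal illustration, not Spotify's actual implementation: the class and function names (`Tracker`, `overlay_search`, `find_peers`) and the in-memory data structures are hypothetical. Only the constants (20 remembered peers, 10 returned peers, 2 hops) come from the card.

```python
import random
from collections import defaultdict, deque

TRACKER_MEMORY = 20  # peers remembered per track (from the card)
TRACKER_REPLY = 10   # online peers returned per query (from the card)
SEARCH_HOPS = 2      # overlay broadcast radius (from the card)


class Tracker:
    """Hypothetical server-side tracker, kept entirely in memory."""

    def __init__(self):
        # A bounded deque drops the oldest peer once 20 are stored.
        self.peers = defaultdict(lambda: deque(maxlen=TRACKER_MEMORY))

    def announce(self, track, peer):
        if peer in self.peers[track]:
            self.peers[track].remove(peer)
        self.peers[track].append(peer)

    def query(self, track, online):
        candidates = [p for p in self.peers[track] if p in online]
        return random.sample(candidates, min(TRACKER_REPLY, len(candidates)))


def overlay_search(neighbours, cached, start, track, hops=SEARCH_HOPS):
    """Breadth-first broadcast limited to `hops` hops from `start`."""
    seen, frontier, hits = {start}, [start], set()
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for n in neighbours.get(node, ()):
                if n not in seen:
                    seen.add(n)
                    next_frontier.append(n)
                    if track in cached.get(n, ()):
                        hits.add(n)
        frontier = next_frontier
    return hits


def find_peers(tracker, neighbours, cached, online, me, track):
    """Use both mechanisms for every track and merge the results."""
    from_tracker = set(tracker.query(track, online))
    return from_tracker | overlay_search(neighbours, cached, me, track)
```

Here `neighbours` maps each node to its overlay neighbours and `cached` maps each node to the set of tracks it holds. Both are assumptions made so the sketch is self-contained.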
6
Q
Spotify Protocol
A
- (Almost) everything encrypted
- (Almost) everything over TCP
- Persistent connection to server while logged in
- Multiplex messages over a single TCP connection
7
Q
Spotify Caches
A
- Client (player) caches tracks it has played
- Default policy is to use 10% of free space (capped at 10 GB)
- Caches are often larger (56% are over 5 GB)
- Least Recently Used policy for cache eviction
- Over 50% of data comes from local cache
- Cached files are served in peer-to-peer overlay (if track completely downloaded)
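The cache sizing rule and LRU eviction above could look roughly like this. It is a sketch under stated assumptions: `cache_budget` and `TrackCache` are hypothetical names, 1 GB is taken as 10**9 bytes, and the real client's bookkeeping is not described on the card.

```python
from collections import OrderedDict

GB = 10**9  # assumption: decimal gigabytes


def cache_budget(free_bytes, percent=10, cap=10 * GB):
    """Default policy from the card: 10% of free space, capped at 10 GB."""
    return min(free_bytes * percent // 100, cap)


class TrackCache:
    """Least Recently Used cache keyed by track id (illustrative only)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.tracks = OrderedDict()  # track id -> size, oldest first

    def touch(self, track_id):
        """Mark a cached track as just played; return True on a hit."""
        if track_id in self.tracks:
            self.tracks.move_to_end(track_id)
            return True
        return False

    def add(self, track_id, size):
        # Skip tracks already cached or too large to ever fit.
        if self.touch(track_id) or size > self.capacity:
            return
        # Evict least recently used tracks until the new one fits.
        while self.used + size > self.capacity:
            _, evicted_size = self.tracks.popitem(last=False)
            self.used -= evicted_size
        self.tracks[track_id] = size
        self.used += size
```

With 50 GB free the budget is 5 GB, and with 500 GB free the 10 GB cap applies.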
8
Q
Spotify - Streaming a Track
A
- Tracks are decomposed into 16 kB chunks
- Request first piece of track from Spotify servers
- Meanwhile, search for peers that cache track
- Download data in-order (chunk by chunk via TCP)
- Ask for most urgent pieces first
- Towards end of a track, start prefetching next track
- If a peer is slow, re-request data from new peers
- If local buffer is sufficiently filled, only download from peer-to-peer overlay
- If buffer is getting low, stop uploading
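The chunking arithmetic and the buffer-driven decisions above can be illustrated with a short sketch. The thresholds `low` and `high`, the function names, and the reading of 16 kB as 16 * 1024 bytes are assumptions made for illustration. The card gives no concrete buffer levels.

```python
CHUNK_BYTES = 16 * 1024  # "16 kB" from the card; 1 kB = 1024 bytes is an assumption


def chunk_count(track_bytes):
    """Number of chunks needed for a track (ceiling division)."""
    return -(-track_bytes // CHUNK_BYTES)


def chunk_seconds(bitrate_kbps=160):
    """Playback time covered by one chunk at a constant bitrate."""
    return CHUNK_BYTES * 8 / (bitrate_kbps * 1000)


def next_chunk(have, total):
    """In-order download: the most urgent chunk is the first missing one."""
    for index in range(total):
        if index not in have:
            return index
    return None


def plan(buffered_seconds, low=3.0, high=10.0):
    """Pick download sources and whether to keep uploading.

    `low` and `high` are hypothetical thresholds in seconds of buffered audio.
    """
    if buffered_seconds >= high:
        # Buffer sufficiently filled: rely on the peer-to-peer overlay only.
        return {"sources": ["peers"], "upload": True}
    if buffered_seconds <= low:
        # Buffer getting low: stop uploading and keep the server as a source.
        return {"sources": ["server", "peers"], "upload": False}
    return {"sources": ["server", "peers"], "upload": True}
```

Under these assumptions one chunk of a 160 kbps stream covers about 0.82 seconds of audio, so the first request to the server only has to bridge a short gap while peers are being located.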
9
Q
Spotify Security through obscurity
A
- Music data lies encrypted in caches
- Client must be able to access music data
- Reverse engineers should not be able to access music data
- Details are secret and client code is obfuscated
- Security through obscurity is a bad idea
10
Q
Spotify Key points
A
- Simplicity of architecture, protocol, design
- Peer-assisted, i.e., rely on centralized server
- Use of peer-to-peer techniques for scalability and to avoid heavy, over-provisioned infrastructure
- Use of centralized tracker
11
Q
BitTorrent High Level
A
- Pull-based, swarming approach
- Each file is split into smaller pieces (& sub-pieces)
- Peers request desired pieces from neighboring peers
- Pieces are not downloaded in sequential order
- Encourages contribution by all peers
- Based on a tit-for-tat model
12
Q
BitTorrent use cases
A
- File-sharing
- What uses does BitTorrent support?
- Downloading (licensed only) movies, music, etc.
- Update distribution among servers at Facebook et al.
- Distribution of updates and releases (e.g., World of Warcraft)
- …
13
Q
BitTorrent Swarm
A
- Swarm
- Set of peers downloading the same file
- Organized as a randomly connected mesh of peers
- Each peer knows list of pieces downloaded by neighboring peers
- Peer requests pieces it does not own from neighbors
14
Q
BitTorrent terminology
A
- Seed: Peer with the entire file
- Original Seed: First seed for a file
- Leech: Peer downloading the file
- A leech becomes a seed once the file is downloaded, if the peer stays online and keeps following the protocol
- Sub-piece: Further subdivision of a piece
- “Unit for requests” is a sub-piece (16 kB)
- Peer uploads piece only after assembling it completely
15
Q
How a node enters a swarm for file “popeye.mp4” BitTorrent
A
- File popeye.mp4.torrent hosted at a (well-known) web server
- The .torrent has address of tracker for file
- The tracker, which runs on a web server as well, keeps track of all peers downloading file

16
Q
Contents of .torrent file
A
- URL of tracker
- Piece length – Usually 256 KB
- SHA-1 hashes of each piece in file - Why?
- Hashes let a peer check that each downloaded piece is intact and belongs to the right file
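How per-piece SHA-1 hashes are produced and checked can be shown in a few lines. This is a sketch, not a full .torrent writer: the key names follow the usual BitTorrent metainfo layout, but the bencoding step is omitted, and the function names and the tracker URL in the usage note are hypothetical.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # the "usually 256 KB" piece length from the card


def piece_hashes(data, piece_length=PIECE_LENGTH):
    """Return one 20-byte SHA-1 digest per piece; the last piece may be shorter."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def make_metainfo(name, data, tracker_url, piece_length=PIECE_LENGTH):
    """Build a plain dict shaped like .torrent metadata (not bencoded here)."""
    return {
        "announce": tracker_url,
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            # Digests are concatenated into a single byte string.
            "pieces": b"".join(piece_hashes(data, piece_length)),
        },
    }


def verify_piece(index, piece, metainfo):
    """Check a downloaded piece against the digest stored for its index."""
    expected = metainfo["info"]["pieces"][index * 20:(index + 1) * 20]
    return hashlib.sha1(piece).digest() == expected
```

For example, `make_metainfo("popeye.mp4", data, "http://tracker.example.org/announce")` on 600 KiB of data yields three digests. A peer calls `verify_piece` on each completed piece before offering it to others, so a corrupted or forged piece is rejected without re-downloading the whole file.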
17
Q
BitTorrent 3 Phases
A
- Beginning: First piece
- First, pick a random piece (random first policy)
- When peer starts to download, request random piece
- When first complete piece assembled, switch to rarest-first
- Get first piece to quickly participate in swarm with upload
- Rare pieces are only available at few peers (slows download)
- Middle: Piece 1 to n-1
- Strict priority policy
- Always complete download of piece (sub-pieces) before starting a new piece
- Getting a complete piece as quickly as possible
- End-game: Piece n
- When requests sent for all sub-pieces of last piece, re-send requests to all peers
- Upon download of entire piece, cancel requests for downloading sub-pieces from other peers
- Speeds up completion of download; otherwise last piece could delay completion of download
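The piece selection policy for the first two phases can be sketched as a single function. This is an illustration only: `pick_piece` and its arguments are hypothetical names, ties among equally rare pieces are broken at random as an assumption, and strict priority and end-game handling are left out.

```python
import random
from collections import Counter


def pick_piece(have, neighbour_pieces, rng=random):
    """Choose the next piece to request.

    have: set of piece indices this peer already owns.
    neighbour_pieces: dict mapping each neighbour to the set of pieces it owns.
    """
    # Count how many neighbours offer each piece we still need.
    availability = Counter()
    for pieces in neighbour_pieces.values():
        availability.update(pieces - have)
    if not availability:
        return None  # nothing useful is on offer right now
    if not have:
        # Random first policy: any piece will do to start uploading quickly.
        return rng.choice(sorted(availability))
    # Rarest-first: pick among the pieces held by the fewest neighbours.
    rarest = min(availability.values())
    return rng.choice(sorted(p for p, n in availability.items() if n == rarest))
```

A peer with no pieces gets an arbitrary one. Once it owns a piece, it goes after whatever is scarcest among its neighbours, which keeps rare pieces from disappearing when the peers holding them leave.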
18
Q
Tit-for-tat as incentive to upload BitTorrent
A
- Want to encourage all peers to contribute
- Peer A is said to choke peer B if A decides not to upload to B
- Each peer (say A) unchokes at most 4 interested peers at any time
- The three with the largest upload rates to A (the peers A downloads from fastest)
- Where the tit-for-tat comes in
- Another randomly chosen (optimistic unchoke)
- Periodically discover better choices
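The unchoking rule can be sketched as follows, ranking peers by how fast A is currently receiving data from them. That ranking is the usual reading of tit-for-tat and is an assumption here, as are the function and parameter names. Real clients also re-evaluate the choice on a timer, which is omitted.

```python
import random


def choose_unchoked(rate_to_me, interested, regular_slots=3, rng=random):
    """Return the peers A unchokes: the top uploaders plus one optimistic pick.

    rate_to_me: dict mapping peer -> rate at which that peer uploads to A.
    interested: list of peers that currently want data from A.
    """
    ranked = sorted(interested, key=lambda p: rate_to_me.get(p, 0), reverse=True)
    unchoked = ranked[:regular_slots]   # tit-for-tat: reward the best uploaders
    rest = ranked[regular_slots:]
    if rest:
        # Optimistic unchoke: give one other peer a chance to prove itself.
        unchoked.append(rng.choice(rest))
    return unchoked
```

The optimistic slot is what lets a newcomer with nothing to offer yet receive its first pieces, and it lets A discover peers that would upload faster than its current top three.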
19
Q
Why BitTorrent took off
A
- Working implementation (Bram Cohen) with simple well-defined interfaces for publishing content
- Open specification
- Many competitors got sued & shut down (Napster, KaZaA)
- Simple approach
- Doesn’t do “search” per se
- Users use well-known, trusted sources to locate content
20
Q
Pros & cons of BitTorrent
A
Pros:
- Proficient in utilizing partially downloaded files
- Discourages “freeloading”
- By rewarding fastest uploaders
- Encourages diversity through “rarest-first”
- Extends lifetime of swarm
- Works well for “hot content”
Cons:
- Assumes all interested peers active at same time
- Performance deteriorates if swarm “cools off”
- Too much overhead to disseminate small files
21
Q
Spotify vs. BitTorrent
A
Spotify:
- Live listening of streamed track
- One peer-to-peer overlay for all tracks
- Does not inform peers about downloaded blocks
- Downloads blocks in order
- Does not enforce fairness
- Informs peers about urgency of request
- Supports search
BitTorrent:
- Batch download of large files
- Essentially, a swarm (overlay) per torrent
- Exchange of downloaded blocks with peers
- Random download order
- Tit-for-tat (game theoretic roots)
- No notion of urgency
- No search feature
22
Q
Skype network
A
- Super Nodes: Any node with a public IP address and sufficient CPU, memory and network bandwidth is a candidate to become a super node (super nodes know who is online)
- Ordinary Host: Host needs to connect to super node and must register itself with the Skype login server
- Login server and PSTN gateway (not shown) are only centralized components
23
Q
Impact of Skype
A
- Skype has shown, or at least suggested, the following
- Signaling, the most unique property of traditional phone systems, can now be accomplished effortlessly with self-organizing P2P networks
- P2P overlay networks can scale up to handle large-scale connection-oriented real-time services such as voice
24
Q
Common principles
A
- Peer-assisted, hybrid architecture
- Use of tracker, as directory (centralized)
- Peer-to-peer capability to limit infrastructure investments & achieve scalability
- Use of “heavier” TCP, due to inherent guarantees
- Simplicity of design, as simple as possible