Peer to Peer File Sharing Flashcards

1
Q

Before the web, how did people share files? How did these deal with scalability issues and bandwidth demands?

A

FTP, Gopher servers and bulletin board systems.
These ran mirrors of their content: material uploaded to the main archive was copied periodically to other servers, which spread the load.

2
Q

What was Napster used for? How did it cause problems?

A

Sharing music files. It caused networks to be overloaded, for example in college dorms.

3
Q

How did Napster work?

A

It used the idea of an overlay network: a virtual network that hides the topology of the actual network and instead represents the connections between nodes at the application level.

A central server acted as the search engine, but the songs themselves came from users' machines.
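
The split between central search and peer-held files can be sketched roughly as below. This is only an illustration, not Napster's actual protocol: the `CentralIndex` class, its method names and the peer addresses are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for the central search server."""

    def __init__(self):
        # song name -> set of peer addresses that say they hold it
        self._peers_by_song = defaultdict(set)

    def register(self, peer, songs):
        """A peer announces the songs it is willing to share."""
        for song in songs:
            self._peers_by_song[song].add(peer)

    def search(self, song):
        """Return the peers holding the song; the server stores no files."""
        return sorted(self._peers_by_song.get(song, ()))


index = CentralIndex()
index.register("peer-a:6699", ["song1.mp3", "song2.mp3"])  # made-up addresses
index.register("peer-b:6699", ["song2.mp3"])

# The client would now fetch the file directly from one of these peers.
holders = index.search("song2.mp3")
```

The server only answers "who has it?"; the transfer itself would happen peer to peer.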

4
Q

What can be partially blamed for Napster’s downfall?

A

Its partially centralised architecture. The central server was an easy target for lawyers.

5
Q

How was Gnutella different from Napster?

A

Gnutella was a public protocol with no central server.

6
Q

Describe the Gnutella network

A

It consists only of peers, called servents (each acts as both server and client). All nodes are considered equal.

To join, a peer must have the address of a peer currently in the network.

Queries are flooded around the network, which is highly inefficient.
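
A minimal sketch of why flooding is expensive, assuming a toy overlay held in a dictionary. The `flood_query` function, the hop limit `ttl` and the example graph are assumptions made for illustration and are not the Gnutella wire protocol.

```python
from collections import deque


def flood_query(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent)."""
    seen = {start}
    hits = set()
    messages = 0
    queue = deque([(start, None, ttl)])  # (node, sender, hops left)
    while queue:
        node, sender, hops_left = queue.popleft()
        if wanted in files.get(node, ()):
            hits.add(node)
        if hops_left == 0:
            continue
        for neighbour in graph[node]:
            if neighbour == sender:
                continue  # do not send the query straight back
            messages += 1  # every forwarded copy costs a message
            if neighbour not in seen:  # duplicates are dropped on arrival
                seen.add(neighbour)
                queue.append((neighbour, node, hops_left - 1))
    return hits, messages


graph = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"e": {"song.mp3"}}
hits, messages = flood_query(graph, files, "a", "song.mp3", ttl=3)
```

In this five-node example, eight messages are sent to find a single copy, and with a smaller hop limit the file is not found at all.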

7
Q

How did KaZaa address problems in Gnutella?

A

It targeted the inefficiency. It introduced two types of nodes, both run on users' own machines.

Two-tier hierarchy: the top level has only super nodes; the lower level has ordinary nodes.

8
Q

Describe ordinary and super nodes in KaZaa

A

Ordinary nodes are promoted to super nodes if they demonstrate enough bandwidth and uptime, and super nodes can be downgraded again.

Each ordinary node is associated with a super node that acts like its server. A peer requesting a file only has to contact its super node; if that super node cannot satisfy the request, it forwards the request only to other super nodes.
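
The two-tier lookup described above can be sketched as follows. KaZaa's protocol was proprietary, so this is only an assumed shape: the `SuperNode` class, its fields and the node names are hypothetical.

```python
class SuperNode:
    """Hypothetical super node that indexes the files of its ordinary nodes."""

    def __init__(self, name):
        self.name = name
        self.index = {}       # file name -> set of attached ordinary nodes
        self.neighbours = []  # other super nodes only

    def attach(self, ordinary_node, files):
        """An ordinary node registers its shared files with its super node."""
        for filename in files:
            self.index.setdefault(filename, set()).add(ordinary_node)

    def lookup(self, filename):
        """Answer locally if possible, otherwise ask other super nodes only."""
        if filename in self.index:
            return self.name, sorted(self.index[filename])
        for other in self.neighbours:
            if filename in other.index:
                return other.name, sorted(other.index[filename])
        return None, []


s1, s2 = SuperNode("s1"), SuperNode("s2")
s1.neighbours = [s2]
s1.attach("n1", ["a.mp3"])
s2.attach("n2", ["b.mp3"])

local_answer = s1.lookup("a.mp3")
remote_answer = s1.lookup("b.mp3")
```

Ordinary nodes never receive forwarded queries here, which is the saving over flooding every peer.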

9
Q

Which of Gnutella and KaZaa was proprietary?

A

KaZaa

10
Q

Second Generation takes away the centralised server. What 2 difficulties does this cause?

A

Finding the network in the first place: a new peer needs a launch point.

Searching: no single node knows all the answers.

11
Q

BitTorrent is a mechanism for what? How did it evolve?

A

Sharing large files efficiently by distributing them across a system

Idea -> protocol -> reference implementation -> open source software and protocol

12
Q

How does BitTorrent work?

A

Files are shared from an initial seed, which uploads the file to an initial downloader. From there, every downloader also uploads the pieces it already has to other peers, and becomes a seed itself once it holds the complete file. This reduces the chance of network congestion by spreading the load.

13
Q

Negative aspects of BitTorrent?

A

Slower for small files. Less convenient for inexperienced users.
File creator doesn’t have control over the file.
Download speeds depend on the number of peers downloading, the number of peers uploading and the speeds of the seeds.

14
Q

What is a leecher (BitTorrent)? How can this be avoided?

A

A peer that intentionally throttles its upload speed, downloads the file fully and then closes its torrent so that it doesn't help others.
Avoid: offer money for seeding.

15
Q

What is in a torrent descriptor? How does it make BitTorrent different?

A

A cryptographic hash that uniquely identifies the file being shared. It allows a client to ask the network whether any peers are hosting all or part of the file, without the network needing to care about its content.

16
Q

What was the problem of malware in 1st and 2nd gen? How does BitTorrent avoid this?

A

Anyone could participate.
Search was based only on file names, so a shared file could be used as a Trojan.

BitTorrent: it is a matter of trust. Do you trust the person who says that what you want is available through this node? There is a link of trust with a person. There is also a checksum: are the bytes what I expected?

17
Q

What does a torrent file contain?

A

Metadata describing the sizes of the chunks
A checksum for each chunk
The tracker: a server that tracks the currently active clients for a particular torrent
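
A minimal sketch of how per-chunk checksums let a downloader reject bad data. It assumes SHA-1 digests and a plain dictionary as the descriptor; the real torrent file is bencoded, and the tracker URL, the field names and the tiny chunk size here are made up for illustration.

```python
import hashlib


def piece_hashes(data, piece_size):
    """Split the data into fixed-size chunks and hash each one."""
    return [
        hashlib.sha1(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


def verify_piece(piece, index, expected_hashes):
    """Check that a received chunk matches the checksum in the descriptor."""
    return hashlib.sha1(piece).hexdigest() == expected_hashes[index]


data = b"example payload for a shared file"
descriptor = {
    "tracker": "http://tracker.example/announce",  # hypothetical URL
    "piece_size": 8,                               # unrealistically small
    "pieces": piece_hashes(data, 8),
}

good = verify_piece(data[8:16], 1, descriptor["pieces"])    # genuine chunk
bad = verify_piece(b"tampered", 1, descriptor["pieces"])    # altered chunk
```

Because each chunk is checked on its own, a peer can accept chunks from many untrusted sources and discard only those that fail.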

18
Q

Describe the BitTorrent choke/unchoke model

A

It keeps network usage under control.

A peer serves up to 4 peers in its peer set at once. It seeks out the fastest downloaders (if it is a seed) or the fastest uploaders (if it is a leech).

Choke: a peer temporarily refuses to upload to another peer. Leeches choke all but the 4 fastest uploaders.

Every 30 seconds, a peer makes an optimistic unchoke: it randomly unchokes a peer on the off chance that it offers a better service.
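
One round of this selection might look like the sketch below. It is an assumed simplification, not a client's actual algorithm: `choose_unchoked` and the rate figures are hypothetical, and the optimistic unchoke is treated as an extra pick from the choked peers.

```python
import random


def choose_unchoked(rates, slots=4, rng=random):
    """Return (regular unchoked peers, optimistic pick) from peer -> observed rate."""
    ranked = sorted(rates, key=rates.get, reverse=True)
    regular = ranked[:slots]   # the fastest peers keep their slots
    choked = ranked[slots:]    # everyone else is choked for now
    optimistic = rng.choice(choked) if choked else None
    return regular, optimistic


# Observed transfer rates in arbitrary units (made-up numbers).
rates = {"p1": 50, "p2": 10, "p3": 80, "p4": 30, "p5": 5, "p6": 60}
regular, optimistic = choose_unchoked(rates, rng=random.Random(0))
```

A client would rerun the optimistic pick on its 30-second timer, so a choked peer that turns out to be fast can earn a regular slot in a later round.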

19
Q

What approaches are there to searching a DS?

A

  1. Keep an index of everything centrally (Napster)
  2. Create a hierarchy of responsibility: the root node covers everything and children look up to it (DNS)
  3. Flood queries around the network of nodes until you find it or exhaust every resource
  4. Distributed Hash Table

20
Q

What is the idea of a hash table?

A

Minimise search time by providing a mechanism that allows you to take a good guess at where a value is stored

21
Q

How does the idea of a hash function extend to a DS?

A

Hash function: maps a key to a number that allows you to index a table to find the result.
In a DS: given a key, we find which node has the data. Once we have a node, we search its store.

22
Q

Describe a keyspace

A

A set of numbers partitioned so that each participating node has responsibility for some part of that space.
Responsibility for looking after the data for a particular key is given to the node whose identifier equals or follows the hashed value of that key.
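
The "equals or follows" rule can be sketched as below, assuming a tiny 4-bit keyspace and made-up node identifiers. The helper names `hash_key` and `responsible_node` are hypothetical, and SHA-1 is just one possible choice of hash.

```python
import bisect
import hashlib

KEYSPACE_BITS = 4  # tiny keyspace (0..15), chosen only for illustration


def hash_key(key):
    """Map a string key to a number in the keyspace."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** KEYSPACE_BITS)


def responsible_node(node_ids, key_id):
    """Return the node whose identifier equals or follows key_id, wrapping round."""
    ids = sorted(node_ids)
    pos = bisect.bisect_left(ids, key_id)
    return ids[pos % len(ids)]  # past the last node, wrap to the first


node_ids = [1, 4, 9, 12]  # made-up node identifiers
owner = responsible_node(node_ids, hash_key("song.mp3"))
```

With these nodes, key 4 belongs to node 4, key 5 to node 9, and key 13 wraps round to node 1.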

23
Q

Describe the compromise between a fully connected ring-based distributed hash table and a circular one

A

Fully connected: we can get anywhere in a single hop, but each node keeps a heavyweight and volatile list of connections.
Circular: each node is only connected to its successor, so we may have to cycle through all the nodes.

Compromise: a finger table. Four connections for each node give an easy way of hopping round the system.
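
A minimal sketch of finger-table routing in the style of Chord, assuming a 4-bit keyspace so that each node keeps four connections. The node identifiers and helper names are made up, and each finger is computed from a global list of nodes purely for illustration; a real node would only know its own table.

```python
M = 4          # bits in the keyspace, so four finger entries per node
SIZE = 2 ** M  # identifiers run from 0 to 15


def successor(ids, k):
    """First node id equal to or following k, wrapping round the ring."""
    k %= SIZE
    for n in ids:  # ids must be sorted
        if n >= k:
            return n
    return ids[0]


def finger_table(ids, n):
    """Entry i points at the node responsible for n + 2**i."""
    return [successor(ids, n + 2 ** i) for i in range(M)]


def between(x, a, b, inclusive_end):
    """True if x lies on the ring strictly after a and before b (or up to b)."""
    if inclusive_end and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b


def lookup(ids, start, key):
    """Route from `start` to the node responsible for `key`; return (node, hops)."""
    key %= SIZE
    node, hops = start, 0
    while node != key:
        succ = successor(ids, node + 1)
        if between(key, node, succ, inclusive_end=True):
            return succ, hops + 1
        # Jump to the furthest finger that does not overshoot the key.
        node = next(f for f in reversed(finger_table(ids, node))
                    if between(f, node, key, inclusive_end=False))
        hops += 1
    return node, hops


ids = [1, 4, 9, 11, 14]  # made-up node identifiers, sorted
fingers_of_1 = finger_table(ids, 1)
owner, hops = lookup(ids, 1, 12)
```

In this example node 1 reaches the owner of key 12 in three hops (via nodes 9 and 11), where following successors alone would take four.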