PageRank Algorithm¨ Flashcards

1
Q

What is the PageRank algorithm used for? How is it performed?

A

It’s used to determine the importance of a webpage. It uses a random surfer model where a visitor start their session at random pages and walk along links that all have the same chance and the choices are made uniformly. This can be modeled by a Markov process with iterative calculation of a transition matrix. The solution is then the principal eigen vector of the transition matrix.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How much harddisk is needed in pagerank?

A

About 10xRAM

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is teleporting?

A

The method of letting the random surfer have a small probability of teleporting to a random page to prevent spider traps for example.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are spider traps? How can they be prevented?

A

Spider traps are dead ends where the only transition probability is itself so the solution becomes for example [0,0,1,0]. Spider traps are sets of nodes which pools the probabilities to themselves. Can be prevented via teleporting.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are dead-ends? How can they be prevented?

A

Nodes without a chance to transition away from them. So the solution becomes for example [0,0,0,0]. Can be prevented by removing the dead end recursively.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

How much memory does a node need in pagerank?

A

32 bits/4 byte

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

How can the sparse matrix of pagerank be encoded to ensure less memory?

A

Only encode the nonzero entries of the matrix.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly