Week 1 Flashcards

1
Q

What are some of the challenges in building a web search engine?

A

A large number of web pages
Web pages are not ordered or indexed
No classification of web pages

2
Q

What are the two principles of the PageRank algorithm?

A

The hyperlink trick
Random surfer model

3
Q

What is the hyperlink trick?

A

A page can be ranked by its number of incoming links, i.e. how many other pages link to it
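
As a minimal sketch of this idea, the snippet below counts incoming links on a made-up four-page web. The `links` dictionary and the `inlink_counts` helper are hypothetical names chosen for illustration, not part of the course material.

```python
# Hypothetical toy web: each page maps to the pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def inlink_counts(links):
    """Count how many distinct pages link to each page."""
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):  # a page linking twice still counts once
            counts[target] = counts.get(target, 0) + 1
    return counts

counts = inlink_counts(links)
# Rank pages from most to fewest incoming links.
ranking = sorted(counts, key=counts.get, reverse=True)
print(counts)
print(ranking)
```

Here page C collects links from A, B and D, so it ranks first, while D has no incoming links and ranks last.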

4
Q

What is the matrix form of PageRank?

A

P_{n+1} = H × P_n

where H is the transition matrix and P_n is the vector of PageRank scores after n iterations
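
The update can be sketched in plain Python as repeated matrix-vector multiplication. This assumes a column convention in which `H[i][j] = 1 / outdegree(j)` when page j links to page i. The three-page web and the `matvec` helper are illustrative assumptions.

```python
def matvec(H, p):
    """Multiply matrix H (list of rows) by vector p."""
    n = len(p)
    return [sum(H[i][j] * p[j] for j in range(n)) for i in range(len(H))]

# Assumed toy web: A -> B, A -> C, B -> C, C -> A.
# Column j describes where page j sends its score.
H = [
    [0.0, 0.0, 1.0],  # A receives everything from C
    [0.5, 0.0, 0.0],  # B receives half of A
    [0.5, 1.0, 0.0],  # C receives half of A and all of B
]

p = [1 / 3, 1 / 3, 1 / 3]  # start with equal scores
for _ in range(100):       # P_{n+1} = H x P_n
    p = matvec(H, p)

print(p)
```

For this particular web the scores settle near 0.4, 0.2 and 0.4 for A, B and C.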

5
Q

What are the two issues in PageRank?

A

Sink Pages
Cycle Pages

6
Q

How do we solve the sink page problem?

A

Replace each all-zero column of the transition matrix H with entries of 1/N, where N is the total number of pages
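
A minimal sketch of this fix, assuming the transition matrix H from the matrix-form card is stored as a list of rows with one column per source page. The `fix_sinks` helper and the example matrix are hypothetical.

```python
def fix_sinks(H):
    """Return a copy of H in which every all-zero column is set to 1/N."""
    n = len(H)
    fixed = [row[:] for row in H]  # leave the original matrix untouched
    for j in range(n):
        if all(H[i][j] == 0 for i in range(n)):  # page j has no out-links
            for i in range(n):
                fixed[i][j] = 1.0 / n
    return fixed

# Assumed toy web: A -> B, A -> C, B -> C, and C links nowhere (a sink).
H = [
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0],
]
H_fixed = fix_sinks(H)
for row in H_fixed:
    print(row)
```

After the fix every column sums to 1, so the sink page passes its score evenly to all pages instead of losing it.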

7
Q

What is the cycle problem?

A

A few web pages may link to one another in a closed cycle, so score keeps circulating within the cycle and their authority scores increase without bound

8
Q

What is the random surfer model?

A

A surfer either jumps to a random page (as if accessing the site directly) or clicks a random hyperlink on the current page.

9
Q

What does the damping factor do?

A

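It models the chance that a surfer gets bored of following links. With probability d (the damping factor, commonly set to 0.85) the surfer clicks a link on the current page; with probability 1 - d they stop and jump to a random page.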
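
One common way to write this down, not stated on the card and so an assumption here, is P_{n+1} = d × H × P_n + (1 - d)/N, with d as the damping factor. The sketch below applies that update to a made-up web containing a cycle. The `pagerank` function and its parameter names are illustrative.

```python
def pagerank(H, d=0.85, iterations=100):
    """Iterate P_{n+1} = d * H * P_n + (1 - d) / N on a column-based H."""
    n = len(H)
    p = [1.0 / n] * n
    for _ in range(iterations):
        p = [
            (1 - d) / n + d * sum(H[i][j] * p[j] for j in range(n))
            for i in range(n)
        ]
    return p

# Assumed toy web: A -> B, B -> C, C -> B (B and C form a closed cycle).
H = [
    [0.0, 0.0, 0.0],  # nothing links to A
    [1.0, 0.0, 1.0],  # B receives from A and C
    [0.0, 1.0, 0.0],  # C receives from B
]
p = pagerank(H)
print(p)
```

With damping, the pages in the cycle still rank highest, but their scores stay bounded and A keeps the small baseline share (1 - d)/N.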

10
Q

What is the complexity of PageRank?

A

O(MN^2)
M = number of iterations
N = number of pages
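
To see where the bound comes from, the sketch below counts the multiplications in a dense matrix-vector update, assuming N stands for the number of pages and H is stored as a full N by N matrix. The `count_multiplications` helper is hypothetical.

```python
def count_multiplications(n_pages, iterations):
    """Count the H[i][j] * p[j] products done by dense power iteration."""
    ops = 0
    for _ in range(iterations):       # M iterations
        for i in range(n_pages):      # one output entry per page
            for j in range(n_pages):  # one product per matrix column
                ops += 1
    return ops

print(count_multiplications(4, 10))
```

Each iteration costs N × N products, so M iterations cost M × N^2.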
