PageRank Flashcards
What are some of the challenges of building a Web Search Engine?
- Lack of order and indexing.
- Many duplicated or similar pages.
- Huge number of pages.
- Pages aren’t classified.
Give a brief description of the PageRank algorithm…
An algorithm that displays web pages based on their authority.
What 2 principles underpin the PageRank algorithm?
Hyperlink Tricks
Random Surfer Model
What is the time complexity of PageRank?
O( MN^2 )
What is the formula for the PageRank Vector?
Pn+1 = H * Pn
What do we assign P0 on first iteration of PageRank?
Vector containing 1/N.
What are the 2 implementation issues PageRank have?
Sink Pages
Cycles
Define Sink Pages…
A page that has no outgoing links.
How can we identify a Sink Page?
If a column vector contains all 0’s.
What is the solution to a Sink Page problem?
Replace the column vector 0’s with 1/N.
Define the Cycle problem…
Linked pages that are in a cycle, and lead to an infinite increase of authority for the pages.
What is the solution to the Cycle Problem?
Random Surfer Model.
What is the Random Surfer Model?
A solution to Cycles. It’s based on the assumption that a surfer has a slight chance of starting a new session as opposed to following a link. This probability is referred to the Dampening Factor.
What is the Dampening Factor?
The probability that a browser stays on the website. Traditionally 0.85.
What is the formula for the Google Matrix?
G = d * A + ( 1 - d ) * B