PageRank Algorithm
PageRank as the stationary distribution of a random walk on a graph: damping factor, power iteration, eigenvector interpretation, and applications beyond web search.
Why This Matters
PageRank answers a simple but powerful question: given a graph, which nodes are most important? The answer (define importance as the stationary distribution of a random walk) turns out to be one of the most widely used ideas in machine learning and data science, far beyond its original application to web search.
PageRank appears in citation analysis (which papers are most influential), social networks (who are the key influencers), knowledge graphs (which entities are most central), recommendation systems (which items are most relevant), and even in understanding neural network architectures.
Mental Model
Imagine a person randomly browsing the web. At each page, they either click a random outgoing link (with probability $\alpha$) or jump to a completely random page (with probability $1 - \alpha$). After a very long time, the fraction of time spent on each page converges to a fixed distribution. Pages where the random surfer spends more time are "more important."
PageRank is this limiting distribution.
Formal Setup
Let $\mathcal{G} = (V, E)$ be a directed graph with $n$ nodes. Let $A$ be the adjacency matrix, where $A_{ij} = 1$ if there is an edge from $j$ to $i$ (note: column $j$ of $A$ lists node $j$'s outgoing links).
Transition Matrix
The transition matrix $P$ of the random walk on $\mathcal{G}$ is:

$$P_{ij} = \frac{A_{ij}}{d_j},$$

where $d_j$ is the out-degree of node $j$. Column $j$ of $P$ is a probability distribution over the out-neighbors of $j$. For dangling nodes (nodes with no outgoing links), define the column as $1/n$ for all entries (a uniform random jump).
PageRank Vector
The PageRank vector $\pi$ is defined as the stationary distribution of the modified random walk:

$$\pi = \alpha P \pi + \frac{1 - \alpha}{n} \mathbf{1},$$

where $\alpha \in (0, 1)$ is the damping factor (typically $\alpha = 0.85$) and $\frac{1}{n}\mathbf{1}$ is the uniform distribution over all nodes.
Equivalently, $\pi$ is the solution to:

$$G \pi = \pi, \qquad G = \alpha P + \frac{1 - \alpha}{n} \mathbf{1}\mathbf{1}^\top.$$

The matrix $G$ is the Google matrix. It represents a random walk where at each step, with probability $\alpha$ you follow a random outgoing link and with probability $1 - \alpha$ you teleport to a uniformly random node.
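As a concrete sketch of these definitions (assuming NumPy is available; the 3-node graph here is invented for illustration):

```python
import numpy as np

# Small 3-node example (edges 0->1, 1->0, 1->2, 2->0), using the column
# convention from the text: A[i, j] = 1 for an edge from j to i.
n = 3
A = np.zeros((n, n))
A[1, 0] = 1.0               # 0 -> 1
A[0, 1] = A[2, 1] = 1.0     # 1 -> 0, 1 -> 2
A[0, 2] = 1.0               # 2 -> 0

# Transition matrix P: divide column j by node j's out-degree d_j;
# a dangling column (out-degree 0) becomes the uniform column 1/n.
d = A.sum(axis=0)
P = np.where(d > 0, A / np.where(d > 0, d, 1.0), 1.0 / n)

# Google matrix G = alpha * P + (1 - alpha)/n * (all-ones matrix).
alpha = 0.85
G = alpha * P + (1 - alpha) / n * np.ones((n, n))

print(G.sum(axis=0))  # every column sums to 1: G is column-stochastic
```

Note that every entry of `G` is at least $(1-\alpha)/n$, which is exactly the positivity the Perron-Frobenius argument below relies on.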
Main Theorems
Existence and Uniqueness of PageRank
Statement
The PageRank vector $\pi$ exists, is unique, and has all positive entries. Specifically, $\pi$ is the unique probability vector satisfying $G\pi = \pi$, where $G = \alpha P + \frac{1 - \alpha}{n}\mathbf{1}\mathbf{1}^\top$ with $\alpha \in (0, 1)$.
Intuition
The teleportation term ensures that every node can reach every other node (the Markov chain is irreducible) and that the chain is aperiodic. By the Perron-Frobenius theorem, a positive stochastic matrix has a unique stationary distribution with all positive entries. Without teleportation, the random walk could get stuck in cycles or sinks, and the stationary distribution might not be unique.
Proof Sketch
The Google matrix $G$ is a column-stochastic matrix (columns sum to 1) with all entries strictly positive (each entry is at least $\frac{1-\alpha}{n}$). By the Perron-Frobenius theorem for positive matrices, $G$ has a unique eigenvalue of magnitude 1 (which is 1 itself), and the corresponding eigenvector has all positive entries. Normalizing this eigenvector to sum to 1 gives the unique stationary distribution $\pi$.
Why It Matters
Existence and uniqueness mean that PageRank is well-defined for any graph. The positivity of all entries means every node gets some PageRank, which is important for practical applications where you do not want nodes to have zero importance.
Failure Mode
If $\alpha = 1$ (no teleportation), uniqueness can fail. The random walk on a graph with multiple strongly connected components may have multiple stationary distributions. The damping factor is not just a tuning parameter; it is mathematically necessary for uniqueness.
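A minimal numerical illustration of this failure (assuming NumPy; the graph is invented for the example): on two disconnected 2-cycles, the pure random walk with $\alpha = 1$ is governed by $P$ alone, and at least two distinct probability vectors are stationary.

```python
import numpy as np

# Two disconnected 2-cycles: 0 <-> 1 and 2 <-> 3.
# Each column of P is the single outgoing edge of that node.
P = np.zeros((4, 4))
P[1, 0] = P[0, 1] = 1.0
P[3, 2] = P[2, 3] = 1.0

# With no teleportation, both of these probability vectors satisfy
# P @ pi = pi, so the stationary distribution is not unique.
pi_a = np.array([0.5, 0.5, 0.0, 0.0])
pi_b = np.array([0.0, 0.0, 0.5, 0.5])
print(np.allclose(P @ pi_a, pi_a), np.allclose(P @ pi_b, pi_b))  # True True
```

Any convex combination of `pi_a` and `pi_b` is stationary as well, so there is a whole continuum of fixed points until teleportation is added.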
Computing PageRank: Power Iteration
Power Iteration Convergence for PageRank
Statement
The power iteration $\pi^{(t+1)} = G \pi^{(t)}$ converges to the PageRank vector $\pi$ from any starting distribution $\pi^{(0)}$. The convergence rate is geometric:

$$\|\pi^{(t)} - \pi\|_1 \le \alpha^t \, \|\pi^{(0)} - \pi\|_1.$$

After $t$ iterations, the error is at most $\alpha^t \|\pi^{(0)} - \pi\|_1 \le 2\alpha^t$.
Intuition
At each step, the random surfer's distribution gets "mixed" by the teleportation. The teleportation mixes at rate $1 - \alpha$ per step, so after $t$ steps, the unmixed fraction is $\alpha^t$. For $\alpha = 0.85$, about 50 iterations reduce the error by a factor of more than $10^3$.
Proof Sketch
The second-largest eigenvalue of $G$ in absolute value is at most $\alpha$ (the teleportation "shrinks" all eigenvalues except the dominant one toward zero by a factor of $\alpha$). The error after $t$ steps is therefore bounded by $C \alpha^t$. More precisely (assuming $G$ is diagonalizable for simplicity), write $\pi^{(0)} = \pi + \sum_{i \ge 2} c_i v_i$ in the eigenbasis of $G$. After $t$ multiplications, $\pi^{(t)} = \pi + \sum_{i \ge 2} c_i \lambda_i^t v_i$, and each $|\lambda_i| \le \alpha$.
Why It Matters
Power iteration makes PageRank computation practical even for graphs with billions of nodes. Each iteration is a sparse matrix-vector multiplication (cost proportional to the number of edges), and convergence is fast. This is why Google could compute PageRank for the entire web.
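A sketch of the iteration (assuming NumPy; dense arrays are used for clarity, whereas a production version would use a sparse matrix so each step costs time proportional to the number of edges):

```python
import numpy as np

def pagerank(A, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank; A[i, j] = 1 for an edge j -> i."""
    n = A.shape[0]
    d = A.sum(axis=0)
    dangling = d == 0
    P = A / np.where(dangling, 1.0, d)   # dangling columns stay zero here
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Dangling mass is redistributed uniformly, then teleportation
        # adds (1 - alpha)/n to every node.
        new = alpha * (P @ pi + pi[dangling].sum() / n) + (1 - alpha) / n
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
    return pi

# 4-node graph: 0 -> 1, 1 -> 2, 2 -> 0, 2 -> 3; node 3 is dangling.
A = np.zeros((4, 4))
A[1, 0] = A[2, 1] = A[0, 2] = A[3, 2] = 1.0
pi = pagerank(A)
print(np.round(pi, 4), pi.sum())
```

Handling dangling mass as a scalar correction (rather than storing uniform columns) keeps the update a single sparse matrix-vector product plus $O(n)$ extra work.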
Failure Mode
Convergence slows as $\alpha \to 1$. For $\alpha = 0.985$, you need about 10 times as many iterations as for $\alpha = 0.85$. There is a trade-off: larger $\alpha$ gives a PageRank that more faithfully reflects the link structure, but takes longer to compute and is more sensitive to graph structure.
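The iteration counts can be read off from the geometric rate: solving $\alpha^t \le \epsilon$ gives $t \ge \log \epsilon / \log \alpha$. A quick sketch:

```python
import math

def iters_needed(alpha, eps):
    # Smallest t with alpha**t <= eps, from the geometric error bound.
    return math.ceil(math.log(eps) / math.log(alpha))

for alpha in (0.85, 0.985):
    print(alpha, iters_needed(alpha, 1e-6))
```

For a target error of $10^{-6}$, this gives 86 iterations at $\alpha = 0.85$ versus 915 at $\alpha = 0.985$, a roughly tenfold slowdown.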
The Eigenvector Interpretation
PageRank is the dominant eigenvector of the Google matrix $G$. This connects PageRank to spectral graph theory.

The eigenvalue equation $G\pi = \pi$ says that $\pi$ is an eigenvector of $G$ with eigenvalue 1. Since $G$ is a positive stochastic matrix, 1 is the largest eigenvalue (by Perron-Frobenius), and $\pi$ is the corresponding eigenvector.
This means PageRank can also be understood as the solution to an eigenvalue problem: find the eigenvector of the modified adjacency matrix with the largest eigenvalue. This connects to the broader theory of spectral methods on graphs.
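This can be checked numerically (a sketch assuming NumPy; the random graph is arbitrary): a dense eigensolver recovers eigenvalue 1, and normalizing the corresponding eigenvector to sum to 1 recovers the PageRank vector.

```python
import numpy as np

# Build the Google matrix for a small random directed graph.
rng = np.random.default_rng(0)
n = 6
A = (rng.random((n, n)) < 0.4).astype(float)
np.fill_diagonal(A, 0.0)

d = A.sum(axis=0)
P = np.where(d > 0, A / np.where(d > 0, d, 1.0), 1.0 / n)
alpha = 0.85
G = alpha * P + (1 - alpha) / n

# PageRank as an eigenproblem: take the dominant eigenvector of G.
vals, vecs = np.linalg.eig(G)
k = np.argmax(np.abs(vals))      # dominant eigenvalue (should be 1)
pi = np.real(vecs[:, k])
pi = pi / pi.sum()               # normalize to a probability vector

print(np.isclose(np.real(vals[k]), 1.0), np.allclose(G @ pi, pi))
```

Dense eigensolvers cost $O(n^3)$, which is why power iteration, not `eig`, is the practical route at scale; the eigenview is conceptual.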
The Role of the Damping Factor
The damping factor $\alpha$ controls the balance between two forces:

- Link following ($\alpha$ close to 1): PageRank heavily reflects the link structure. Important nodes are those that receive links from other important nodes (recursive definition). Risk: sensitive to graph structure, susceptible to link spam.
- Teleportation ($\alpha$ close to 0): PageRank approaches the uniform distribution. Every node is roughly equally important. Safe but uninformative.
The standard choice $\alpha = 0.85$ means that at each step, the random surfer follows a link 85% of the time and teleports 15% of the time. This was found empirically to give good results for web search.
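The two regimes can be seen directly on a small star graph (a sketch assuming NumPy; the graph is invented for the example): at $\alpha = 0.85$ the hub dominates, while at $\alpha = 0.05$ the distribution is close to uniform.

```python
import numpy as np

def google_matrix(A, alpha):
    n = A.shape[0]
    d = A.sum(axis=0)
    P = np.where(d > 0, A / np.where(d > 0, d, 1.0), 1.0 / n)
    return alpha * P + (1 - alpha) / n

def stationary(G):
    # Power iteration to (near) convergence.
    pi = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(10_000):
        new = G @ pi
        if np.abs(new - pi).sum() < 1e-12:
            break
        pi = new
    return pi

# Star graph: nodes 1..4 all link to node 0; node 0 links back to node 1.
A = np.zeros((5, 5))
A[0, 1:] = 1.0
A[1, 0] = 1.0

for alpha in (0.85, 0.05):
    print(alpha, np.round(stationary(google_matrix(A, alpha)), 3))
```

With high $\alpha$ the hub's share is several times the average; with low $\alpha$ every node sits near $1/n = 0.2$.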
Applications Beyond Web Search
Citation networks: Nodes are papers, edges are citations. PageRank identifies influential papers that are cited by other influential papers. Unlike simple citation counts, PageRank weights citations by the importance of the citing paper.
Social networks: Nodes are users, edges are follow/friend relationships. PageRank identifies influential users. Variants like Personalized PageRank (replace uniform teleportation with teleportation biased toward a specific user) power recommendation systems.
Knowledge graphs: Nodes are entities, edges are relations. PageRank helps identify central entities for question answering and information retrieval.
Recommendation systems: Random walks on user-item graphs, where the stationary distribution (or a personalized variant) ranks items by relevance to a user.
Common Confusions
PageRank is not just counting incoming links
A node with many low-quality links can have lower PageRank than a node with few high-quality links. PageRank is recursive: a link from an important node transfers more importance than a link from an unimportant node. This is the key insight that made PageRank work for web search.
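A small illustration (assuming NumPy; the graph is invented for this example): node 4 receives a single link from an important hub, node 7 receives two links from unimportant pages, yet node 4 ends up with the higher PageRank.

```python
import numpy as np

# 8-node graph: nodes 1-3 link to a hub (node 0); the hub links only
# to node 4. Nodes 5 and 6 (unimportant pages) both link to node 7.
# So node 4 has ONE in-link and node 7 has TWO.
n = 8
A = np.zeros((n, n))
for fan in (1, 2, 3):
    A[0, fan] = 1.0          # fans -> hub
A[4, 0] = 1.0                # hub -> node 4
A[7, 5] = A[7, 6] = 1.0      # low-importance pages -> node 7

d = A.sum(axis=0)
dangling = d == 0
P = A / np.where(dangling, 1.0, d)

alpha, pi = 0.85, np.full(n, 1.0 / n)
for _ in range(200):
    pi = alpha * (P @ pi + pi[dangling].sum() / n) + (1 - alpha) / n

# One link from the important hub beats two links from unimportant pages:
print(pi[4] > pi[7])  # True, despite node 4's smaller in-degree
```

Counting in-links would rank node 7 first; the recursion through the hub's own importance reverses the order.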
The damping factor is not optional
Without damping ($\alpha = 1$), the random walk can get trapped in sinks (nodes with no outgoing links) or cycle within disconnected components. The stationary distribution may not be unique. The damping factor is not a regularization hack; it is a mathematical necessity for well-definedness.
PageRank and eigenvector centrality are not identical
Eigenvector centrality is the dominant eigenvector of the adjacency matrix $A$. PageRank is the dominant eigenvector of the Google matrix $G$, which normalizes by out-degree and adds teleportation. PageRank is a refined version of eigenvector centrality that handles directed graphs, dangling nodes, and disconnected components.
Summary
- PageRank is the stationary distribution of a random walk with teleportation
- The Google matrix is $G = \alpha P + \frac{1 - \alpha}{n}\mathbf{1}\mathbf{1}^\top$
- The damping factor $\alpha$ controls link-following vs. teleportation
- Existence and uniqueness follow from Perron-Frobenius (positive stochastic matrix)
- Power iteration converges geometrically at rate $\alpha$
- PageRank = dominant eigenvector of the Google matrix
- Applications: citation networks, social networks, knowledge graphs, recommendation systems
Exercises
Problem
Consider a 3-node graph: node 1 links to nodes 2 and 3, node 2 links to node 3, and node 3 links to node 1. Write down the transition matrix $P$ and the Google matrix $G$ with $\alpha = 0.85$.
Problem
Prove that if $\alpha = 1$ (no teleportation), the PageRank vector may not be unique. Give an example of a graph where the random walk has multiple stationary distributions.
Problem
How many power iterations are needed to compute PageRank to within error $\epsilon = 10^{-6}$ when $\alpha = 0.85$? What about $\alpha = 0.985$?
References
Canonical:
- Page, Brin, Motwani, Winograd, "The PageRank Citation Ranking: Bringing Order to the Web" (1999)
- Langville & Meyer, Google's PageRank and Beyond (2006)
Current:
- Gleich, "PageRank Beyond the Web" (SIAM Review, 2015)
- Hamilton, Graph Representation Learning (2020), Chapter 2
- Bishop, Pattern Recognition and Machine Learning (2006), Chapters 1-14
Next Topics
The natural next steps from PageRank:
- Spectral clustering: using eigenvectors of graph matrices for clustering
- Graph neural networks: generalizing PageRank-style message passing with learned functions
- Markov chains: the full theory of random walks and stationary distributions
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Eigenvalues and Eigenvectors (Layer 0A)
- Matrix Operations and Properties (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)