

PageRank Algorithm

PageRank as the stationary distribution of a random walk on a graph: damping factor, power iteration, eigenvector interpretation, and applications beyond web search.


Why This Matters

PageRank answers a simple but powerful question: given a graph, which nodes are most important? The answer, which defines importance as the stationary distribution of a random walk, turns out to be one of the most widely used ideas in machine learning and data science, far beyond its original application to web search.

PageRank appears in citation analysis (which papers are most influential), social networks (who are the key influencers), knowledge graphs (which entities are most central), recommendation systems (which items are most relevant), and even in understanding neural network architectures.

Mental Model

Imagine a person randomly browsing the web. At each page, they either click a random outgoing link (with probability α) or jump to a completely random page (with probability 1 − α). After a very long time, the fraction of time spent on each page converges to a fixed distribution. Pages where the random surfer spends more time are "more important."

PageRank is this limiting distribution.

Formal Setup

Let G = (V, E) be a directed graph with n = |V| nodes. Let A be the adjacency matrix where A_{ij} = 1 if there is an edge from j to i (note: column j lists j's outgoing links).

Definition

Transition Matrix

The transition matrix M of the random walk on G is:

M_{ij} = \frac{A_{ij}}{d_j^{\text{out}}}

where d_j^{\text{out}} = \sum_i A_{ij} is the out-degree of node j. Column j of M is a probability distribution over the neighbors of j. For dangling nodes (nodes with no outgoing links), define the column as 1/n for all entries (uniform random jump).
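As a concrete sketch, the column-stochastic construction above, including the dangling-node rule, can be written in a few lines of NumPy. The function name and toy graph are illustrative, not from the original text:

```python
import numpy as np

def transition_matrix(A):
    """Column-stochastic transition matrix from adjacency matrix A,
    where A[i, j] = 1 means an edge from node j to node i."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    out_deg = A.sum(axis=0)          # out-degree of each node j
    M = np.zeros_like(A)
    for j in range(n):
        if out_deg[j] == 0:          # dangling node: uniform column
            M[:, j] = 1.0 / n
        else:
            M[:, j] = A[:, j] / out_deg[j]
    return M

# Toy graph (0-indexed): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
A = np.array([[0, 0, 1],
              [1, 0, 0],
              [1, 1, 0]])
M = transition_matrix(A)   # every column sums to 1
```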

Definition

PageRank Vector

The PageRank vector π is defined as the stationary distribution of the modified random walk:

\pi = \alpha M \pi + (1 - \alpha) \frac{\mathbf{1}}{n}

where α ∈ (0, 1) is the damping factor (typically α = 0.85) and \mathbf{1}/n is the uniform distribution over all nodes.

Equivalently, π is the solution to:

\pi = \left(\alpha M + \frac{1 - \alpha}{n} \mathbf{1}\mathbf{1}^\top\right) \pi

The matrix

\tilde{M} = \alpha M + \frac{1 - \alpha}{n} \mathbf{1}\mathbf{1}^\top

is the Google matrix. It represents a random walk where, at each step, with probability α you follow a random outgoing link and with probability 1 − α you teleport to a uniformly random node.
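Given a transition matrix M, forming the Google matrix is a one-liner; a sketch assuming NumPy, with an illustrative function name and toy graph:

```python
import numpy as np

def google_matrix(M, alpha=0.85):
    """Damped transition matrix plus uniform teleportation."""
    n = M.shape[0]
    return alpha * M + (1 - alpha) / n * np.ones((n, n))

# A column-stochastic transition matrix for a small 3-node graph
M = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])
G = google_matrix(M)   # every entry is at least (1 - 0.85)/3 = 0.05
```

Note that every entry of G is strictly positive, which is exactly the property the Perron-Frobenius argument below relies on.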

Main Theorems

Theorem

Existence and Uniqueness of PageRank

Statement

The PageRank vector π exists, is unique, and has all positive entries. Specifically, π is the unique probability vector satisfying M̃π = π, where M̃ is the Google matrix defined above.

Intuition

The teleportation term (1 − α)/n ensures that every node can reach every other node (the Markov chain is irreducible) and that the chain is aperiodic. By the Perron-Frobenius theorem, a positive stochastic matrix has a unique stationary distribution with all positive entries. Without teleportation, the random walk could get stuck in cycles or sinks, and the stationary distribution might not be unique.

Proof Sketch

The Google matrix M̃ is a column-stochastic matrix (columns sum to 1) with all entries strictly positive (each entry is at least (1 − α)/n > 0). By the Perron-Frobenius theorem for positive matrices, M̃ has a unique eigenvalue of magnitude 1 (which is 1 itself), and the corresponding eigenvector has all positive entries. Normalizing this eigenvector to sum to 1 gives the unique stationary distribution π.

Why It Matters

Existence and uniqueness mean that PageRank is well-defined for any graph. The positivity of all entries means every node gets some PageRank, which is important for practical applications where you do not want nodes to have zero importance.

Failure Mode

If α = 1 (no teleportation), uniqueness can fail. The random walk on a graph with multiple strongly connected components may have multiple stationary distributions. The damping factor is not just a tuning parameter; it is mathematically necessary for uniqueness.
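This failure mode is easy to demonstrate numerically: on two disjoint 2-cycles, the undamped walk (α = 1) admits more than one stationary distribution. A small NumPy check; the graph is an illustrative example, not from the original text:

```python
import numpy as np

# Two disjoint 2-cycles: 0 <-> 1 and 2 <-> 3.
# M is the undamped transition matrix; each column is a point mass.
M = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Both vectors below are fixed points of M, so the stationary
# distribution is not unique without teleportation.
pi_a = np.array([0.5, 0.5, 0.0, 0.0])
pi_b = np.array([0.0, 0.0, 0.5, 0.5])
```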

Computing PageRank: Power Iteration

Theorem

Power Iteration Convergence for PageRank

Statement

The power iteration π_{t+1} = M̃ π_t converges to the PageRank vector π from any starting distribution π_0. The convergence rate is geometric:

\|\pi_t - \pi\|_1 \leq 2\alpha^t

After t = O(log(1/ε) / log(1/α)) iterations, the error is at most ε.

Intuition

At each step, the random surfer's distribution gets "mixed" by the teleportation. The teleportation mixes at rate 1 − α per step, so after t steps, the unmixed fraction is α^t. For α = 0.85, you need about 50 iterations to reduce the error by a factor of 10^3.
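The geometric bound can be inverted to estimate the iteration count: solving 2α^t ≤ ε for t gives the sketch below (the function name is illustrative):

```python
import math

def iterations_needed(alpha, eps):
    """Smallest t with 2 * alpha**t <= eps, from the geometric bound
    ||pi_t - pi||_1 <= 2 * alpha**t."""
    return math.ceil(math.log(2 / eps) / math.log(1 / alpha))

# For alpha = 0.85 and eps = 1e-3 this lands in the high 40s,
# matching the "about 50 iterations" rule of thumb above.
```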

Proof Sketch

The second-largest eigenvalue of M̃ in absolute value is at most α (the teleportation "shrinks" all eigenvalues except the dominant one toward zero by a factor of α). The error after t steps is bounded by \|\pi_t - \pi\|_1 \leq c \cdot |\lambda_2|^t \leq c \cdot \alpha^t. More precisely, write \pi_0 - \pi = \sum_{k \geq 2} c_k v_k in the eigenbasis. After t multiplications, \pi_t - \pi = \sum_{k \geq 2} c_k \lambda_k^t v_k, and each |\lambda_k| \leq \alpha.

Why It Matters

Power iteration makes PageRank computation practical even for graphs with billions of nodes. Each iteration is a sparse matrix-vector multiplication (cost proportional to the number of edges), and convergence is fast. This is why Google could compute PageRank for the entire web.
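A minimal power-iteration sketch (NumPy assumed, names illustrative): it never materializes the dense Google matrix, only the update π ← αMπ + (1 − α)/n, which is why a sparse M keeps each step at cost proportional to the number of edges.

```python
import numpy as np

def pagerank(M, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank. M must be column-stochastic."""
    n = M.shape[0]
    pi = np.full(n, 1.0 / n)              # start from uniform
    for _ in range(max_iter):
        new = alpha * (M @ pi) + (1 - alpha) / n
        if np.abs(new - pi).sum() < tol:  # L1 convergence check
            return new
        pi = new
    return pi

# Toy graph (0-indexed): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
M = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])
pi = pagerank(M)   # a probability vector over the 3 nodes
```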

Failure Mode

Convergence slows as α → 1. For α = 0.99, you need about 16 times as many iterations as for α = 0.85 (the ratio is log(1/0.85) / log(1/0.99) ≈ 16). There is a trade-off: larger α gives a PageRank that more faithfully reflects the link structure, but takes longer to compute and is more sensitive to graph structure.

The Eigenvector Interpretation

PageRank is the dominant eigenvector of the Google matrix M̃. This connects PageRank to spectral graph theory.

The eigenvalue equation M̃π = π says that π is an eigenvector with eigenvalue 1. Since M̃ is a positive stochastic matrix, 1 is the largest eigenvalue (by Perron-Frobenius), and π is the corresponding eigenvector.

This means PageRank can also be understood as the solution to an eigenvalue problem: find the eigenvector of the modified adjacency matrix with the largest eigenvalue. This connects to the broader theory of spectral methods on graphs.

The Role of the Damping Factor

The damping factor α controls the balance between two forces:

  • Link following (α close to 1): PageRank heavily reflects the link structure. Important nodes are those that receive links from other important nodes (recursive definition). Risk: sensitive to graph structure, susceptible to link spam.

  • Teleportation (α close to 0): PageRank approaches the uniform distribution. Every node is roughly equally important. Safe but uninformative.

The standard choice α = 0.85 means that at each step, the random surfer follows a link 85% of the time and teleports 15% of the time. This was found empirically to give good results for web search.
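The two regimes are easy to see numerically; a toy comparison (NumPy assumed, fixed iteration count with no convergence check, for brevity):

```python
import numpy as np

def pagerank(M, alpha, iters=1000):
    """Bare power iteration: pi <- alpha * M @ pi + (1 - alpha)/n."""
    n = M.shape[0]
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        pi = alpha * (M @ pi) + (1 - alpha) / n
    return pi

# Toy graph (0-indexed): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
M = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])
near_uniform = pagerank(M, alpha=0.01)   # teleportation dominates
link_driven = pagerank(M, alpha=0.99)    # link structure dominates
```

With α near 0 the scores are nearly 1/3 each; with α near 1 the spread between nodes is much larger.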

Applications Beyond Web Search

Citation networks: Nodes are papers, edges are citations. PageRank identifies influential papers that are cited by other influential papers. Unlike simple citation counts, PageRank weights citations by the importance of the citing paper.

Social networks: Nodes are users, edges are follow/friend relationships. PageRank identifies influential users. Variants like Personalized PageRank (replace uniform teleportation with teleportation biased toward a specific user) power recommendation systems.

Knowledge graphs: Nodes are entities, edges are relations. PageRank helps identify central entities for question answering and information retrieval.

Recommendation systems: Random walks on user-item graphs, where the stationary distribution (or a personalized variant) ranks items by relevance to a user.

Common Confusions

Watch Out

PageRank is not just counting incoming links

A node with many low-quality links can have lower PageRank than a node with few high-quality links. PageRank is recursive: a link from an important node transfers more importance than a link from an unimportant node. This is the key insight that made PageRank work for web search.

Watch Out

The damping factor is not optional

Without damping (α = 1), the random walk can get trapped in sinks (nodes with no outgoing links) or cycle between disconnected components. The stationary distribution may not be unique. The damping factor is not a regularization hack; it is a mathematical necessity for well-definedness.

Watch Out

PageRank and eigenvector centrality are not identical

Eigenvector centrality is the dominant eigenvector of the adjacency matrix A. PageRank is the dominant eigenvector of the Google matrix M̃, which normalizes by out-degree and adds teleportation. PageRank is a refined version of eigenvector centrality that handles directed graphs, dangling nodes, and disconnected components.

Summary

  • PageRank is the stationary distribution of a random walk with teleportation
  • The Google matrix is \tilde{M} = \alpha M + \frac{1 - \alpha}{n} \mathbf{1}\mathbf{1}^\top
  • Damping factor α = 0.85 controls link-following vs teleportation
  • Existence and uniqueness follow from Perron-Frobenius (positive stochastic matrix)
  • Power iteration converges geometrically at rate α
  • PageRank = dominant eigenvector of the Google matrix
  • Applications: citation networks, social networks, knowledge graphs, recommendation systems

Exercises

ExerciseCore

Problem

Consider a 3-node graph: node 1 links to nodes 2 and 3, node 2 links to node 3, and node 3 links to node 1. Write down the transition matrix M and the Google matrix M̃ with α = 0.85.

ExerciseAdvanced

Problem

Prove that if α = 1 (no teleportation), the PageRank may not be unique. Give an example of a graph where the random walk has multiple stationary distributions.

ExerciseAdvanced

Problem

How many power iterations are needed to compute PageRank to within L_1 error ε = 10^{-6} when α = 0.85? What about α = 0.99?

References

Canonical:

  • Page, Brin, Motwani, Winograd, "The PageRank Citation Ranking: Bringing Order to the Web" (1999)
  • Langville & Meyer, Google's PageRank and Beyond (2006)

Current:

  • Gleich, "PageRank Beyond the Web" (SIAM Review, 2015)

  • Hamilton, Graph Representation Learning (2020), Chapter 2

  • Bishop, Pattern Recognition and Machine Learning (2006), Chapters 1-14

Next Topics

The natural next steps from PageRank:

  • Spectral clustering: using eigenvectors of graph matrices for clustering
  • Graph neural networks: generalizing PageRank-style message passing with learned functions
  • Markov chains: the full theory of random walks and stationary distributions

Last reviewed: April 2026
