Mathematical Infrastructure
Martingale Theory
Martingales and their convergence properties: Doob martingale, optional stopping theorem, martingale convergence, Azuma-Hoeffding inequality, and Freedman inequality. The tools behind McDiarmid's inequality, online learning regret bounds, and stochastic approximation.
Why This Matters
Martingales are the mathematical framework for "fair games" --- sequences of random variables where the conditional expectation of the next value, given the past, equals the current value. This seemingly simple definition has extraordinarily powerful consequences.
In machine learning theory, martingales are everywhere:
- McDiarmid's inequality (the key concentration tool for generalization bounds) is proved using the Azuma-Hoeffding inequality for martingales
- Online learning regret bounds use martingale arguments to control the cumulative loss of adaptive algorithms
- Stochastic approximation (SGD convergence theory) uses martingale convergence theorems to show that noisy gradient estimates converge
- Sequential hypothesis testing uses optional stopping to determine when to stop collecting data
- PAC-Bayes bounds with data-dependent priors use martingale constructions to handle adaptivity
If concentration inequalities are the bread of ML theory, martingales are the yeast that makes them rise.
Mental Model
Think of a gambler playing a sequence of fair games. After each round, the expected value of their fortune equals their current fortune --- no strategy can create an expected gain from a fair game. This is the martingale property.
The power comes from what you can prove about such sequences: they cannot stray too far from their starting point (concentration), they converge to a limit (convergence theorem), and stopping at a clever time cannot create an expected profit (optional stopping).
Formal Setup
Filtration
A filtration is an increasing sequence of sigma-algebras:

$$\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots$$

$\mathcal{F}_n$ represents the "information available at time $n$." A random variable $X$ is $\mathcal{F}_n$-measurable if its value is determined by the information in $\mathcal{F}_n$. The natural filtration of a sequence $(X_1, X_2, \dots)$ is $\mathcal{F}_n = \sigma(X_1, \dots, X_n)$ --- the sigma-algebra generated by the first $n$ observations.
Martingale
A sequence of random variables $(M_n)_{n \ge 0}$ is a martingale with respect to filtration $(\mathcal{F}_n)_{n \ge 0}$ if:
- $M_n$ is $\mathcal{F}_n$-measurable (adapted) for all $n$
- $\mathbb{E}[|M_n|] < \infty$ for all $n$ (integrability)
- $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = M_n$ for all $n$ (martingale property)
Submartingale: condition 3 is replaced by $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] \ge M_n$ (expected upward drift).
Supermartingale: condition 3 is replaced by $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] \le M_n$ (expected downward drift).
The martingale property says: the best prediction of the future value, given all information up to now, is the current value. No trend, no drift --- a "fair game."
Martingale Difference Sequence
A sequence $(D_n)_{n \ge 1}$ is a martingale difference sequence (MDS) with respect to $(\mathcal{F}_n)$ if $D_n$ is $\mathcal{F}_n$-measurable, $\mathbb{E}[|D_n|] < \infty$, and:

$$\mathbb{E}[D_n \mid \mathcal{F}_{n-1}] = 0 \quad \text{for all } n.$$

If $(M_n)$ is a martingale, then $D_n = M_n - M_{n-1}$ is an MDS. The martingale is the cumulative sum: $M_n = M_0 + \sum_{i=1}^n D_i$.
MDSs are the martingale analogue of i.i.d. mean-zero random variables, but with two key differences: the $D_n$ are allowed to be dependent, and their conditional variance can change over time.
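A small simulation makes the distinction concrete. The sketch below (standard library only; the step-size rule is an illustrative choice, not from the text) builds an MDS whose increment magnitude depends on the past, yet whose conditional mean is zero by construction, so the cumulative sums still average out to the starting value:

```python
import random

def simulate_mds_path(n_steps, seed=0):
    """Path of M_t = D_1 + ... + D_t where the D_t form a martingale
    difference sequence: each D_t has conditional mean zero, but its
    magnitude depends on the past through the running sum."""
    rng = random.Random(seed)
    m, path = 0.0, [0.0]
    for _ in range(n_steps):
        step = 1.0 + 0.1 * abs(m)                  # past-dependent scale
        d = step if rng.random() < 0.5 else -step  # E[D_t | past] = 0
        m += d
        path.append(m)
    return path

# Averaging many independent paths: the mean at every time stays near
# M_0 = 0, even though the increments are neither i.i.d. nor of
# constant variance.
n_paths, n_steps = 2000, 50
means = [0.0] * (n_steps + 1)
for seed in range(n_paths):
    for t, v in enumerate(simulate_mds_path(n_steps, seed=seed)):
        means[t] += v / n_paths
```

The per-path variance grows over time (the steps get larger), which is exactly the regime where variance-sensitive tools like Freedman's inequality below pay off.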
The Doob Martingale
Doob Martingale
Let $X = (X_1, \dots, X_n)$ be a random vector and $f$ a function with $\mathbb{E}[|f(X)|] < \infty$. The Doob martingale (or exposure martingale) is:

$$M_i = \mathbb{E}\big[f(X) \mid X_1, \dots, X_i\big], \qquad i = 0, 1, \dots, n,$$

with $M_0 = \mathbb{E}[f(X)]$ and $M_n = f(X)$.
This is a martingale: $\mathbb{E}[M_{i+1} \mid \mathcal{F}_i] = \mathbb{E}\big[\mathbb{E}[f(X) \mid \mathcal{F}_{i+1}] \mid \mathcal{F}_i\big] = \mathbb{E}[f(X) \mid \mathcal{F}_i] = M_i$ by the tower property of conditional expectation.
The Doob martingale is the fundamental construction for proving concentration inequalities. It "reveals" the random variables one at a time: $M_0$ is the unconditional expectation (no information), $M_n$ is the actual value (full information), and each step adds one variable's worth of information.
The martingale differences are $D_i = M_i - M_{i-1}$. These measure how much the conditional expectation of $f(X)$ changes when $X_i$ is revealed. If $f$ has bounded differences ($|D_i| \le c_i$), the Azuma-Hoeffding inequality gives exponential concentration.
Main Theorems
Azuma-Hoeffding Inequality
Statement
Let $(M_n)$ be a martingale with bounded increments: $|M_i - M_{i-1}| \le c_i$ almost surely for $i = 1, \dots, n$. Then for any $t > 0$:

$$\mathbb{P}\big(|M_n - M_0| \ge t\big) \le 2 \exp\!\left(-\frac{t^2}{2 \sum_{i=1}^n c_i^2}\right).$$

If all $c_i = c$ (equal bounds), this simplifies to $2\exp(-t^2 / (2nc^2))$.
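As a quick sanity check (an illustrative sketch, standard library only), the bound can be compared against Monte Carlo tail frequencies for the simplest bounded-increment martingale, a fair $\pm 1$ random walk:

```python
import math
import random

def azuma_bound(t, c_list):
    """Two-sided Azuma-Hoeffding bound on P(|M_n - M_0| >= t)."""
    return 2.0 * math.exp(-t ** 2 / (2.0 * sum(c * c for c in c_list)))

# A fair +/-1 random walk is a martingale with increments bounded by 1.
rng = random.Random(0)
n, trials, t = 100, 5000, 25.0
exceed = sum(
    abs(sum(1 if rng.random() < 0.5 else -1 for _ in range(n))) >= t
    for _ in range(trials)
)
empirical = exceed / trials        # true tail is around 0.01 here
bound = azuma_bound(t, [1.0] * n)  # 2 * exp(-625/200), about 0.088
```

The empirical tail sits well below the bound, as it must: Azuma-Hoeffding is a worst-case guarantee over all martingales with the given increment bounds.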
Intuition
A martingale with bounded increments (each step moves by at most $c_i$) cannot deviate far from its starting point with high probability. The deviation is controlled by $\big(\sum_{i=1}^n c_i^2\big)^{1/2}$ --- the "diffusion scale" of the random walk. This is the martingale analogue of Hoeffding's inequality for sums of independent bounded random variables.
Proof Sketch
Use the exponential supermartingale technique:
Step 1: For any $\lambda > 0$, define $Z_k = \exp\big(\lambda(M_k - M_0) - \tfrac{\lambda^2}{2}\sum_{i \le k} c_i^2\big)$, where the normalization is chosen so that $(Z_k)$ is a supermartingale.
Step 2: Since $\mathbb{E}[D_i \mid \mathcal{F}_{i-1}] = 0$ and $|D_i| \le c_i$, Hoeffding's lemma gives $\mathbb{E}[e^{\lambda D_i} \mid \mathcal{F}_{i-1}] \le e^{\lambda^2 c_i^2 / 2}$.
Therefore $\mathbb{E}[e^{\lambda(M_n - M_0)}] \le e^{\lambda^2 \sigma^2 / 2}$ with $\sigma^2 = \sum_{i=1}^n c_i^2$.
Step 3: Apply Markov's inequality: $\mathbb{P}(M_n - M_0 \ge t) \le e^{-\lambda t}\, \mathbb{E}[e^{\lambda(M_n - M_0)}] \le e^{-\lambda t + \lambda^2 \sigma^2 / 2}$.
Step 4: Optimize over $\lambda$: set $\lambda = t / \sigma^2$ to get $\mathbb{P}(M_n - M_0 \ge t) \le e^{-t^2 / (2\sigma^2)}$.
Why It Matters
Azuma-Hoeffding is the engine behind McDiarmid's inequality. If $f(X_1, \dots, X_n)$ has bounded differences (changing $X_i$ changes $f$ by at most $c_i$), then the Doob martingale has bounded increments $|M_i - M_{i-1}| \le c_i$, and Azuma-Hoeffding gives:

$$\mathbb{P}\big(|f(X) - \mathbb{E}[f(X)]| \ge t\big) \le 2 \exp\!\left(-\frac{t^2}{2 \sum_{i=1}^n c_i^2}\right).$$

(A sharper handling of the conditional ranges improves the exponent to $2t^2 / \sum_i c_i^2$, which is the usual statement of McDiarmid's inequality.)
This is how generalization bounds, Rademacher complexity concentration, and many other ML results are proved.
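To see the bound in use, here is a minimal sketch (function name is illustrative) evaluating the bounded-differences bound for the empirical mean, where each $c_i = 1/n$:

```python
import math

def bounded_difference_bound(t, c_list):
    """Concentration of f(X_1..X_n) around its mean via the Doob
    martingale + Azuma-Hoeffding: if changing the i-th input moves f by
    at most c_i, then P(|f - E f| >= t) <= 2 exp(-t^2 / (2 sum c_i^2))."""
    return 2.0 * math.exp(-t ** 2 / (2.0 * sum(c * c for c in c_list)))

# Empirical mean of n variables in [0, 1]: each c_i = 1/n, so
# sum c_i^2 = 1/n and the bound decays like exp(-n t^2 / 2).
n, t = 1000, 0.1
bound = bounded_difference_bound(t, [1.0 / n] * n)  # 2 * exp(-5)
```

The same function applies to any bounded-differences statistic (empirical risk, Rademacher averages, and so on); only the list of $c_i$ changes.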
Failure Mode
Azuma-Hoeffding uses only the worst-case increment bound $c_i$. If the increments are typically much smaller than $c_i$ (small variance but bounded range), the bound is loose. The Freedman inequality below is tighter in this case because it uses the variance of the increments.
Doob's Martingale Convergence Theorem
Statement
Let $(M_n)$ be a supermartingale with $\sup_n \mathbb{E}[|M_n|] < \infty$ (equivalently, for a non-negative supermartingale, no extra condition is needed). Then there exists a random variable $M_\infty$ with $\mathbb{E}[|M_\infty|] < \infty$ such that:

$$M_n \to M_\infty \quad \text{almost surely.}$$

For a martingale bounded in $L^2$ ($\sup_n \mathbb{E}[M_n^2] < \infty$), the convergence also holds in $L^2$: $\mathbb{E}[(M_n - M_\infty)^2] \to 0$.
Intuition
A non-negative supermartingale is a "wealth process" for a gambler playing unfavorable games: expected wealth decreases over time, and the wealth cannot go below zero. Such a process must eventually settle down --- it cannot keep fluctuating forever because it has no expected upward drift and it is bounded below. The almost-sure limit captures the gambler's long-run fortune.
For bounded martingales, the convergence is stronger: the sequence not only converges pointwise but the expected deviation from the limit goes to zero.
Proof Sketch
The proof uses Doob's upcrossing inequality: let $U_n[a, b]$ be the number of times the sequence crosses upward from below $a$ to above $b$ by time $n$ (with $a < b$). Then:

$$(b - a)\,\mathbb{E}\big[U_n[a, b]\big] \le \mathbb{E}\big[(M_n - a)^-\big] \le |a| + \sup_k \mathbb{E}[|M_k|].$$

If $\sup_n \mathbb{E}[|M_n|] < \infty$, the right side is bounded uniformly in $n$, so $U_\infty[a, b] < \infty$ a.s. for all rational $a < b$.
A sequence with finitely many upcrossings of every interval must converge (it cannot oscillate between any two values infinitely often). Therefore $M_n \to M_\infty$ a.s.
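The upcrossing count is easy to observe empirically. The sketch below (standard library only; the drifted-walk example and interval $[2, 5]$ are illustrative choices) counts upcrossings for a non-negative supermartingale and checks the average against the $a/(b-a)$ bound that follows from non-negativity:

```python
import random

def count_upcrossings(path, a, b):
    """Number of completed upcrossings of [a, b]: times the path moves
    from a value <= a to a later value >= b."""
    count, below = 0, False
    for x in path:
        if x <= a:
            below = True
        elif x >= b and below:
            count += 1
            below = False
    return count

# Non-negative supermartingale: a +/-1 walk with downward drift
# (P(up) = 0.4), started at 10 and absorbed at 0.  Since M_n >= 0,
# (M_n - 2)^- <= 2, so Doob's inequality gives E[U_n[2, 5]] <= 2/3
# uniformly in n: paths cannot oscillate across [2, 5] forever.
rng = random.Random(0)
a, b, trials, horizon = 2, 5, 2000, 3000
total = 0
for _ in range(trials):
    m, path = 10, [10]
    for _ in range(horizon):
        if m == 0:
            break                        # absorbed: stays at 0 forever
        m += 1 if rng.random() < 0.4 else -1
        path.append(m)
    total += count_upcrossings(path, a, b)
mean_upcrossings = total / trials        # comfortably below 2/3
```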
Why It Matters
In ML theory, martingale convergence appears in:
- Stochastic approximation: for SGD with decreasing step sizes, a suitable error quantity of the iterates is (up to summable perturbations) a non-negative supermartingale (the Robbins-Siegmund argument), so the convergence theorem guarantees convergence
- Online learning: the regret of certain algorithms, properly normalized, is a supermartingale that converges, giving per-round regret bounds
- Bayesian consistency: posterior beliefs, viewed as a martingale indexed by the amount of data, converge to the truth under regularity conditions
Failure Mode
Almost-sure convergence does not imply convergence in $L^1$ in general. The classic counterexample: let $M_n$ be $2^n$ with probability $2^{-n}$ and $0$ with probability $1 - 2^{-n}$ (the "double-or-nothing" martingale: start at $1$; at each step double with probability $1/2$, otherwise drop to $0$ and stay there), arranged so $\mathbb{E}[M_n] = 1$ for all $n$. Then $M_n \to 0$ a.s. But $\mathbb{E}[M_n] = 1 \not\to 0 = \mathbb{E}[M_\infty]$. For $L^1$ convergence, you need uniform integrability.
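A short standard-library simulation of the double-or-nothing martingale (start at 1, double with probability 1/2, else drop to 0 forever) makes the failure visible: the mean is 1 at every step, yet essentially every sampled path ends at 0.

```python
import random

def double_or_nothing(n_steps, rng):
    """Martingale with M_0 = 1: at each step the fortune doubles with
    probability 1/2 or drops to 0 and stays there.  E[M_n] = 1 for all
    n (M_n is 2^n with probability 2^-n), yet M_n -> 0 almost surely."""
    m = 1.0
    for _ in range(n_steps):
        if m == 0.0:
            break
        m = 2.0 * m if rng.random() < 0.5 else 0.0
    return m

rng = random.Random(42)
trials, n = 100000, 30
final = [double_or_nothing(n, rng) for _ in range(trials)]
frac_zero = sum(1 for m in final if m == 0.0) / trials
# frac_zero is essentially 1: the expectation E[M_30] = 1 is carried
# entirely by paths of probability 2^-30, which is why the sample mean
# is a hopeless estimator here and why L^1 convergence fails.
```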
Optional Stopping Theorem
Statement
Let $(M_n)$ be a martingale and $\tau$ a stopping time with respect to $(\mathcal{F}_n)$. If $\tau \le T$ almost surely for some constant $T$ (bounded stopping time), then:

$$\mathbb{E}[M_\tau] = \mathbb{E}[M_0].$$

More generally, if $\tau$ is not bounded but: (a) $\mathbb{E}[\tau] < \infty$, and (b) there exists $c$ such that $\mathbb{E}[|M_{n+1} - M_n| \mid \mathcal{F}_n] \le c$ for all $n$, then $\mathbb{E}[M_\tau] = \mathbb{E}[M_0]$.
Intuition
You cannot beat a fair game by choosing when to stop. If the game is fair at every step ($\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = M_n$), then your expected payoff when you stop equals your starting capital, no matter how cleverly you choose when to stop (as long as you must stop in bounded time).
The caveat "bounded time" is critical. With unbounded stopping times, you can have $\mathbb{E}[M_\tau] \ne \mathbb{E}[M_0]$ --- the classic example is the doubling strategy in gambling, which requires infinite credit.
Proof Sketch
For a bounded stopping time $\tau \le T$, write:

$$M_\tau = M_0 + \sum_{n=0}^{T-1} (M_{n+1} - M_n)\,\mathbf{1}\{\tau > n\}.$$

Since $\tau$ is a stopping time, the event $\{\tau > n\}$ is $\mathcal{F}_n$-measurable, and:

$$\mathbb{E}\big[(M_{n+1} - M_n)\,\mathbf{1}\{\tau > n\}\big] = \mathbb{E}\big[\mathbf{1}\{\tau > n\}\,\mathbb{E}[M_{n+1} - M_n \mid \mathcal{F}_n]\big] = 0.$$

So $\mathbb{E}[M_\tau] = \mathbb{E}[M_0]$.
Why It Matters
Optional stopping is used in sequential analysis and online learning, where you must decide when to stop based on observed data. In sequential hypothesis testing (Wald's SPRT), the likelihood ratio process is a martingale under the null, and optional stopping gives error probability guarantees. In anytime-valid confidence sequences, martingale constructions provide confidence bounds that hold at arbitrary data-dependent stopping times.
Failure Mode
The boundedness condition is not a technicality. The simple random walk $S_n = \sum_{i=1}^n \epsilon_i$ (with $\epsilon_i = \pm 1$ equally likely) is a martingale with $S_0 = 0$. Let $\tau = \inf\{n : S_n = 1\}$. Then $S_\tau = 1$ always, so $\mathbb{E}[S_\tau] = 1 \ne 0 = \mathbb{E}[S_0]$. The problem: $\tau$ is not bounded (and $\mathbb{E}[\tau] = \infty$).
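The contrast is easy to simulate (a sketch, standard library only; the truncation level `cap` is an arbitrary choice). Truncating the hitting time at a constant restores the theorem: almost every path does reach $+1$, but the rare paths that have not yet hit it sit far in the negative and exactly cancel the gains on average.

```python
import random

# Fair +/-1 walk S_n; tau = first time S_n = 1, truncated at cap so the
# stopping time min(tau, cap) is bounded.  Optional stopping applies to
# the truncated time: E[S_{min(tau, cap)}] = 0, even though almost every
# path eventually hits +1.
rng = random.Random(1)
trials, cap = 10000, 10000
total, hit = 0.0, 0
for _ in range(trials):
    s, steps = 0, 0
    while s != 1 and steps < cap:
        s += 1 if rng.random() < 0.5 else -1
        steps += 1
    total += s
    hit += (s == 1)
mean_stopped = total / trials  # near 0: rare unstopped paths end far negative
hit_frac = hit / trials        # near 1: most paths do reach +1
```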
The Freedman Inequality (Variance-Sensitive)
Freedman Inequality
Statement
Let $(M_n)$ be a martingale with $M_0 = 0$ and increments bounded by $|M_i - M_{i-1}| \le b$ almost surely. Define the predictable quadratic variation:

$$\langle M \rangle_n = \sum_{i=1}^n \mathbb{E}\big[(M_i - M_{i-1})^2 \mid \mathcal{F}_{i-1}\big].$$

Then for any $t > 0$ and $\sigma^2 > 0$:

$$\mathbb{P}\big(M_n \ge t \ \text{and}\ \langle M \rangle_n \le \sigma^2\big) \le \exp\!\left(-\frac{t^2}{2(\sigma^2 + bt/3)}\right).$$
Intuition
Azuma-Hoeffding uses only the worst-case increment bound $b$. But if the increments are typically much smaller (low conditional variance), the martingale concentrates much better than Azuma-Hoeffding predicts. Freedman's inequality captures this by depending on the actual cumulative conditional variance $\langle M \rangle_n$ rather than the worst-case bound $n b^2$.
When $\sigma^2 \gg bt$: the bound becomes $\exp(-t^2 / (2\sigma^2))$, a Gaussian-type tail (sub-Gaussian with parameter $\sigma^2$). When $bt \gg \sigma^2$: the bound becomes $\exp(-3t / (2b))$, a Poisson-type tail (sub-exponential).
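The gap between the two bounds is striking even at modest scales. A sketch (the specific values of $n$, $b$, $\sigma^2$, $t$ are illustrative) comparing the two tail bounds in a low-variance regime:

```python
import math

def azuma_tail(t, n, b):
    """One-sided Azuma-Hoeffding: P(M_n - M_0 >= t) <= exp(-t^2/(2 n b^2))."""
    return math.exp(-t ** 2 / (2.0 * n * b * b))

def freedman_tail(t, sigma2, b):
    """Freedman: P(M_n >= t and <M>_n <= sigma2)
    <= exp(-t^2 / (2 (sigma2 + b t / 3)))."""
    return math.exp(-t ** 2 / (2.0 * (sigma2 + b * t / 3.0)))

# Low-variance regime: n = 1000 steps, increments bounded by b = 1, but
# cumulative conditional variance only sigma2 = 10 (vs. worst case
# n * b^2 = 1000).
n, b, sigma2, t = 1000, 1.0, 10.0, 30.0
loose = azuma_tail(t, n, b)          # exp(-900/2000): nearly vacuous
tight = freedman_tail(t, sigma2, b)  # exp(-900/40): astronomically small
```

Here Azuma-Hoeffding gives a bound around $0.64$ while Freedman gives roughly $10^{-10}$: eight-plus orders of magnitude, purely from tracking the variance.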
Proof Sketch
The proof follows the same exponential supermartingale strategy as Azuma-Hoeffding, but with a tighter bound on the moment generating function of the increments. Instead of using Hoeffding's lemma (which only uses the range), use Bennett's inequality for the MGF:

$$\mathbb{E}\big[e^{\lambda D_i} \mid \mathcal{F}_{i-1}\big] \le \exp\!\left(\frac{e^{\lambda b} - 1 - \lambda b}{b^2}\,\mathbb{E}[D_i^2 \mid \mathcal{F}_{i-1}]\right),$$

where $D_i = M_i - M_{i-1}$. Summing the exponents over $i$ gives a bound in terms of $\langle M \rangle_n$ instead of $\sum_i c_i^2$. Optimizing over $\lambda$ gives the stated result.
Why It Matters
Freedman's inequality is the variance-sensitive analogue of Azuma-Hoeffding. It is tighter whenever the conditional variances are much smaller than the worst-case increment bounds --- which is common in practice.
In ML theory: when proving regret bounds for online learning algorithms, the per-round loss differences are bounded but often have low variance (the algorithm makes mostly good predictions). Freedman gives tighter regret bounds than Azuma in these cases.
Failure Mode
The bound requires knowing (or bounding) the predictable quadratic variation $\langle M \rangle_n$, which may itself be random. In practice, you often bound $\langle M \rangle_n$ by a deterministic quantity, and then the Freedman bound reduces to a variance-sensitive version of Azuma.
Proof Ideas and Templates Used
Martingale proofs in ML theory follow several standard patterns:
- Doob martingale + Azuma-Hoeffding: to prove concentration of a function $f(X_1, \dots, X_n)$, construct the Doob martingale, bound the increments, and apply Azuma. This is how McDiarmid's inequality is proved.
- Exponential supermartingale: define $Z_n = e^{\lambda(M_n - M_0)}$, suitably normalized, and show it is a supermartingale. This is the universal technique for deriving concentration inequalities from martingale bounds.
- Stopping time arguments: to prove properties of adaptive procedures (online learning, sequential testing), formulate the quantity of interest as a martingale and apply optional stopping or convergence theorems.
Canonical Examples
Doob martingale for the empirical mean
Let $X_1, \dots, X_n$ be i.i.d. with $\mathbb{E}[X_i] = \mu$ and $X_i \in [0, 1]$. Let $f(X_1, \dots, X_n) = \frac{1}{n}\sum_{i=1}^n X_i$.
The Doob martingale is: $M_i = \mathbb{E}[f(X) \mid X_1, \dots, X_i] = \frac{1}{n}\sum_{j=1}^i X_j + \frac{n-i}{n}\mu$.
The increment is $D_i = M_i - M_{i-1} = (X_i - \mu)/n$, which satisfies $|D_i| \le 1/n$. By Azuma-Hoeffding:

$$\mathbb{P}\!\left(\left|\frac{1}{n}\sum_{i=1}^n X_i - \mu\right| \ge t\right) \le 2 \exp\!\left(-\frac{t^2}{2n \cdot (1/n)^2}\right) = 2 e^{-n t^2 / 2}.$$

This recovers Hoeffding's inequality (up to the constant in the exponent) via the martingale route.
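The construction above can be computed explicitly for a sampled dataset. A minimal sketch (uniform inputs and the helper name are illustrative choices) builds the Doob path and verifies that each increment is exactly $(x_i - \mu)/n$:

```python
import random

def doob_path(xs, mu):
    """Doob martingale for the empirical mean: M_i = E[mean(X) | X_1..X_i]
    = (x_1 + ... + x_i)/n + ((n - i)/n) * mu.  Returns [M_0, ..., M_n]."""
    n = len(xs)
    partial, path = 0.0, []
    for i in range(n + 1):
        path.append(partial / n + (n - i) * mu / n)
        if i < n:
            partial += xs[i]
    return path

rng = random.Random(0)
n, mu = 50, 0.5                        # X_i ~ Uniform[0, 1], so mu = 1/2
xs = [rng.random() for _ in range(n)]
path = doob_path(xs, mu)
increments = [path[i + 1] - path[i] for i in range(n)]
# M_0 = mu (no information), M_n = realized empirical mean (full
# information), and each increment equals (x_i - mu)/n, so |D_i| <= 1/n.
```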
Common Confusions
Martingale is not the same as i.i.d. sum
A sum of i.i.d. mean-zero random variables is a martingale, but not all martingales are i.i.d. sums. The increments $D_n$ can depend on the past through $\mathcal{F}_{n-1}$. What is required is only that $\mathbb{E}[D_n \mid \mathcal{F}_{n-1}] = 0$ --- the conditional mean is zero. The conditional variance, distribution, and higher moments can all depend on the past.
Optional stopping requires conditions
The conclusion does not hold for all stopping times. The stopping time must be bounded, or more general conditions (finite expectation + bounded increments) must hold. Ignoring this leads to paradoxes (doubling strategy, St. Petersburg paradox).
Convergence in L1 is stronger than a.s. convergence for martingales
Doob's convergence theorem gives a.s. convergence for $L^1$-bounded supermartingales. But the limit may satisfy $\mathbb{E}[M_\infty] \ne \lim_n \mathbb{E}[M_n]$ (the expectation can "leak" to infinity). For $L^1$ convergence ($\mathbb{E}[|M_n - M_\infty|] \to 0$), you need uniform integrability, which is a strictly stronger condition.
Summary
- Martingale: $\mathbb{E}[M_{n+1} \mid \mathcal{F}_n] = M_n$ (fair game)
- Doob martingale: $M_i = \mathbb{E}[f(X) \mid X_1, \dots, X_i]$ exposes variables one at a time
- Azuma-Hoeffding: a bounded-increment martingale concentrates: $\mathbb{P}(|M_n - M_0| \ge t) \le 2\exp\big(-t^2 / (2\sum_i c_i^2)\big)$
- Freedman: variance-sensitive refinement, uses $\langle M \rangle_n$ instead of $\sum_i c_i^2$
- Optional stopping: $\mathbb{E}[M_\tau] = \mathbb{E}[M_0]$ for bounded stopping times
- Doob convergence: $L^1$-bounded supermartingales converge a.s.
- McDiarmid's inequality = Doob martingale + Azuma-Hoeffding
- Martingales handle dependent sequences, not just i.i.d. sums
Exercises
Problem
Let $X_1, X_2, \dots$ be i.i.d. with $\mathbb{E}[X_i] = 0$ and $|X_i| \le 1$. Define $S_n = \sum_{i=1}^n X_i$. Show that $(S_n)$ is a martingale with respect to the natural filtration and apply Azuma-Hoeffding to bound $\mathbb{P}(|S_n| \ge t)$.
Problem
Use the Doob martingale and Azuma-Hoeffding to prove McDiarmid's inequality: if $f$ satisfies the bounded differences condition $|f(x_1, \dots, x_i, \dots, x_n) - f(x_1, \dots, x_i', \dots, x_n)| \le c_i$ for each $i$, then for independent $X_1, \dots, X_n$:

$$\mathbb{P}\big(|f(X_1, \dots, X_n) - \mathbb{E}[f(X_1, \dots, X_n)]| \ge t\big) \le 2\exp\!\left(-\frac{2t^2}{\sum_{i=1}^n c_i^2}\right).$$
Problem
Compare Azuma-Hoeffding and Freedman for the following setting: a martingale with $n$ steps, bounded increments $|M_i - M_{i-1}| \le b$, but conditional variance $\mathbb{E}[(M_i - M_{i-1})^2 \mid \mathcal{F}_{i-1}] \le \sigma^2 \ll b^2$ for all $i$ (almost all of each increment's distribution is concentrated near zero). Compute the Azuma-Hoeffding and Freedman bounds for a deviation level $t$ of your choice and comment on the difference.
References
Canonical:
- Durrett, Probability: Theory and Examples (5th ed., 2019), Chapter 4
- Williams, Probability with Martingales (1991) --- the standard introductory text
Current:
- Wainwright, High-Dimensional Statistics (2019), Chapter 2.5
- Freedman, "On Tail Probabilities for Martingales" (Annals of Probability, 1975)
- de la Peña, "A General Class of Exponential Inequalities for Martingales and Ratios" (Annals of Probability, 1999)
Next Topics
From martingale theory, the natural next steps are:
- McDiarmid's inequality: the direct application of Doob + Azuma-Hoeffding to bounded-difference functions
- Concentration inequalities: the broader toolkit that martingales support
- Stochastic calculus for ML: the continuous-time extension of martingale theory, with applications to diffusion models and SGD analysis
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Measure-Theoretic Probability (Layer 0B)