e-processes

Sneiderman, Robby

The sequential version of an e-value: a nonnegative process whose value at every stopping time is an e-value for the null. The standard construction is the running product of conditional e-values adapted to a filtration, which is a nonnegative supermartingale under the null. The object that powers anytime-valid inference, sequential testing, and confidence sequences.

Advanced · Research · Tier 1 · Current · Core spine · ~55 min

For: ML, Stats, Research

Prerequisites

E-Values, Martingale Theory, Measure-Theoretic Probability, Stochastic Processes ML

Why This Matters

An e-value is a single nonnegative statistic with $\mathbb{E}_{H_0}[E] \leq 1$ . An e-process is the sequential version: a collection $(E_t)_{t \geq 0}$ where each $E_t$ is an e-value with respect to the information available at time $t$ , and the construction is consistent across times so that the value at any stopping time $\tau$ is still an e-value. The construction lets a tester accumulate evidence, peek as often as they like, stop whenever the evidence is convincing, and retain the same Type I error guarantee they would have had at a single fixed sample size.

The technical bridge is one of the cleanest results in modern statistics. If $B_s$ is a conditional e-value at time $s$ given the past, the running product $E_t = \prod_{s=1}^t B_s$ is a nonnegative supermartingale under the null. Ville's inequality bounds the maximum of a nonnegative supermartingale: $\Pr_{H_0}(\sup_t E_t \geq 1/\alpha) \leq \alpha$ . The supremum is over all times, so the bound covers any stopping time the analyst might choose adaptively. Optional stopping, optional continuation, and data-dependent decision rules all preserve the bound.

This is the technical content of "anytime-valid inference." Every confidence sequence, every always-valid p-value, every safe sequential test in the modern literature is a special case of the e-process construction, dating back to Wald's 1945 SPRT (likelihood-ratio e-process between two simple hypotheses) and reaching forward to LLM evaluation pipelines and continuous-monitoring clinical trials.

Formal Setup

Definition

Filtration $(\mathcal{F}_t)$

A nondecreasing family of $\sigma$ -algebras $\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \cdots$ on a probability space, with $\mathcal{F}_t$ representing the information available at time $t$ . In practice, $\mathcal{F}_t = \sigma(X_1, \ldots, X_t)$ for an observed sequence.

Definition

Adapted process

A stochastic process $(Y_t)$ is adapted to $(\mathcal{F}_t)$ if $Y_t$ is $\mathcal{F}_t$ -measurable for every $t$ . Equivalently, $Y_t$ is a function of the observations through time $t$ only.

Definition

e-process $(E_{t})$

A nonnegative process $(E_t)_{t \geq 0}$ adapted to the filtration $(\mathcal{F}_t)$ , with $E_0 = 1$ , such that for every stopping time $\tau$ (possibly infinite), $\sup_{P \in H_0} \mathbb{E}_P[E_\tau \cdot \mathbf{1}\{\tau < \infty\}] \leq 1.$ A test that rejects $H_0$ when $E_\tau \geq 1/\alpha$ has Type I error at most $\alpha$ , by Markov's inequality applied to $E_\tau$ .

Definition

Stopping time

A random variable $\tau : \Omega \to \{0, 1, \ldots\} \cup \{\infty\}$ such that the event $\{\tau \leq t\}$ belongs to $\mathcal{F}_t$ for every $t$ . Intuitively, the decision to stop at time $t$ depends only on the information available by time $t$ .

Construction From Conditional e-values

The standard recipe builds an e-process from a sequence of conditional e-values. The constraint is that each new term has conditional expectation at most one given the past.

Theorem

Running Product of Conditional e-values Is a Supermartingale

Statement

Let $(B_t)_{t \geq 1}$ be a sequence of nonnegative random variables adapted to $(\mathcal{F}_t)$ with $\mathbb{E}_{H_0}[B_t \mid \mathcal{F}_{t-1}] \leq 1$ for every $t$ . Define $E_0 = 1$ and $E_t = E_{t-1} \cdot B_t$ for $t \geq 1$ . Then $(E_t)$ is a nonnegative supermartingale under $H_0$ adapted to $(\mathcal{F}_t)$ , and $\mathbb{E}_{H_0}[E_t] \leq 1$ for every $t$ .

Consequently, $(E_t)$ is an e-process, and for every stopping time $\tau$ , $\mathbb{E}_{H_0}[E_\tau \cdot \mathbf{1}\{\tau < \infty\}] \leq 1.$

Intuition

Each $B_t$ is a conditional bet: a payoff function with expected value at most one given everything seen so far. Wealth multiplies across rounds. This conditional expectation bound is the martingale-like constraint that survives compounding. Equality at every step gives a martingale; an inequality gives a supermartingale, which is fine because we only need an upper bound on expected wealth.

Proof Sketch

Compute the conditional expectation of $E_t$ given $\mathcal{F}_{t-1}$ : $\mathbb{E}_{H_0}[E_t \mid \mathcal{F}_{t-1}] = \mathbb{E}_{H_0}[E_{t-1} B_t \mid \mathcal{F}_{t-1}] = E_{t-1} \cdot \mathbb{E}_{H_0}[B_t \mid \mathcal{F}_{t-1}] \leq E_{t-1}.$ The factoring of $E_{t-1}$ out of the conditional expectation uses $\mathcal{F}_{t-1}$ -measurability. The supermartingale property is the displayed inequality. The unconditional bound follows from the tower property: $\mathbb{E}_{H_0}[E_t] = \mathbb{E}_{H_0}[\mathbb{E}_{H_0}[E_t \mid \mathcal{F}_{t-1}]] \leq \mathbb{E}_{H_0}[E_{t-1}] \leq \cdots \leq 1$ .

For the stopping-time statement, the optional stopping theorem (Doob) for nonnegative supermartingales gives $\mathbb{E}_{H_0}[E_\tau \mathbf{1}\{\tau < \infty\}] \leq \mathbb{E}_{H_0}[E_0] = 1$ for any stopping time $\tau$ .
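The theorem can be checked exactly in a small case. The sketch below assumes iid Bernoulli($p_0$) observations under the null and bets of the form $B_t = 1 + \lambda_t (X_t - p_0)$ ; the rule `predictable_lambda` is a hypothetical illustration, not a recommended strategy. It enumerates every binary path of length $t$ and sums probability times wealth.

```python
from itertools import product

def predictable_lambda(history, p0):
    """Hypothetical betting rule that looks only at past observations."""
    if not history:
        return 0.0
    surplus = sum(history) / len(history) - p0
    # Clip so that B_t = 1 + lam * (x - p0) stays nonnegative for x in {0, 1}.
    return max(-0.5 / (1 - p0), min(0.5 / p0, 2.0 * surplus))

def exact_expected_wealth(t, p0):
    """E_{H0}[E_t] for X_i iid Bernoulli(p0), summing over all 2^t paths."""
    total = 0.0
    for path in product([0, 1], repeat=t):
        prob, wealth, history = 1.0, 1.0, []
        for x in path:
            lam = predictable_lambda(history, p0)   # fixed before x is used
            wealth *= 1.0 + lam * (x - p0)          # conditional e-value B_t
            prob *= p0 if x == 1 else 1.0 - p0
            history.append(x)
        total += prob * wealth
    return total

print(exact_expected_wealth(8, 0.3))  # close to 1 up to rounding
```

Because each bet has conditional mean exactly one here, the product is a martingale and the expected wealth stays at one for every horizon.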

Why It Matters

This construction is the engine of modern sequential testing. Wald's SPRT is the special case where $B_t = p_1(X_t)/p_0(X_t)$ (the likelihood ratio of one observation), and the running product is the joint likelihood ratio. Confidence sequences are obtained by inverting a family of such e-processes, one for each candidate parameter value. The sequential testing pipelines at large-scale A/B platforms (Optimizely, Statsig) reduce to this theorem applied to bounded-mean estimators.

Failure Mode

The bound requires the conditional expectation property under the null. If the analyst chooses the form of $B_t$ based on data not in $\mathcal{F}_{t-1}$ (e.g. by looking ahead, or by retrospectively reordering observations), the construction fails. The requirement that the betting rule be fixed from $\mathcal{F}_{t-1}$ alone (predictability) is non-negotiable. Reusing the same observations across multiple $B_t$ definitions also breaks the conditional expectation bound, even though the resulting process is still adapted.
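A one-step calculation makes the look-ahead failure concrete. This is a minimal sketch assuming a Bernoulli($p_0$) null and bets $B_t = 1 + \lambda (X_t - p_0)$ ; the two rules `honest` and `peeking` are hypothetical names introduced only for illustration.

```python
def one_step_expectation(choose_lam, p0):
    """E_{H0}[B_t] for B_t = 1 + lam * (x - p0) with x ~ Bernoulli(p0)."""
    total = 0.0
    for x, prob in ((1, p0), (0, 1.0 - p0)):
        lam = choose_lam(x)
        total += prob * (1.0 + lam * (x - p0))
    return total

def honest(x):
    """Stake fixed before the current observation is seen."""
    return 0.5

def peeking(x):
    """Stake whose sign depends on the current observation (look-ahead)."""
    return 0.5 if x == 1 else -0.5

print(one_step_expectation(honest, 0.3))   # conditional mean stays at 1
print(one_step_expectation(peeking, 0.3))  # conditional mean exceeds 1
```

With the peeking rule every outcome is a winning bet, so the expected factor is $1 + p_0(1 - p_0)$ and the product is no longer a supermartingale.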

Optional Proof: Ville's inequality for nonnegative supermartingales

The single-shot Markov inequality for an e-value generalizes to a maximal inequality for the entire e-process trajectory.

Theorem

Ville's Inequality

Statement

Let $(E_t)_{t \geq 0}$ be a nonnegative supermartingale under $P$ with $E_0 \leq 1$ . Then for every $c > 0$ , $P\!\left[\sup_{t \geq 0} E_t \geq c\right] \leq \frac{1}{c}.$ Setting $c = 1/\alpha$ gives the anytime-valid Type I error guarantee: under the null, the probability that the e-process ever exceeds $1/\alpha$ is at most $\alpha$ .

Proof Sketch

Define $\tau = \inf\{t \geq 0 : E_t \geq c\}$ , a stopping time, so that $\{\tau < \infty\}$ is exactly the event that $E_t \geq c$ for some $t$ , and $E_\tau \geq c$ on that event. By the optional stopping theorem, $\mathbb{E}[E_\tau \mathbf{1}\{\tau < \infty\}] \leq \mathbb{E}[E_0] \leq 1.$ Since $E_\tau \mathbf{1}\{\tau < \infty\} \geq c \, \mathbf{1}\{\tau < \infty\}$ , taking expectations gives $c \cdot P(\tau < \infty) \leq \mathbb{E}[E_\tau \mathbf{1}\{\tau < \infty\}] \leq 1,$ so $P(\exists t : E_t \geq c) = P(\tau < \infty) \leq 1/c$ . The supremum version follows by applying this bound at level $c - \varepsilon$ and letting $\varepsilon \downarrow 0$ .

Why It Matters

The supremum is over all times. Ville's inequality says optional stopping at any time, including the time that maximizes the e-process value, cannot inflate Type I error beyond $\alpha$ . This is the technical content of "anytime-valid."
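The anytime guarantee can be verified exactly for a small discrete model. The sketch below assumes a Bernoulli likelihood-ratio e-process ( $p_0$ under the null, $p_1$ as the betting alternative) and a finite horizon; it tracks the probability mass of paths that have not yet crossed $1/\alpha$ by dynamic programming over the number of successes.

```python
import math

def crossing_probability(p0, p1, alpha, horizon):
    """P_{H0}(E_t >= 1/alpha for some t <= horizon), computed exactly."""
    up = math.log(p1 / p0)                 # log-increment when x = 1
    down = math.log((1 - p1) / (1 - p0))   # log-increment when x = 0
    threshold = math.log(1 / alpha)
    alive = {0: 1.0}   # successes so far -> probability of un-stopped paths
    crossed = 0.0
    for t in range(1, horizon + 1):
        nxt = {}
        for k, prob in alive.items():
            for dk, q in ((1, p0), (0, 1 - p0)):   # outcomes drawn under H0
                k2 = k + dk
                log_e = k2 * up + (t - k2) * down
                if log_e >= threshold:
                    crossed += prob * q            # first crossing at time t
                else:
                    nxt[k2] = nxt.get(k2, 0.0) + prob * q
        alive = nxt
    return crossed

print(crossing_probability(0.5, 0.6, 0.05, 500))
```

Checking after every single observation is the most aggressive peeking schedule possible, and Ville's inequality guarantees the returned probability stays at or below $\alpha$ for any horizon.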

Failure Mode

Ville's inequality requires the uniform supermartingale property: $\mathbb{E}[E_t \mid \mathcal{F}_{t-1}] \leq E_{t-1}$ for every $t$ , not just $\mathbb{E}[E_t] \leq 1$ unconditionally. Mixing e-values from different filtrations or recomputing the conditional expectation only at certain times breaks the bound.
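A simple counterexample shows why marginal e-values are not enough. Assume, purely for illustration, that at each time an independent $U_t \sim \mathrm{Uniform}(0,1)$ is drawn and $E_t = (1/\alpha)\,\mathbf{1}\{U_t \leq \alpha\}$ . Every $E_t$ has mean one, but the sequence is not a supermartingale, and the chance that some $E_t$ reaches $1/\alpha$ grows with the horizon.

```python
def e_value_mean(alpha):
    """Mean of E_t = (1/alpha) * 1{U_t <= alpha} for U_t ~ Uniform(0, 1)."""
    return (1.0 / alpha) * alpha

def prob_any_exceeds(alpha, horizon):
    """P(max_{t <= horizon} E_t >= 1/alpha) when the U_t are independent."""
    return 1.0 - (1.0 - alpha) ** horizon

for horizon in (1, 10, 100):
    print(horizon, prob_any_exceeds(0.05, horizon))
```

The false-rejection probability equals $\alpha$ only at a single look and approaches one as the number of looks grows, which is exactly the peeking problem the supermartingale property rules out.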

Canonical Example: Wald's SPRT

The sequential probability ratio test (Wald 1945) is the prototype e-process. Test $H_0: P_0$ against $H_1: P_1$ , both simple, observe $X_1, X_2, \ldots$ iid. Define $E_t = \prod_{i=1}^t \frac{p_1(X_i)}{p_0(X_i)}.$ Each likelihood ratio is a conditional (in fact unconditional, by independence) e-value: $\mathbb{E}_{P_0}[p_1(X_i)/p_0(X_i)] = 1$ . The running product is a nonnegative supermartingale under $H_0$ (actually a martingale, since the inequality is an equality).

Wald's original test stops the first time $E_t \geq A$ (reject $H_0$ ) or $E_t \leq B$ (accept $H_0$ ). The conservative choice $A = 1/\alpha$ and $B = \beta$ guarantees Type I and Type II error rates of at most $\alpha$ and $\beta$ ; Wald's classical approximations are $A \approx (1-\beta)/\alpha$ and $B \approx \beta/(1-\alpha)$ . With $A = 1/\alpha$ , the Type I error guarantee follows from Ville: under $H_0$ , $\Pr(\sup_t E_t \geq A) \leq 1/A = \alpha$ . The expected sample size to reject under $H_1$ scales like $\log(1/\alpha)/\mathrm{KL}(P_1 \| P_0)$ , the Kullback-Leibler rate.
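The two-threshold rule can be sketched for binary data. This minimal example assumes Bernoulli observations with success probability $p_0$ under $H_0$ and $p_1$ under $H_1$ , and uses the thresholds $A = 1/\alpha$ and $B = \beta$ on the log scale; the function name `sprt` and its return format are illustrative choices.

```python
import math

def sprt(observations, p0, p1, alpha=0.05, beta=0.2):
    """Run the SPRT on binary data; returns (decision, stopping time, log E_t)."""
    log_A = math.log(1 / alpha)   # upper threshold: reject H0
    log_B = math.log(beta)        # lower threshold: accept H0
    log_E, t = 0.0, 0
    for t, x in enumerate(observations, start=1):
        # log-likelihood ratio of one Bernoulli observation
        log_E += math.log(p1 / p0) if x == 1 else math.log((1 - p1) / (1 - p0))
        if log_E >= log_A:
            return "reject H0", t, log_E
        if log_E <= log_B:
            return "accept H0", t, log_E
    return "undecided", t, log_E

print(sprt([1] * 20, p0=0.5, p1=0.6))
print(sprt([0] * 20, p0=0.5, p1=0.6))
```

Working with the log of the likelihood ratio avoids overflow and underflow when the product runs over many observations.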

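The same construction extends to composite nulls via universal inference (Wasserman-Ramdas-Balakrishnan 2020): fit any density estimator for $P_1$ on a held-out half of the data, then evaluate the likelihood ratio on the other half, with the maximized null likelihood in the denominator. The split version yields an e-value for any composite null; the sequential version, which fits the numerator on past observations only, yields an e-process.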
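A split likelihood-ratio e-value can be sketched for one concrete composite null. The example assumes $\mathcal{N}(\theta, 1)$ data and $H_0: \theta \leq 0$ , so the null maximum likelihood estimate is the sample mean truncated at zero; the function name and the choice of the sample mean as the alternative estimator are illustrative assumptions.

```python
import math

def split_lrt_e_value(d0, d1):
    """Split likelihood-ratio e-value for H0: theta <= 0 with N(theta, 1) data.

    d1 is used only to choose the alternative; d0 is used to evaluate the ratio.
    """
    theta_alt = sum(d1) / len(d1)               # any estimator fitted on d1
    theta_null = min(sum(d0) / len(d0), 0.0)    # null MLE computed on d0
    log_ratio = sum(
        -0.5 * (x - theta_alt) ** 2 + 0.5 * (x - theta_null) ** 2 for x in d0
    )
    return math.exp(log_ratio)

print(split_lrt_e_value([1.0, 1.0], [1.0, 1.0]))
```

Rejecting when the returned value is at least $1/\alpha$ gives a level- $\alpha$ test by Markov's inequality, because the denominator is never smaller than the likelihood of the true null parameter.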

Worked Exercise

Exercise (Advanced)

Problem

Let $X_1, X_2, \ldots$ be iid $\mathcal{N}(\theta, 1)$ and consider testing $H_0: \theta = 0$ against $H_1: \theta = \mu$ for fixed $\mu > 0$ . Construct the likelihood-ratio e-process explicitly. Compute its value at $t = 100$ when $\bar X_{100} = 0.21$ and $\mu = 0.2$ . At what time does the e-process exceed $1/\alpha = 20$ if the true mean is exactly $\mu = 0.2$ and $\bar X_t \approx 0.2$ for all $t$ ?
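One way to check the numbers: for unit-variance Gaussians the one-observation likelihood ratio is $p_\mu(x)/p_0(x) = \exp(\mu x - \mu^2/2)$ , so $\log E_t = \mu \sum_{i \leq t} X_i - t\mu^2/2 = t \mu (\bar X_t - \mu/2)$ . The sketch below evaluates this formula for the values in the problem; the helper name `log_e_gaussian` is an illustrative choice.

```python
import math

def log_e_gaussian(t, xbar, mu):
    """log E_t = t * mu * (xbar - mu / 2) for the N(0,1) vs N(mu,1) test."""
    return t * mu * (xbar - mu / 2)

# Value at t = 100 with sample mean 0.21 and mu = 0.2: exp(2.2), roughly 9.03
e_100 = math.exp(log_e_gaussian(100, 0.21, 0.2))

# First time the e-process reaches 20 when the sample mean stays at 0.2
first_t = next(
    t for t in range(1, 10_000)
    if log_e_gaussian(t, 0.2, 0.2) >= math.log(20)
)
print(e_100, first_t)
```

With $\bar X_t = \mu = 0.2$ the log-wealth grows by $\mu^2/2 = 0.02$ per observation, so crossing $\log 20 \approx 3.0$ takes about 150 observations, matching the $\log(1/\alpha)/\mathrm{KL}$ rate.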

Practical Example: Hourly A/B Test Monitoring

A standard A/B test compares conversion rates of two website variants over a planned $N$ days of traffic. The product team wants to peek hourly and stop early if the result is convincing. With classical $z$ -test $p$ -values, peeking inflates Type I error. With an e-process:

Define $H_0: \text{conversion rate of variant A} \leq \text{conversion rate of variant B}$ .
For each hour $t$ , the streaming users contribute conditional bets $B_t$ whose form is set from the previously observed conversion-rate differential and the running variance estimate. Empirical-Bernstein constructions (Howard-Ramdas-McAuliffe-Sekhon 2021) give explicit formulas for $B_t$ with bounded outcomes.
The running product $E_t = \prod_{s \leq t} B_s$ is the e-process. Stop the test at the first $t$ where $E_t \geq 20$ (for $\alpha = 0.05$ ) or at the budget $T_{\max}$ , whichever comes first.

Ville's inequality guarantees that under $H_0$ , the probability of falsely rejecting at any time is at most $5\%$ , regardless of how many hourly looks the analyst performed. Confidence sequences (the next topic) provide the matching interval estimates.
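The monitoring loop can be sketched with a simpler bet than the empirical-Bernstein construction cited above. The example assumes one user per arm arrives at each step (a hypothetical pairing), sets $D_t = A_t - B_t \in \{-1, 0, 1\}$ , and bets $1 + \lambda_t D_t$ with a stake $\lambda_t \in [0, 0.5]$ computed from past differences only. Under $H_0$ the conditional mean of $D_t$ is at most zero, so each factor has conditional expectation at most one.

```python
def ab_e_process(pairs, alpha=0.05, max_bet=0.5):
    """Wealth trajectory for H0: rate_A <= rate_B from paired binary outcomes.

    pairs: iterable of (a_t, b_t) with entries in {0, 1}.
    Stops at the first time wealth reaches 1 / alpha.
    """
    wealth, diffs, trajectory = 1.0, [], []
    for a, b in pairs:
        # Predictable stake: running mean of past differences, clipped to [0, max_bet]
        lam = 0.0 if not diffs else min(max_bet, max(0.0, sum(diffs) / len(diffs)))
        d = a - b
        wealth *= 1.0 + lam * d   # factor is at least 1 - max_bet, so wealth stays positive
        trajectory.append(wealth)
        diffs.append(d)
        if wealth >= 1 / alpha:
            break
    return trajectory

path = ab_e_process([(1, 0)] * 20)
print(len(path), path[-1])
```

The stake is zero on the first step because no past data exists yet; any other rule works as long as it is computed before the current pair is observed.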

Connection to Classical Tests

Every classical sequential test maps onto an e-process construction:

Wald's SPRT: likelihood-ratio e-process with two-sided thresholds.
Group-sequential tests (Pocock, O'Brien-Fleming): e-process is implicit; the boundaries are tuned to the fixed schedule of looks. Adaptive peeking outside the schedule still breaks them; the e-process framing handles arbitrary stopping.
Always-valid p-values: defined as $1/\sup_{s \leq t} E_s$ , which inherits Ville's inequality.
Bayes factors: for a simple null, the Bayes factor is a likelihood ratio averaged over a prior on the alternative, which is again an e-value. Computed sequentially on the accumulating data, it gives an e-process.

The modern view (Ramdas, Grünwald, Vovk, Shafer 2023) is that anytime-valid sequential tests can be expressed as e-processes under different choices of betting strategy.
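The always-valid p-value in the list above is a one-line transformation of a realized e-process path. A minimal sketch, assuming the path is given as a list of wealth values $E_1, \ldots, E_t$ (the function name is an illustrative choice):

```python
def always_valid_p_values(e_values):
    """p_t = min(1, 1 / max_{s <= t} E_s) along a realized e-process path."""
    p_values, running_max = [], 0.0
    for e in e_values:
        running_max = max(running_max, e)
        p_values.append(1.0 if running_max <= 1.0 else 1.0 / running_max)
    return p_values

print(always_valid_p_values([1.0, 2.0, 0.5, 4.0]))  # [1.0, 0.5, 0.5, 0.25]
```

The resulting sequence is nonincreasing, so evidence once accumulated is never lost, and rejecting when $p_t \leq \alpha$ is the same rule as rejecting when the e-process has reached $1/\alpha$ .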

Implementation Note

The standard implementation pattern is to maintain the log-e-process for numerical stability:

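import numpy as np

def run_e_process(stream, compute_log_conditional_bet, alpha=0.05):
    log_E, history = 0.0, []                   # log E_0 = 0, i.e. E_0 = 1
    log_threshold = np.log(1 / alpha)
    for t, x_t in enumerate(stream, start=1):
        # the bet's form comes from history only; it is then evaluated at x_t
        log_E += compute_log_conditional_bet(x_t, history)
        if log_E >= log_threshold:
            return "reject H_0", t
        history.append(x_t)
    return "fail to reject H_0", len(history)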

The confseq Python package (Howard et al. 2021) provides ready-made empirical-Bernstein e-processes for bounded outcomes. For Gaussian observations with known variance, the Robbins (1970) normal-mixture e-process is standard; for bounded outcomes such as binary conversions, the betting construction from Waudby-Smith and Ramdas (2024) gives some of the tightest known confidence sequences.
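For unit-variance Gaussian observations and $H_0: \theta = 0$ , a normal mixture over the alternative has a closed form. Assuming a $\mathcal{N}(0, \rho^2)$ mixing distribution on $\theta$ (the parameter name `rho2` is an illustrative choice), integrating $\exp(\theta S_t - t\theta^2/2)$ against the mixing density gives $E_t = (1 + t\rho^2)^{-1/2} \exp\!\big(\rho^2 S_t^2 / (2(1 + t\rho^2))\big)$ with $S_t = \sum_{i \leq t} X_i$ . A sketch on the log scale:

```python
import math

def gaussian_mixture_log_e(s_t, t, rho2=1.0):
    """log E_t for the normal-mixture e-process with N(0, rho2) mixing.

    s_t: running sum of the observations, t: number of observations.
    """
    return rho2 * s_t ** 2 / (2 * (1 + t * rho2)) - 0.5 * math.log(1 + t * rho2)

def reject(s_t, t, alpha=0.05, rho2=1.0):
    """True once the mixture e-process has reached 1 / alpha."""
    return gaussian_mixture_log_e(s_t, t, rho2) >= math.log(1 / alpha)

print(gaussian_mixture_log_e(10.0, 10), reject(10.0, 10))
```

Only the running sum and the count need to be stored, and the choice of `rho2` trades off power at early versus late stopping times without affecting validity.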

Care must be taken with the filtration. The form of each bet (for example the alternative density or the stake size) must be fixed using only $X_1, \ldots, X_{t-1}$ and any external randomness independent of $X_t$ ; $B_t$ itself is then that pre-committed function evaluated at $X_t$ . Using the current observation $X_t$ to choose the bet destroys the conditional expectation bound and breaks the supermartingale property.

References

Canonical:

Ramdas, A., Grünwald, P., Vovk, V., and Shafer, G. (2023). "Game-theoretic statistics and safe anytime-valid inference." Statistical Science 38(4), pp. 576-601. The modern survey, with Section 3 on e-processes and Section 4 on construction methods.
Shafer, G. and Vovk, V. (2019). Game-Theoretic Foundations for Probability and Finance (Wiley). Chapters 3-7 develop the betting/martingale framework that e-processes formalize.
Howard, S. R., Ramdas, A., McAuliffe, J., and Sekhon, J. (2021). "Time-uniform Chernoff bounds via nonnegative supermartingales." Probability Surveys 18, pp. 257-317. Quantitative time-uniform concentration inequalities derived from e-processes; the reference for empirical-Bernstein constructions.

Historical:

Wald, A. (1945). "Sequential tests of statistical hypotheses." Annals of Mathematical Statistics 16(2), pp. 117-186. The SPRT.
Wald, A. (1947). Sequential Analysis (Wiley). Book-length treatment of the SPRT and its operating characteristics.
Ville, J. (1939). Étude critique de la notion de collectif (Gauthier-Villars). The original supermartingale maximal inequality, in a foundations-of-probability context.

Current:

Waudby-Smith, I. and Ramdas, A. (2024). "Estimating means of bounded random variables by betting." Journal of the Royal Statistical Society, Series B 86(1), pp. 1-27. The state-of-the-art betting construction for confidence sequences.
Grünwald, P., de Heide, R., and Koolen, W. (2024). "Safe testing." Journal of the Royal Statistical Society, Series B. e-process construction via reverse information projection for composite nulls.

Next Topics

Anytime-valid inference: the broader framing of inference under continuous monitoring.
Confidence sequences: time-uniform interval estimates derived from e-processes.
Safe testing: the formal framework that builds tests directly on e-processes.
E-values and anytime-valid inference: the umbrella reference with proofs and multiple-testing applications.
Martingale theory: the mathematical foundation underneath the supermartingale construction.

Last reviewed: May 13, 2026
