Bootstrap Methods
The nonparametric bootstrap: resample with replacement to approximate sampling distributions, construct confidence intervals, and quantify uncertainty without distributional assumptions.
Why This Matters
Efron's bootstrap (1979) is one of the defining statistical ideas of the late 20th century. Before it, constructing confidence intervals or estimating standard errors required either closed-form formulas (available only for simple estimators) or asymptotic approximations that could be poor in finite samples.
The bootstrap gives you a general-purpose machine: got an estimator and want a confidence interval? Resample. Want a standard error? Resample. Want a bias correction? Resample. It works for medians, correlations, regression coefficients, eigenvalues: virtually any statistic you can compute.
Mental Model
You have one sample of size $n$ from an unknown distribution $F$. You want to know how your estimator $\hat{\theta}$ would vary if you could draw many samples from $F$. But you only have one sample. The bootstrap says: treat your sample as if it were the population, and resample from it. The variability of the resampled estimates approximates the true sampling variability.
This sounds like cheating. It is not. It works because the empirical distribution $\hat{F}_n$ converges to $F$, so resampling from $\hat{F}_n$ approximates sampling from $F$.
Formal Setup
Empirical Distribution Function
Given observations $X_1, \dots, X_n$ drawn i.i.d. from an unknown distribution $F$, the empirical distribution function is:

$$\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{X_i \le x\}$$

This places mass $1/n$ on each observed data point.
Bootstrap Sample
A bootstrap sample $X_1^*, \dots, X_n^*$ is a sample of size $n$ drawn i.i.d. from $\hat{F}_n$. Equivalently, draw $n$ observations with replacement from the original sample $X_1, \dots, X_n$.
Each bootstrap sample will typically contain some original observations multiple times and omit others entirely. On average, a fraction $1 - (1 - 1/n)^n \approx 1 - e^{-1} \approx 63.2\%$ of the original observations appear at least once.
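The 63.2% figure is easy to verify numerically. A minimal sketch, assuming NumPy is available (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Probability that a given observation appears at least once in one
# bootstrap sample: 1 - (1 - 1/n)^n, which tends to 1 - 1/e as n grows.
theory = 1 - (1 - 1 / n) ** n

# Empirical check: average fraction of distinct original indices drawn
# per bootstrap sample of size n.
fracs = [len(np.unique(rng.integers(0, n, size=n))) / n for _ in range(200)]
print(theory, np.mean(fracs))
```

Both numbers land close to 0.632 for moderate $n$.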
Bootstrap Distribution
Let $\hat{\theta} = s(X_1, \dots, X_n)$ be any statistic. The bootstrap distribution of $\hat{\theta}$ is the distribution of $\hat{\theta}^* = s(X_1^*, \dots, X_n^*)$ induced by resampling. In practice, you approximate it by generating $B$ bootstrap samples and computing $\hat{\theta}^{*(1)}, \dots, \hat{\theta}^{*(B)}$.
The Plug-in Principle
The bootstrap is an instance of the plug-in principle: to estimate a functional $\theta = T(F)$ of the unknown distribution, compute $\hat{\theta} = T(\hat{F}_n)$. To estimate the sampling distribution of $\hat{\theta}$, replace $F$ with $\hat{F}_n$ everywhere.
The true sampling variance of $\hat{\theta}$ is:

$$\operatorname{Var}_F(\hat{\theta}) = \mathbb{E}_F\big[(\hat{\theta} - \mathbb{E}_F[\hat{\theta}])^2\big]$$

The bootstrap estimate replaces $F$ with $\hat{F}_n$:

$$\operatorname{Var}_{\hat{F}_n}(\hat{\theta}^*) = \mathbb{E}_{\hat{F}_n}\big[(\hat{\theta}^* - \mathbb{E}_{\hat{F}_n}[\hat{\theta}^*])^2\big]$$

In practice, approximate this with:

$$\widehat{\operatorname{se}}_B^2 = \frac{1}{B-1} \sum_{b=1}^{B} \big(\hat{\theta}^{*(b)} - \bar{\theta}^*\big)^2, \qquad \bar{\theta}^* = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^{*(b)}$$
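The Monte Carlo approximation above takes a few lines of code. A sketch assuming NumPy; `bootstrap_se` is a hypothetical helper, not a library function. For the sample mean, the result can be checked against the analytic formula $s/\sqrt{n}$:

```python
import numpy as np

def bootstrap_se(x, stat, B=2000, seed=0):
    """Monte Carlo approximation of the bootstrap standard error of stat(x)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    reps = np.array([stat(x[rng.integers(0, n, size=n)]) for _ in range(B)])
    return reps.std(ddof=1)

rng = np.random.default_rng(42)
x = rng.normal(size=100)
se_mean = bootstrap_se(x, np.mean)
# For the mean there is an analytic benchmark: s / sqrt(n)
print(se_mean, x.std(ddof=1) / np.sqrt(len(x)))
```

For statistics with no closed-form standard error, the same function works unchanged; only `stat` changes.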
Main Theorems
Bootstrap Consistency
Statement
Under regularity conditions, the bootstrap distribution of $\sqrt{n}(\hat{\theta}^* - \hat{\theta})$ converges to the same limit as $\sqrt{n}(\hat{\theta} - \theta)$. More precisely, if $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{d} N(0, \sigma^2)$, then conditional on the data, with probability 1:

$$\sup_x \Big| P^*\big(\sqrt{n}(\hat{\theta}^* - \hat{\theta}) \le x\big) - P\big(\sqrt{n}(\hat{\theta} - \theta) \le x\big) \Big| \to 0$$

where $P^*$ denotes probability under bootstrap resampling.
Intuition
The bootstrap works because the empirical distribution $\hat{F}_n$ converges to the true distribution $F$ (Glivenko-Cantelli), so resampling from $\hat{F}_n$ mimics sampling from $F$. The bootstrap distribution of the centered, scaled statistic converges to the same Gaussian limit as the original statistic.
Proof Sketch
The key steps are:
- Show that the bootstrap mean satisfies a conditional CLT: $\sqrt{n}(\bar{X}^* - \bar{X}_n) \xrightarrow{d} N(0, \sigma^2)$ in probability.
- Extend to smooth functionals via the functional delta method: if $\hat{\theta} = T(\hat{F}_n)$ for differentiable $T$, then the bootstrap distribution of $\hat{\theta}^*$ inherits consistency.
- The Glivenko-Cantelli theorem ensures $\sup_x |\hat{F}_n(x) - F(x)| \to 0$ almost surely, which drives the convergence of the bootstrap distribution.
Why It Matters
Bootstrap consistency justifies using the bootstrap distribution for inference. It means bootstrap confidence intervals have asymptotically correct coverage, and bootstrap standard errors are asymptotically correct, all without knowing the form of $F$ or deriving analytic formulas.
Failure Mode
The bootstrap fails when the statistic is not sufficiently "smooth" as a functional of $F$. Classic failure: the bootstrap is inconsistent for the maximum of a uniform distribution, because the empirical distribution has atoms but the true distribution is continuous. More generally, non-differentiable functionals and heavy-tailed distributions can cause bootstrap failure.
Bootstrap Confidence Intervals
There are several ways to turn the bootstrap distribution into a confidence interval. They differ in accuracy and computational cost.
Percentile Method
The simplest approach. For a $1 - \alpha$ confidence interval:

$$\big[\hat{\theta}^*_{(\alpha/2)},\ \hat{\theta}^*_{(1-\alpha/2)}\big]$$

where $\hat{\theta}^*_{(q)}$ is the $q$-th quantile of the bootstrap distribution.

This is intuitive but can be inaccurate when the bootstrap distribution is skewed or when $\hat{\theta}$ is biased.
Pivotal (Basic) Bootstrap
Uses the bootstrap to estimate the distribution of $\hat{\theta} - \theta$:

$$\big[2\hat{\theta} - \hat{\theta}^*_{(1-\alpha/2)},\ 2\hat{\theta} - \hat{\theta}^*_{(\alpha/2)}\big]$$
Note the reversal of quantiles. This corrects for bias and is more accurate than the percentile method for skewed distributions.
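The percentile and pivotal constructions can be sketched side by side. A minimal NumPy sketch; `bootstrap_cis` is an illustrative helper, and the skewed toy data is arbitrary. Note the quantile reversal in the pivotal interval:

```python
import numpy as np

def bootstrap_cis(x, stat, alpha=0.05, B=5000, seed=0):
    """Percentile and pivotal (basic) bootstrap CIs for stat(x)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta = stat(x)
    reps = np.array([stat(x[rng.integers(0, n, size=n)]) for _ in range(B)])
    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    percentile = (lo, hi)
    pivotal = (2 * theta - hi, 2 * theta - lo)  # quantiles are reversed
    return percentile, pivotal

rng = np.random.default_rng(1)
x = rng.exponential(size=200)  # skewed data, where the two methods differ
perc, piv = bootstrap_cis(x, np.median)
print(perc, piv)
```

On symmetric bootstrap distributions the two intervals nearly coincide; on skewed ones they diverge, which is exactly the case the pivotal method is designed for.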
BCa (Bias-Corrected and Accelerated)
The gold standard for bootstrap confidence intervals. It adjusts for both bias and skewness:

$$\big[\hat{\theta}^*_{(\alpha_1)},\ \hat{\theta}^*_{(\alpha_2)}\big]$$

where $\alpha_1, \alpha_2$ are adjusted quantile levels that depend on a bias-correction factor $\hat{z}_0$ and an acceleration factor $\hat{a}$. The acceleration is typically estimated via the jackknife. BCa intervals have second-order accuracy: coverage error is $O(n^{-1})$ rather than the $O(n^{-1/2})$ of the percentile method.
Variants
Parametric Bootstrap
Instead of resampling from $\hat{F}_n$, fit a parametric model $F_{\hat{\theta}}$ and resample from it. For example, if you assume data are normal, estimate $\hat{\mu}$ and $\hat{\sigma}^2$ and generate bootstrap samples from $N(\hat{\mu}, \hat{\sigma}^2)$.
Advantage: more efficient when the model is correct. Disadvantage: invalid when the model is wrong.
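A minimal parametric-bootstrap sketch for the normal example above, assuming NumPy (sample size and replication count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=80)

# Fit the assumed normal model to the observed data
mu_hat, sigma_hat = x.mean(), x.std(ddof=1)

# Resample from the fitted model, not from the empirical distribution
B = 2000
reps = np.array([rng.normal(mu_hat, sigma_hat, size=len(x)).mean()
                 for b in range(B)])
se_param = reps.std(ddof=1)
print(se_param, sigma_hat / np.sqrt(len(x)))  # should agree closely here
```

Because the model is correct in this toy example, the parametric bootstrap SE matches the analytic $\hat{\sigma}/\sqrt{n}$; under a misspecified model it would not.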
Wild Bootstrap
For regression with heteroscedastic errors, the standard bootstrap (resampling residuals) fails because it destroys the heteroscedasticity pattern. The wild bootstrap keeps each residual at its original design point and multiplies it by a random variable $v_i$ with mean 0 and variance 1. Common choices: Rademacher ($\pm 1$ with equal probability) or the two-point distribution of Mammen (1993).
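A sketch of the wild bootstrap with Rademacher multipliers for an OLS slope, assuming NumPy (the helper name and the heteroscedastic toy data are illustrative):

```python
import numpy as np

def wild_bootstrap_slope_se(x, y, B=2000, seed=0):
    """Wild bootstrap SE for an OLS slope under heteroscedasticity."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    slopes = np.empty(B)
    for b in range(B):
        v = rng.choice([-1.0, 1.0], size=len(x))  # Rademacher multipliers
        y_star = X @ beta + resid * v             # residuals stay in place
        slopes[b] = np.linalg.lstsq(X, y_star, rcond=None)[0][1]
    return slopes.std(ddof=1)

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, size=200)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 + x, size=200)  # noise grows with x
se_slope = wild_bootstrap_slope_se(x, y)
print(se_slope)
```

The key detail is that each residual is perturbed in place, so the variance pattern across the design points is preserved in every bootstrap replicate.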
Block Bootstrap
For time series data, i.i.d. resampling destroys temporal dependence. The block bootstrap resamples blocks of consecutive observations. Variants include the moving block bootstrap (fixed block length), the stationary bootstrap (random block length with geometric distribution), and the circular block bootstrap.
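A minimal moving-block bootstrap sketch, assuming NumPy (block length and series are arbitrary illustrations):

```python
import numpy as np

def moving_block_bootstrap(x, block_len, seed=0):
    """One moving-block bootstrap replicate of a time series."""
    rng = np.random.default_rng(seed)
    n = len(x)
    n_blocks = int(np.ceil(n / block_len))
    # Sample block start positions uniformly, keep blocks contiguous
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    blocks = [x[s:s + block_len] for s in starts]
    return np.concatenate(blocks)[:n]  # trim back to the original length

rng = np.random.default_rng(3)
# AR(1)-style series with temporal dependence
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + rng.normal()
x_star = moving_block_bootstrap(x, block_len=25)
print(len(x_star))
```

Dependence within each block is preserved exactly; dependence across block boundaries is broken, which is the bias the stationary and circular variants try to reduce.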
Canonical Examples
Bootstrap standard error of the median
The median has no simple formula for its standard error (unlike the mean, where $\operatorname{se}(\bar{X}) = \sigma/\sqrt{n}$). The bootstrap handles it effortlessly:
- From a sample $X_1, \dots, X_n$, compute the sample median $\hat{m}$.
- Generate $B$ bootstrap samples and compute the median of each: $\hat{m}^{*(1)}, \dots, \hat{m}^{*(B)}$.
- The bootstrap standard error is the sample standard deviation of $\hat{m}^{*(1)}, \dots, \hat{m}^{*(B)}$.
For a sample of size $n$ from a standard normal, the true standard error of the median is approximately $\sqrt{\pi/2}/\sqrt{n} \approx 1.253/\sqrt{n}$. The bootstrap estimate will be close.
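The three steps above can be sketched directly, assuming NumPy ($n = 100$ and $B = 4000$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 100, 4000
x = rng.normal(size=n)

# B bootstrap medians, each from a resample of size n with replacement
medians = np.array([np.median(x[rng.integers(0, n, size=n)])
                    for b in range(B)])
se_boot = medians.std(ddof=1)

# Asymptotic benchmark for a standard normal: sqrt(pi/2) / sqrt(n)
se_asymptotic = np.sqrt(np.pi / 2) / np.sqrt(n)
print(se_boot, se_asymptotic)
```

The two numbers agree up to the statistical error of a single sample of size $n$.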
Bootstrap for correlation coefficients
Given paired data $(X_i, Y_i)$ for $i = 1, \dots, n$, the sample correlation $\hat{\rho}$ has a complicated sampling distribution (especially when the true $\rho$ is not zero). The bootstrap gives you the distribution for free: resample pairs with replacement, compute $\hat{\rho}^*$ for each bootstrap sample, and use the resulting distribution for inference.
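A sketch of pair resampling, assuming NumPy (the toy data with shared latent factor is illustrative). The crucial point is that each resample draws whole rows, never $X$ and $Y$ independently:

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 150, 3000

# Correlated toy pairs via a shared latent variable
z = rng.normal(size=n)
x = z + rng.normal(scale=0.5, size=n)
y = z + rng.normal(scale=0.5, size=n)

rho_hat = np.corrcoef(x, y)[0, 1]

# Resample index vectors so each pair (x_i, y_i) stays together
idx = rng.integers(0, n, size=(B, n))
rhos = np.array([np.corrcoef(x[i], y[i])[0, 1] for i in idx])
ci = np.quantile(rhos, [0.025, 0.975])  # 95% percentile interval
print(rho_hat, ci)
```

Resampling $X$ and $Y$ separately would destroy the dependence and drive the bootstrap correlations toward zero.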
Common Confusions
The bootstrap does not create new information
A common misconception is that the bootstrap "generates new data." It does not. It uses the existing data to approximate the sampling distribution. The quality of the approximation depends on how well the empirical distribution $\hat{F}_n$ approximates the true distribution $F$. With very few observations, the bootstrap may be unreliable because $\hat{F}_n$ is a poor approximation to $F$.
More bootstrap samples B does not fix small n
Increasing $B$ (the number of bootstrap replications) reduces Monte Carlo error: the error from approximating the bootstrap distribution with a finite simulation. But it does not reduce the fundamental statistical error from having a small original sample of size $n$. Even with an enormous number of bootstrap samples, if $n = 10$, the bootstrap distribution is built from only 10 distinct values.
The bootstrap can fail
The bootstrap is not universally valid. It fails for:
- Extreme order statistics (e.g., the sample maximum from a bounded distribution). The bootstrap distribution does not converge to the true sampling distribution.
- Heavy-tailed distributions without finite variance: the CLT does not apply, so the bootstrap CLT also fails.
- Non-smooth functionals: if $\theta = T(F)$ is not a smooth functional of $F$, the plug-in principle can break.
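The first failure mode can be demonstrated directly. The bootstrap distribution of the sample maximum places an atom of probability roughly $1 - e^{-1} \approx 0.632$ on the observed maximum, while the true sampling distribution of the maximum is continuous. A sketch, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n, B = 1000, 5000
x = rng.uniform(0, 1, size=n)
x_max = x.max()

# Fraction of bootstrap replicates whose maximum equals the sample maximum.
# Theory: P(max* = max) = 1 - (1 - 1/n)^n -> 1 - 1/e, an atom that the
# true (continuous) sampling distribution of the maximum does not have.
maxes = np.array([x[rng.integers(0, n, size=n)].max() for b in range(B)])
atom = np.mean(maxes == x_max)
print(atom)
```

No amount of resampling removes this atom, which is why the bootstrap distribution of the maximum never converges to the correct limit.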
Summary
- The bootstrap approximates the sampling distribution by resampling with replacement from the observed data
- It works because $\hat{F}_n \to F$ (Glivenko-Cantelli), so resampling from $\hat{F}_n$ mimics sampling from $F$
- Bootstrap confidence intervals: percentile (simplest), pivotal (better for skewed distributions), BCa (gold standard, second-order accurate)
- Parametric bootstrap: resample from fitted model (more efficient if model is correct)
- Wild bootstrap: for heteroscedastic regression; block bootstrap: for time series
- Bootstrap fails for non-smooth functionals, extreme order statistics, and heavy-tailed distributions
Exercises
Problem
You have a sample of $n$ observations $X_1, \dots, X_n$. You want a 95% bootstrap confidence interval for the population median using the percentile method.
Describe the algorithm step by step. Then: if your $B$ bootstrap medians are sorted as $\hat{m}^{*(1)} \le \dots \le \hat{m}^{*(B)}$, which order statistics give you the interval endpoints?
Problem
Explain why the nonparametric bootstrap is inconsistent for the sample maximum when sampling from a $\mathrm{Uniform}(0, \theta)$ distribution. What is the correct rate of convergence for $X_{(n)}$, and why does the bootstrap get it wrong?
References
Canonical:
- Efron, B. "Bootstrap Methods: Another Look at the Jackknife" (1979)
- Efron, B. & Tibshirani, R. An Introduction to the Bootstrap (1993)
Current:
- Davison, A.C. & Hinkley, D.V. Bootstrap Methods and their Application (1997)
- Hall, P. The Bootstrap and Edgeworth Expansion (1992)
Next Topics
The natural next steps from bootstrap methods:
- Bootstrap theory: when and why the bootstrap is consistent, Edgeworth expansions and higher-order accuracy
- Hypothesis testing with the bootstrap: permutation tests, bootstrap p-values, and the connection to resampling-based inference
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Common Probability Distributions (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)
Builds on This
- Bagging (Layer 2)
- Random Forests (Layer 2)