Statistical Estimation
Poisson Limit Theorem and Le Cam's Bound
$\mathrm{Bin}(n, \lambda/n)$ converges to $\mathrm{Pois}(\lambda)$ as $n$ grows. The classical product-of-PMFs proof, then Le Cam's total-variation bound that makes the approximation quantitative. When to use Poisson vs Normal approximation. Disambiguation: Le Cam published multiple famous theorems.
Why This Matters
The Poisson distribution is what you get when many independent things each have a small chance of happening, and you count how many actually do. Rare typos in a long document, photons hitting a detector in a fixed time window, insurance claims in a year, defects on a manufactured wafer, mutations in a stretch of DNA, requests arriving at a web server. All of these have the same mathematical structure: a sum of many Bernoulli trials with small success probabilities.
The Poisson limit theorem makes the connection precise. As the number of trials grows and the per-trial probability shrinks so that the expected count stays finite, the binomial distribution converges to Poisson. The result is older than the modern CLT and predates the formal probability axioms; Poisson published the special case in 1837.
Le Cam's 1960 sharpening gives a quantitative version: the total-variation distance between $\mathrm{Bin}(n, p)$ and $\mathrm{Pois}(np)$ is at most $np^2$. This is sharper than the standard Berry-Esseen $O(1/\sqrt{n})$ rate for the normal approximation in the rare-event regime and explains why practitioners reach for Poisson rather than Normal when $p$ is small.
Quick Version
| Object | Approximation |
|---|---|
| $\mathrm{Bin}(n, \lambda/n)$ with $\lambda$ fixed, $n \to \infty$ | $\mathrm{Pois}(\lambda)$ |
| Finite-$n$ error (Le Cam, 1960) | $d_{\mathrm{TV}}\big(\mathrm{Bin}(n,p), \mathrm{Pois}(np)\big) \le np^2$ |
| Sum of non-identical Bernoullis | $d_{\mathrm{TV}}\big(\mathcal{L}(S_n), \mathrm{Pois}(\sum_i p_i)\big) \le \sum_i p_i^2$ |
| Rule of thumb | $n \ge 20$, $p \le 0.05$ |
The non-identical case is the form most worth remembering, since it covers Bernoulli trials with different success probabilities (different risk classes, different exposure levels). Le Cam's bound is the cleanest finite-sample approximation result in elementary probability.
Statement
Poisson Limit Theorem
Statement
Let $X_n \sim \mathrm{Bin}(n, p_n)$ with $np_n \to \lambda \in (0, \infty)$ as $n \to \infty$. Then $X_n$ converges in distribution to $\mathrm{Pois}(\lambda)$: $$P(X_n = k) \;\longrightarrow\; \frac{e^{-\lambda}\lambda^k}{k!} \quad \text{for every fixed } k \ge 0.$$ The convergence is also in total-variation distance, not only weakly.
Intuition
The Poisson distribution is the law of rare events: many independent chances, each one tiny, with a finite expected count. The binomial PMF pulls toward the Poisson PMF when $p$ is small and $n$ is large, because $(1 - \lambda/n)^n \to e^{-\lambda}$ and $\binom{n}{k}/n^k \to 1/k!$. The Poisson PMF emerges from the product.
Why It Matters
The limit explains why the same Poisson distribution appears in radioactive decay, queue arrivals, mutation counts, and insurance claims. None of these systems involve an integer parameter of trials in any visible sense, yet they all sit at the Poisson endpoint of the binomial family. The Poisson is the universal limit law for sums of rare independent events.
Practically, the result lets you replace $\mathrm{Bin}(n, p)$ with $\mathrm{Pois}(np)$, whose PMF involves no binomial coefficient. For very large $n$ and tiny $p$, the binomial coefficient is computationally inconvenient; the Poisson approximation is one line.
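A sketch of that one-line replacement, with illustrative values $n = 10^6$ and $p = 10^{-5}$ (assumed here for the demonstration, not taken from the text):

```python
import math

n, p = 10**6, 10**-5
lam = n * p   # 10.0

def pois_pmf(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)

# Poisson approximation: P(X = k) ~ e^{-lam} lam^k / k!, no n in the formula
approx = pois_pmf(lam, 10)

# exact binomial term for comparison; math.comb copes with the huge integers
exact = math.comb(n, 10) * p**10 * (1 - p)**(n - 10)

print(f"P(X=10) exact binomial = {exact:.8f}")
print(f"P(X=10) Poisson approx = {approx:.8f}")
print(f"Le Cam guarantee on any event: n*p^2 = {n * p**2:.2e}")
```

The two numbers agree to roughly the Le Cam tolerance $np^2 = 10^{-4}$, and the Poisson side never touches $n$ except through the product $np$.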
Failure Mode
The approximation degrades when $p$ is not small. At $p = 1/2$ the Le Cam bound gives $n/4$, which grows with $n$ rather than shrinking. For moderate $p$ the Normal approximation (De Moivre-Laplace) is the right tool instead. The Poisson limit also fails when the underlying trials are dependent: a sum of dependent Bernoullis with the same marginals can have a different limit law, and quantifying the dependence requires the Stein-Chen method or a coupling argument.
Optional Proof: Classical product-of-PMFs proof
Fix $k$ and let $p_n = \lambda/n$, so $np_n = \lambda$ exactly. The binomial PMF is
$$P(X_n = k) = \binom{n}{k}\left(\frac{\lambda}{n}\right)^{k}\left(1 - \frac{\lambda}{n}\right)^{n-k}.$$
Group the factors:
$$P(X_n = k) = \frac{\lambda^k}{k!} \cdot \underbrace{\frac{n(n-1)\cdots(n-k+1)}{n^k}}_{(A)} \cdot \underbrace{\left(1 - \frac{\lambda}{n}\right)^{n}}_{(B)} \cdot \underbrace{\left(1 - \frac{\lambda}{n}\right)^{-k}}_{(C)}.$$
Factor (A): a product of $k$ terms, each of the form $1 - j/n$ with $j = 0, 1, \dots, k-1$, all converging to $1$. So (A) $\to 1$.
Factor (B): a defining limit of the exponential, $(1 - \lambda/n)^n \to e^{-\lambda}$.
Factor (C): $(1 - \lambda/n)^{-k} \to 1$, since $k$ is fixed and $\lambda/n \to 0$.
Multiplying: $P(X_n = k) \to \dfrac{\lambda^k e^{-\lambda}}{k!}$, which is the Poisson PMF. The same argument with $np_n \to \lambda$ (rather than $np_n = \lambda$ exactly) goes through with negligible adjustments, because $|np_n - \lambda| \to 0$ controls the rate of all three factors.
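The three factors can be watched converging numerically; $\lambda = 3$ and $k = 4$ are illustrative choices:

```python
import math

lam, k = 3.0, 4

for n in (10, 100, 10_000):
    A = math.prod((n - j) / n for j in range(k))     # -> 1
    B = (1 - lam / n) ** n                           # -> e^{-lam}
    C = (1 - lam / n) ** (-k)                        # -> 1
    # reassembled product equals the exact binomial PMF at k
    pmf = (lam**k / math.factorial(k)) * A * B * C
    print(f"n={n:6d}  A={A:.4f}  B={B:.5f}  C={C:.4f}  pmf={pmf:.6f}")

print(f"Poisson limit: {math.exp(-lam) * lam**k / math.factorial(k):.6f}")
```

By $n = 10{,}000$ the reassembled binomial PMF matches the Poisson value to four decimal places.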
Le Cam's Total-Variation Bound
The convergence statement above is qualitative. Le Cam (1960) proved a quantitative form that bounds the approximation error at finite $n$.
Le Cam Total-Variation Bound
Statement
Let $X_1, \dots, X_n$ be independent with $X_i \sim \mathrm{Bern}(p_i)$, and let $S_n = \sum_{i=1}^n X_i$ and $\lambda = \sum_{i=1}^n p_i$. Then $$d_{\mathrm{TV}}\big(\mathcal{L}(S_n), \mathrm{Pois}(\lambda)\big) \le \sum_{i=1}^n p_i^2.$$ In the homogeneous case $p_i = p$ for all $i$, this specializes to $d_{\mathrm{TV}}\big(\mathrm{Bin}(n,p), \mathrm{Pois}(np)\big) \le np^2$.
Intuition
The total-variation distance between two distributions $P$ and $Q$ is $d_{\mathrm{TV}}(P, Q) = \sup_A |P(A) - Q(A)|$, the worst-case difference in probabilities across all events. Le Cam's bound says the binomial and Poisson assign nearly the same probability to every event when the success probabilities are all small, with explicit error proportional to the sum of squared probabilities. Squaring is the right scaling because the first-moment match is exact; the error is driven by second-moment mismatch.
Why It Matters
The bound is non-asymptotic: it holds at finite $n$ and gives an explicit error. This matters for insurance and reliability applications where $n$ is concrete and the success probabilities vary across risk classes. It also matters in theoretical computer science (sums of rare events in randomized algorithms) and in epidemiology (counting cases across heterogeneous populations). The classical limit theorem says "convergence happens"; Le Cam's bound says "and here is how close you already are at this $n$".
Failure Mode
The bound is tight in the rare-event regime $\max_i p_i$ small. When the $p_i$ are not small the bound is useless: at $p = 1/2$, $n = 100$, the bound gives $25$, far above the maximum possible total variation of $1$. The useful regime is roughly $\sum_i p_i^2 \ll 1$, where the bound certifies non-trivial approximation. The bound also requires independence; with dependent Bernoullis the right tool is the Stein-Chen method, which extends the bound by adding a coupling-error term.
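The heterogeneous bound can be verified exactly by convolving Bernoulli PMFs. The three "risk classes" below are hypothetical illustrative numbers, not from the text:

```python
import math

def poisson_binomial_pmf(ps):
    # exact PMF of a sum of independent Bern(p_i), via iterative convolution
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            nxt[k] += q * (1 - p)
            nxt[k + 1] += q * p
        pmf = nxt
    return pmf

def tv_to_poisson(ps):
    lam = sum(ps)
    pmf = poisson_binomial_pmf(ps)
    pois = [math.exp(-lam) * lam**k / math.factorial(k) for k in range(len(pmf))]
    tail = max(0.0, 1 - sum(pois))            # Poisson mass beyond n
    return (sum(abs(a - b) for a, b in zip(pmf, pois)) + tail) / 2

# hypothetical heterogeneous risk classes
ps = [0.01] * 50 + [0.05] * 20 + [0.002] * 100
tv = tv_to_poisson(ps)
bound = sum(p * p for p in ps)                # Le Cam: sum of squared probabilities
print(f"exact TV = {tv:.6f}   Le Cam bound = {bound:.6f}")
assert tv <= bound
```

The exact distance is comfortably below $\sum_i p_i^2 \approx 0.055$, and the bound needed no information beyond the individual $p_i$.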
Optional Proof: Coupling proof of Le Cam's bound
The slickest proof constructs $S_n$ and a Poisson variable $T_n$ on the same probability space so they agree as often as possible.
For each $i$, build a coupling with $X_i \sim \mathrm{Bern}(p_i)$ and $Y_i \sim \mathrm{Pois}(p_i)$ such that $P(X_i \ne Y_i) \le p_i^2$ (the elementary inequality $1 - e^{-p} \le p$ gives the bound after a case-by-case construction; the cleanest version yields exactly $P(X_i \ne Y_i) = p_i(1 - e^{-p_i})$, which simplifies to the per-trial bound $p_i^2$; see Lindvall §1).
Let $S_n = \sum_i X_i$ and $T_n = \sum_i Y_i$. The Poisson family is closed under convolution of independents, so $T_n \sim \mathrm{Pois}(\lambda)$ with $\lambda = \sum_i p_i$. By the coupling inequality: $$d_{\mathrm{TV}}\big(\mathcal{L}(S_n), \mathrm{Pois}(\lambda)\big) \le P(S_n \ne T_n) \le \sum_{i=1}^n P(X_i \ne Y_i) \le \sum_{i=1}^n p_i^2.$$
The first inequality is the standard coupling inequality (the TV distance is the infimum over couplings of $P(X \ne Y)$). The second is a union bound. The third is the per-trial coupling estimate. This is one of the classical applications of the Stein-Chen method in its coupling form; see Barbour, Holst, and Janson (1992) for the full development.
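The per-trial estimate can be checked directly: the total-variation distance between $\mathrm{Bern}(p)$ and $\mathrm{Pois}(p)$, which is what the best (maximal) coupling achieves, equals $p(1 - e^{-p}) \le p^2$. A quick numerical confirmation:

```python
import math

def tv_bern_pois(p, kmax=50):
    # total variation between Bern(p) and Pois(p); Poisson mass summed to kmax
    diff = abs((1 - p) - math.exp(-p))            # disagreement at k = 0
    diff += abs(p - p * math.exp(-p))             # disagreement at k = 1
    diff += sum(math.exp(-p) * p**k / math.factorial(k)
                for k in range(2, kmax))          # Bernoulli has no mass at k >= 2
    return diff / 2

for p in (0.5, 0.1, 0.01):
    tv = tv_bern_pois(p)
    closed_form = p * (1 - math.exp(-p))          # what the maximal coupling achieves
    print(f"p={p}: TV={tv:.6f}  p(1-e^-p)={closed_form:.6f}  p^2={p*p:.6f}")
    assert abs(tv - closed_form) < 1e-9 and closed_form <= p * p
```

Since $1 - e^{-p} \le p$, the closed form is always below $p^2$, and the union bound then sums these per-trial errors.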
Le Cam Theorem Disambiguation
Lucien Le Cam (1924-2000) published several theorems that bear his name, and conflating them is a real source of confusion in the literature. The result on this page is the Poisson approximation theorem (Le Cam, 1960). The other major Le Cam results worth knowing about:
- Le Cam's first lemma and third lemma on contiguity of probability measures (Le Cam, 1960). Used in asymptotic statistics for local asymptotic normality (LAN). Unrelated to Poisson approximation.
- Le Cam's bound on minimax risk (Le Cam, 1973). A general technique for lower-bounding statistical estimation risk via two-point or multi-point reductions. Unrelated to either of the above.
- Le Cam's theorem on quadratic mean differentiability in asymptotic statistics. Used in establishing LAN and efficient-estimator theory.
When a textbook says "Le Cam's theorem" without context, check whether the statement involves total-variation distance (this page), local likelihood ratios (contiguity), or estimation lower bounds (minimax). Three different beasts.
When to Use Poisson vs Normal Approximation
The general rule: use Poisson when $p$ is small and $np$ is moderate (rare events). Use Normal (De Moivre-Laplace) when $np(1-p)$ is large and $p$ is not extreme. Both approximations agree in the intermediate regime where $p$ is small and $np$ is large, since a Poisson with large mean is itself approximately Normal.
| Regime | Poisson error (Le Cam) | Normal error (Berry-Esseen) | Recommended |
|---|---|---|---|
| $p$ small, $np$ moderate | $np^2$ (tiny) | $O\big(1/\sqrt{np(1-p)}\big)$ (large) | Poisson |
| $p$ small, $np$ large | $np^2$ (moderate) | $O\big(1/\sqrt{np(1-p)}\big)$ (moderate) | Normal (with continuity correction) |
| $p$ moderate, $n$ large | $np^2$ (useless) | $O(1/\sqrt{n})$ (small) | Normal |
| $p$ moderate, $n$ small | $np^2$ (useless) | $O(1/\sqrt{n})$ (large) | Exact binomial |
The Le Cam bound gives a finite-$n$ certificate for the Poisson approximation. The Berry-Esseen bound likewise gives a finite-$n$ certificate for the Normal approximation. Comparing the two tells you which approximation is currently better.
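A rough side-by-side of the two certificates, with two caveats: they bound different metrics (total variation vs Kolmogorov distance), and the Berry-Esseen constant below is one published estimate (Shevtsova 2011). The regimes are illustrative:

```python
import math

C_BE = 0.4748   # an assumed Berry-Esseen constant estimate (Shevtsova 2011)

def lecam_bound(n, p):
    # TV-distance certificate for the Poisson approximation
    return n * p * p

def berry_esseen_bound(n, p):
    # Kolmogorov-distance certificate for the Normal approximation
    var = p * (1 - p)
    rho = p * (1 - p) * (p**2 + (1 - p)**2)   # E|X - p|^3 for a Bernoulli
    return C_BE * rho / (var**1.5 * math.sqrt(n))

for n, p in ((1000, 0.001), (1000, 0.5), (100, 0.01)):
    lc, be = lecam_bound(n, p), berry_esseen_bound(n, p)
    better = "Poisson" if lc < be else "Normal"
    print(f"n={n}, p={p}: Le Cam={lc:.4f}, Berry-Esseen={be:.4f} -> {better}")
```

In the rare-event rows the Le Cam certificate is orders of magnitude tighter; at $p = 1/2$ it exceeds $1$ and the Berry-Esseen certificate takes over.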
Common Confusions
Poisson is not Normal with small mean
$\mathrm{Pois}(\lambda)$ is not the same as $\mathcal{N}(\lambda, \lambda)$ even though they share the same mean and variance. Poisson is supported on the non-negative integers and is right-skewed for small $\lambda$. The Poisson skewness is $1/\sqrt{\lambda}$; at $\lambda = 1$ it equals $1$, far from Gaussian. For large $\lambda$ (say $\lambda \gtrsim 20$) the Poisson is close enough to Gaussian that you can use a Normal approximation on a Poisson, but for small $\lambda$ you cannot.
The Poisson process is not the Poisson distribution
Poisson distribution: the count of rare events in a fixed-size window. Poisson process: a random collection of points in time (or space) with the property that counts on disjoint windows are independent Poissons. The distribution is one marginal of the process. Confusing the two leads to errors when independence across windows matters (e.g., reasoning about inter-arrival times).
Le Cam's bound requires independence
The Le Cam bound is for independent Bernoulli summands. For dependent Bernoullis the bound fails: perfectly correlated Bernoullis sum to either $0$ or $n$, never anywhere in between, and no Poisson approximation makes sense. The Stein-Chen method handles weakly dependent variables with an extra term that measures the dependence.
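A sketch of the perfectly correlated failure case: with $n$ identical copies of a single $\mathrm{Bern}(p)$ draw, the sum lives on $\{0, n\}$ and is nowhere near Poisson. The parameters $n = 20$, $p = 0.3$ are illustrative:

```python
import math

# n perfectly correlated Bern(p): S = n with prob p, S = 0 with prob 1 - p
n, p = 20, 0.3
lam = n * p   # the Poisson a naive application of the limit would suggest

def pois_pmf(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)

# d_TV = (1/2) * sum_k |P(S = k) - Pois(lam)(k)|
tv = abs((1 - p) - pois_pmf(lam, 0)) + abs(p - pois_pmf(lam, n))
tv += sum(pois_pmf(lam, k) for k in range(1, 60) if k != n)
tv /= 2
print(f"TV between correlated sum and Pois({lam}) = {tv:.4f}")
```

The distance comes out close to the maximum value $1$: the marginals match a Bernoulli sum exactly, yet the joint dependence destroys the approximation entirely.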
Exercises
Problem
A web server receives $n$ requests per day, each one of which triggers a rare bug with probability $p$ independently. Approximate the probability that the bug is triggered at least 3 times, and bound the error of the approximation using Le Cam's inequality.
Problem
For a sum of independent Bernoullis with probabilities $p_1, \dots, p_n$, compute the Le Cam total-variation upper bound on the distance to the matching Poisson. Compare to the homogeneous bound obtained by replacing every $p_i$ with the average $\bar{p} = \frac{1}{n}\sum_{i=1}^n p_i$.
References
Canonical:
- Feller, An Introduction to Probability Theory and Its Applications, Vol I (3rd ed., 1968), Chapter VI (Poisson limit and the law of small numbers).
- Barbour, Holst, and Janson, Poisson Approximation (1992), Chapter 1 (the Stein-Chen method, modern derivation of Le Cam's bound).
- Le Cam, "An approximation theorem for the Poisson binomial distribution", Pacific Journal of Mathematics 10 (1960), pp. 1181-1197.
Current:
- Blitzstein and Hwang, Introduction to Probability (2nd ed., 2019), Chapter 4 (Poisson distribution and Poisson approximation, applied perspective).
- Lindvall, Lectures on the Coupling Method (1992; Dover ed. 2002), Section 1 (coupling proof of the Le Cam bound).
Next Topics
- De Moivre-Laplace Theorem — the other limit law for the binomial, in the complementary regime ($p$ moderate, $n \to \infty$).
- Central Limit Theorem — the universal sum-of-independent-summands result.
- Characteristic Functions — the standard tool for proving convergence-in-distribution results.
Last reviewed: May 12, 2026
Canonical graph
Required before and derived from this topic
These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.
Required prerequisites
- Common Probability Distributions (layer 0A · tier 1)
- Characteristic Functions (layer 1 · tier 1)
- Moment Generating Functions (layer 0A · tier 2)
Derived topics
No published topic currently declares this as a prerequisite.