What Each Answers
Both results are about the sample mean of i.i.d. random variables, but they answer different questions.
Law of Large Numbers. Does $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ converge to $\mu = \mathbb{E}[X_1]$? The LLN says yes, under finite-mean i.i.d. assumptions. It is a zeroth-order result: it identifies the limit, nothing more.
Central Limit Theorem. What is the limiting shape of the fluctuations $\bar{X}_n - \mu$ as $n \to \infty$? Under an additional finite-variance assumption, the CLT says they are Gaussian after rescaling by $\sqrt{n}$. It is a first-order result: it identifies the limit and the order of fluctuations around it.
Together they form the two-step asymptotic picture: LLN says where the estimator is going, CLT says how fast it gets there and what the remaining randomness looks like.
Side by Side
| Aspect | LLN | CLT |
|---|---|---|
| Statement | $\bar{X}_n \to \mu$ | $\sqrt{n}(\bar{X}_n - \mu) \Rightarrow \mathcal{N}(0, \sigma^2)$ |
| Convergence mode | In probability (WLLN) or almost sure (SLLN) | In distribution |
| Minimal assumption (i.i.d.) | $\mathbb{E}\lvert X_1 \rvert < \infty$ | $\operatorname{Var}(X_1) = \sigma^2 < \infty$ |
| Scaling | None (the sample mean itself) | $\sqrt{n}$ |
| Information returned | The limit | The limit + the fluctuation shape + the rate |
| Confidence-interval use | Justifies consistency | Constructs the interval |
| Fails for | Cauchy, Pareto $\alpha \le 1$ | Pareto $1 < \alpha \le 2$, all $\alpha$-stable laws with $\alpha < 2$ |
| Replacement when assumption fails | None: estimator inconsistent | Generalized CLT: $\alpha$-stable limit |
The LLN-vs-CLT distinction is not "weak result vs. strong result". They require different assumptions and produce different kinds of information. A distribution can satisfy the LLN but fail the CLT (Pareto, $1 < \alpha \le 2$); in that case the sample mean still converges to the right limit, but not at the Gaussian $n^{-1/2}$ rate.
The Practical Question: Which Do I Use?
Most undergraduate statistics blurs the two. The blur is harmless when the underlying distribution has both finite mean and finite variance. For more general work the distinction matters.
Use the LLN when:
- The question is "is my estimator consistent?" or "does the sample mean converge to something?"
- You are constructing a Monte Carlo estimator and want a correctness guarantee.
- You are proving consistency of an estimator like maximum likelihood or empirical risk minimization, where the first-order statement is enough.
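The Monte Carlo bullet can be made concrete. A minimal sketch, where the integrand ($\cos$ of a uniform draw) and the sample sizes are illustrative choices, not from the text:

```python
import math
import random

def mc_estimate(f, sampler, n, seed=0):
    """Plain Monte Carlo: estimate E[f(X)] by the sample mean of n i.i.d.
    draws.  The LLN is the entire correctness guarantee -- a finite mean
    of f(X) suffices for convergence to the true expectation."""
    rng = random.Random(seed)
    return sum(f(sampler(rng)) for _ in range(n)) / n

# Illustrative target: E[cos(U)] for U ~ Uniform(0, 1), which equals sin(1).
true_value = math.sin(1.0)
for n in (100, 10_000, 1_000_000):
    est = mc_estimate(math.cos, lambda rng: rng.random(), n)
    print(n, est, abs(est - true_value))
```

Note that the LLN justifies only the convergence; it says nothing about how fast the printed error shrinks — that is the CLT's job.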
Use the CLT when:
- The question is "how confident am I that my estimate is close to the truth?", and the answer needs a confidence interval.
- You need a finite-$n$ rate of $O(n^{-1/2})$, e.g. for empirical risk fluctuations around population risk.
- You are calibrating a hypothesis test that uses a Gaussian critical value.
- You are reasoning about the asymptotic normality of an MLE or M-estimator.
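The confidence-interval use case above can be sketched as follows, assuming finite variance and using the plug-in standard deviation (a textbook-style sketch, not a prescribed implementation; the synthetic data and seed are illustrative):

```python
import math
import random
import statistics

def clt_confidence_interval(xs, z=1.96):
    """Approximate 95% CI for the mean via the CLT: mean +/- z * s / sqrt(n).
    Valid only under finite variance; the half-width shrinks at the
    O(1/sqrt(n)) rate that the CLT supplies and the LLN does not."""
    n = len(xs)
    m = statistics.fmean(xs)
    s = statistics.stdev(xs)  # plug-in estimate of sigma
    half = z * s / math.sqrt(n)
    return (m - half, m + half)

# Illustrative use on synthetic Gaussian data with true mean 10:
rng = random.Random(1)
xs = [rng.gauss(10.0, 2.0) for _ in range(400)]
lo, hi = clt_confidence_interval(xs)
print(lo, hi)  # brackets 10 about 95% of the time over repeated samples
```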
A Worked Contrast
Bernoulli sampling. Toss a coin with bias $p$ for $n$ trials, observe $\hat{p}_n = \frac{1}{n}\sum_{i=1}^n X_i$.
LLN says: $\hat{p}_n \to p$ almost surely. The fraction of heads settles at the bias.
CLT says: $\sqrt{n}(\hat{p}_n - p) \Rightarrow \mathcal{N}(0,\, p(1-p))$. The fluctuations are Gaussian with explicit variance.
LLN alone tells you nothing about how many tosses you need to distinguish $p = 1/2$ from $p = 1/2 + \varepsilon$. CLT plus the Berry-Esseen bound tells you: the standard error of $\hat{p}_n$ is $\sqrt{p(1-p)/n}$, the Gaussian approximation is good to within $O(1/\sqrt{n})$ in sup-distance, and so a bias of size $\varepsilon$ requires on the order of $\varepsilon^{-2}$ tosses to declare a significant difference from a fair coin at the 5% level.
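The coin numbers can be checked directly: across repetitions, the empirical spread of $\hat{p}_n$ should match the CLT standard error $\sqrt{p(1-p)/n}$. A minimal simulation, with illustrative seed, $n$, and repetition count:

```python
import math
import random
import statistics

def bernoulli_mean(rng, p, n):
    """Fraction of heads in n tosses of a coin with bias p."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(42)
p, n, reps = 0.5, 1_000, 2_000

# LLN: every estimate sits near p.  CLT: across repetitions the estimates
# spread with standard deviation sqrt(p * (1 - p) / n).
estimates = [bernoulli_mean(rng, p, n) for _ in range(reps)]
empirical_se = statistics.stdev(estimates)
clt_se = math.sqrt(p * (1 - p) / n)
print(empirical_se, clt_se)  # should agree to within a few percent
```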
A confidence interval needs the CLT. A consistency proof needs the LLN. They are different jobs.
The Heavy-Tail Wedge
The two thresholds (finite mean for LLN, finite variance for CLT) define three regimes for i.i.d. samples. The middle one is where the two theorems split.
| Distribution tail | LLN | CLT | Practical answer |
|---|---|---|---|
| Finite variance (Gaussian, bounded, Pareto $\alpha > 2$) | Holds | Holds | Both apply: sample mean converges Gaussian-fast. |
| Finite mean but infinite variance (Pareto $1 < \alpha \le 2$) | Holds | Fails | Sample mean still converges to the right value, but at rate $n^{1-1/\alpha}$, not $\sqrt{n}$, and the limit law is $\alpha$-stable, not Gaussian. See LLN-and-CLT-failures-under-heavy-tails. |
| Infinite mean (Cauchy, Pareto $\alpha \le 1$) | Fails | Fails | Sample mean is inconsistent. Use the median or a trimmed mean. |
The middle regime is the one most often overlooked in introductory courses, and it is where many real applications (financial returns, insurance losses, word frequencies) sit.
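The three rows can be checked with a small simulation. The Pareto sampler below uses inverse-CDF draws ($X = U^{-1/\alpha}$ on $[1, \infty)$); the specific $\alpha$ values and sample size are illustrative:

```python
import random
import statistics

def pareto_draw(rng, alpha):
    """Standard Pareto(alpha) on [1, inf) via inverse CDF: X = U^(-1/alpha).
    Mean alpha/(alpha - 1) exists iff alpha > 1; variance exists iff alpha > 2."""
    return rng.random() ** (-1.0 / alpha)

rng = random.Random(7)
n = 200_000
# alpha = 3.0: both theorems hold.  alpha = 1.5: LLN only.  alpha = 0.8: neither.
for alpha in (3.0, 1.5, 0.8):
    mean = statistics.fmean(pareto_draw(rng, alpha) for _ in range(n))
    true_mean = alpha / (alpha - 1) if alpha > 1 else float("inf")
    print(alpha, round(mean, 3), true_mean)
```

For $\alpha = 1.5$ the sample mean does drift toward the true value 3, but with occasional large jumps (the $\alpha$-stable fluctuations); for $\alpha = 0.8$ it never settles at all.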
Common Confusions
"The CLT proves the LLN." No. The CLT requires finite variance; the LLN needs only a finite mean. For an i.i.d. sequence with finite mean but infinite variance, the LLN applies and the CLT does not, so the weak-assumption LLN cannot be derived from the CLT.
"The LLN is just the n = ∞ limit of the CLT." The LLN says the sample mean has a degenerate limit at $\mu$. The CLT says the fluctuations around that limit (after rescaling by $\sqrt{n}$) have a non-degenerate Gaussian shape. These are different objects, not two views of the same thing.
"For large enough $n$, the sample mean is approximately Gaussian." Only if the CLT applies. If the variance is infinite ($\alpha < 2$ in the Pareto case), no $n$ is large enough; the sample mean has a non-Gaussian limit law, and confidence intervals built on the Gaussian approximation are wrong by an asymptotically growing factor.
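The "no $n$ is large enough" point has an extreme, easy-to-check case: the mean of $n$ standard Cauchy draws is itself standard Cauchy for every $n$, so no amount of data concentrates it. A minimal simulation (seed and repetition counts are illustrative):

```python
import math
import random
import statistics

def cauchy_sample_mean(rng, n):
    """Mean of n standard Cauchy draws, via inverse-CDF sampling.
    The mean is itself standard Cauchy for every n: it never concentrates."""
    return statistics.fmean(
        math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)
    )

rng = random.Random(3)
# Interquartile range of the sample mean across 100 repetitions: for a
# concentrating estimator it would shrink like 1/sqrt(n); here it does not.
for n in (10, 1_000, 50_000):
    means = sorted(cauchy_sample_mean(rng, n) for _ in range(100))
    iqr = means[74] - means[24]
    print(n, round(iqr, 2))  # spread stays roughly constant (Cauchy IQR is 2)
```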
Quick Decision Rule
Want a limit? Use LLN. Want a rate or a confidence interval? Use CLT. Heavy tails? Check which regime you are in, then use the appropriate theorem (or its replacement for the failure case).
References
Canonical:
- Durrett, Probability: Theory and Examples (5th ed., 2019), Chapter 2 (LLN) and Chapter 3 (CLT).
- Billingsley, Probability and Measure (3rd ed., 1995), Sections 6 and 27.
Current:
- van der Vaart, Asymptotic Statistics (1998), Chapter 2 (LLN and CLT in the context of statistical estimation).
- Wainwright, High-Dimensional Statistics: A Non-Asymptotic Viewpoint (2019), Chapter 2 (finite-sample concentration as the non-asymptotic counterpart to CLT).