Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.

Statistical Estimation

Basu's Theorem

A complete sufficient statistic is independent of every ancillary statistic. This provides the cleanest method for proving independence between statistics without computing joint distributions.

Why This Matters

Proving that two statistics are independent usually requires computing their joint distribution and factoring it. This can be painful. Basu's theorem gives a shortcut: if one statistic is complete sufficient and the other is ancillary, they are independent. No joint distribution computation needed.

The classic application: in a normal sample, the sample mean $\bar{X}$ is independent of the sample variance $S^2$. This fact underpins the derivation of the t-test. Basu's theorem proves it in two lines.

Formal Setup

Definition

Ancillary Statistic

A statistic $A(X_1, \ldots, X_n)$ is ancillary for a parameter $\theta$ if its distribution does not depend on $\theta$. It carries no information about $\theta$ by itself, but may carry information in combination with other statistics.

Definition

Complete Statistic

A statistic $T(X_1, \ldots, X_n)$ is complete if for every measurable function $g$:

$$\mathbb{E}_\theta[g(T)] = 0 \text{ for all } \theta \implies P_\theta(g(T) = 0) = 1 \text{ for all } \theta$$

Completeness means there are no nontrivial unbiased estimators of zero based on $T$. Informally, $T$ contains no "wasted" information.
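As a worked check of the definition, here is a standard textbook illustration (an added example, not taken from the text above). Assume $X_1, \ldots, X_n$ are i.i.d. Bernoulli($p$) with $0 < p < 1$ and let $T = \sum_{i=1}^n X_i$, so that $T$ is Binomial($n, p$). Suppose $\mathbb{E}_p[g(T)] = 0$ for all $p$. Writing $r = p/(1-p)$:

$$\mathbb{E}_p[g(T)] = \sum_{t=0}^{n} g(t) \binom{n}{t} p^t (1-p)^{n-t} = (1-p)^n \sum_{t=0}^{n} g(t) \binom{n}{t} r^t = 0 \quad \text{for all } r \in (0, \infty)$$

A polynomial in $r$ that vanishes on all of $(0, \infty)$ has every coefficient equal to zero, so $g(t) = 0$ for $t = 0, 1, \ldots, n$. Hence $T$ is complete in this model.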

Main Theorems

Theorem

Basu's Theorem

Statement

If $T$ is a complete sufficient statistic for $\theta$ and $A$ is ancillary for $\theta$, then $T$ and $A$ are independent (under every $P_\theta$).

Intuition

Sufficiency means that the conditional distribution of the data given $T$ does not depend on $\theta$. Ancillarity means the marginal distribution of $A$ does not depend on $\theta$. Completeness forces these two facts to combine into independence: the conditional distribution of $A$ given $T$ must equal the marginal distribution of $A$.

Proof Sketch

Let $B$ be any measurable set. Define $g(t) = P(A \in B \mid T = t) - P(A \in B)$. By sufficiency, $P(A \in B \mid T = t)$ does not depend on $\theta$. By ancillarity, $P(A \in B)$ does not depend on $\theta$. So $g(t)$ does not depend on $\theta$, and $\mathbb{E}_\theta[g(T)] = P_\theta(A \in B) - P(A \in B) = 0$ for all $\theta$ (using ancillarity again). By completeness, $g(T) = 0$ a.s., meaning $P(A \in B \mid T) = P(A \in B)$ a.s. This is independence.
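The two expectation steps in the sketch can be written out explicitly. The following is one way to spell them out, using the tower property and an arbitrary measurable set $C$ (a symbol introduced here for illustration):

$$\mathbb{E}_\theta[g(T)] = \mathbb{E}_\theta\big[P(A \in B \mid T)\big] - P(A \in B) = P_\theta(A \in B) - P(A \in B) = 0$$

$$P_\theta(A \in B,\ T \in C) = \mathbb{E}_\theta\big[\mathbf{1}\{T \in C\}\, P(A \in B \mid T)\big] = P(A \in B)\, P_\theta(T \in C)$$

The second line uses $P(A \in B \mid T) = P(A \in B)$ a.s., which is what completeness delivers. Since $B$ and $C$ are arbitrary, the joint probability factors for every pair of events, which is the definition of independence of $A$ and $T$.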

Why It Matters

Without this theorem, proving independence of $\bar{X}$ and $S^2$ in normal sampling requires computing the joint density via a change of variables. With Basu's theorem, you only need three facts: (1) $\bar{X}$ is complete sufficient for $\mu$ when $\sigma^2$ is known, (2) $S^2/\sigma^2$ is ancillary for $\mu$, (3) apply the theorem. This pattern extends to many other settings.
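Fact (2) can be checked directly. As a short sketch, assume the usual definition $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$ and write $X_i = \mu + \sigma Z_i$ with $Z_1, \ldots, Z_n$ i.i.d. $N(0, 1)$ (the $Z_i$ are notation introduced here). Then $\mu$ cancels from every deviation:

$$X_i - \bar{X} = \sigma (Z_i - \bar{Z}), \qquad \frac{S^2}{\sigma^2} = \frac{1}{n-1} \sum_{i=1}^{n} (Z_i - \bar{Z})^2$$

The right-hand side involves only the $Z_i$, whose distribution is free of $\mu$, so $S^2/\sigma^2$ is ancillary for $\mu$. The same cancellation works for any statistic that is unchanged when a constant is added to every observation.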

Failure Mode

If the sufficient statistic is not complete, the conclusion can fail. For example, for a uniform distribution on $[\theta - 1, \theta + 1]$, the pair of extreme order statistics $(X_{(1)}, X_{(n)})$ is sufficient but not complete. The range $X_{(n)} - X_{(1)}$ is ancillary, yet it is a nonconstant function of the sufficient statistic and so cannot be independent of it. In particular, the range is not independent of the midrange $(X_{(1)} + X_{(n)})/2$: a wide range pins the midrange close to $\theta$.
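The dependence between the range and the midrange can be seen in a small simulation. This is a minimal sketch, not a proof: the function names, the sample size `n = 5`, the value `theta = 3.0`, and the choice of summary (the correlation between the range and the absolute midrange error) are all assumptions made for illustration.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def range_vs_midrange_error(theta=3.0, n=5, reps=20000, seed=0):
    """Correlate the range with |midrange - theta| for Uniform[theta-1, theta+1]."""
    rng = random.Random(seed)
    ranges, errors = [], []
    for _ in range(reps):
        xs = [rng.uniform(theta - 1, theta + 1) for _ in range(n)]
        lo, hi = min(xs), max(xs)
        ranges.append(hi - lo)                      # ancillary statistic
        errors.append(abs((lo + hi) / 2 - theta))   # midrange estimation error
    return pearson(ranges, errors)


corr = range_vs_midrange_error()
print(f"corr(range, |midrange - theta|) = {corr:.3f}")
```

Under these assumptions the correlation comes out clearly negative: samples with a wide range have a midrange close to $\theta$. If the two statistics were independent, this correlation would be near zero.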

Canonical Examples

Example

Normal sampling: mean and variance independence

Let $X_1, \ldots, X_n$ be i.i.d. $N(\mu, \sigma^2)$ with $\sigma^2$ known. The sample mean $\bar{X}$ is complete sufficient for $\mu$ (this follows from the normal distribution being an exponential family). The statistic $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2$ is unchanged when a constant is added to every observation, so its distribution depends only on $\sigma^2$, not on $\mu$. So $S^2$ is ancillary for $\mu$. By Basu's theorem, $\bar{X}$ and $S^2$ are independent. Because the argument works for every fixed value of $\sigma^2$, the independence also holds when $\sigma^2$ is unknown. Together with $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$, this is the fact that makes the t-statistic $(\bar{X} - \mu)/(S/\sqrt{n})$ have a t-distribution with $n - 1$ degrees of freedom.
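A quick numerical sanity check of this example is sketched below. It compares the sample correlation between the sample mean and the sample variance for normal data with the same quantity for exponential data, where Basu's argument does not apply. The function names, sample sizes, and parameter values are illustrative assumptions, and a near-zero correlation is only consistent with independence; it does not prove it.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def mean_variance_corr(draw, n=10, reps=20000, seed=0):
    """Correlation between the sample mean and sample variance over many samples."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        xs = [draw(rng) for _ in range(n)]
        m = sum(xs) / n
        means.append(m)
        variances.append(sum((x - m) ** 2 for x in xs) / (n - 1))  # S^2
    return pearson(means, variances)


# Normal data: mean and variance should look uncorrelated.
normal_corr = mean_variance_corr(lambda rng: rng.gauss(2.0, 1.5))
# Exponential data (skewed): mean and variance are positively correlated.
expo_corr = mean_variance_corr(lambda rng: rng.expovariate(1.0))

print(f"normal: {normal_corr:.3f}, exponential: {expo_corr:.3f}")
```

The contrast is the point of the sketch: for the normal sample the correlation should sit near zero, while for the skewed exponential sample large means tend to come with large variances.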

Example

Exponential distribution: sum and normalized ratios

Let $X_1, \ldots, X_n$ be i.i.d. $\text{Exp}(\lambda)$. The sample sum $T = \sum_{i=1}^n X_i$ is complete sufficient for $\lambda$. The vector of ratios $(X_1/T, \ldots, X_n/T)$ is ancillary: its distribution is uniform on the simplex, whatever the value of $\lambda$. By Basu's theorem, $T$ is independent of all the ratios $X_i/T$.
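The two claims in this example, that the ratios are ancillary and that they are independent of the sum, can be probed with a short simulation. This is a sketch under stated assumptions: the function names, `n = 5`, and the two rates `0.5` and `4.0` are chosen for illustration, `rate` is the `lambd` argument of `random.expovariate`, and only the first ratio $X_1/T$ is examined. A uniform distribution on the simplex gives $X_1/T$ a mean of $1/n$.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sum_and_first_ratio(rate, n=5, reps=20000, seed=0):
    """Simulate T = sum of X_i and the ratio X_1 / T for i.i.d. exponential data."""
    rng = random.Random(seed)
    totals, ratios = [], []
    for _ in range(reps):
        xs = [rng.expovariate(rate) for _ in range(n)]
        t = sum(xs)
        totals.append(t)
        ratios.append(xs[0] / t)
    return totals, ratios


# Different seeds so the two runs are not just rescaled copies of each other.
totals_slow, ratios_slow = sum_and_first_ratio(rate=0.5, seed=1)
totals_fast, ratios_fast = sum_and_first_ratio(rate=4.0, seed=2)

mean_ratio_slow = sum(ratios_slow) / len(ratios_slow)
mean_ratio_fast = sum(ratios_fast) / len(ratios_fast)
corr_slow = pearson(totals_slow, ratios_slow)
corr_fast = pearson(totals_fast, ratios_fast)

print(f"mean ratio: {mean_ratio_slow:.3f} vs {mean_ratio_fast:.3f}")
print(f"corr(T, X1/T): {corr_slow:.3f} vs {corr_fast:.3f}")
```

Both mean ratios should be close to $1/n = 0.2$ even though the sums differ greatly between the two rates, and both correlations should be near zero. As before, this is a consistency check rather than a proof of independence.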

Common Confusions

Watch Out

Ancillary does not mean useless

An ancillary statistic carries no information about $\theta$ by itself. But conditionally, given the ancillary, the precision of estimation can change. This is the basis of conditional inference. Basu's theorem says: if you have a complete sufficient statistic, you cannot improve estimation by conditioning on the ancillary.

Watch Out

Completeness is doing the heavy lifting

Sufficiency alone does not imply independence from ancillary statistics. Completeness is the key condition. Think of completeness as saying the sufficient statistic has no redundancy: there is no nonconstant function of $T$ that is itself ancillary.

Summary

  • Complete sufficient + ancillary implies independent
  • The proof uses completeness to upgrade "same expectation" to "equal a.s."
  • The main application is proving independence without computing joint distributions
  • Fails without completeness: sufficiency alone is not enough

Exercises

Exercise (Core)

Problem

Let $X_1, \ldots, X_n \sim N(\mu, 1)$. Identify a complete sufficient statistic and an ancillary statistic. State what Basu's theorem tells you.

Exercise (Advanced)

Problem

Give an example where a sufficient statistic $T$ and an ancillary statistic $A$ are not independent. What condition of Basu's theorem fails?

References

Canonical:

  • Casella & Berger, Statistical Inference, Chapter 6.2
  • Lehmann & Casella, Theory of Point Estimation, Chapter 4


Last reviewed: April 2026
