Statistical Estimation

Student-t Distribution and t-Test

The Student-t distribution as a ratio of a standard Normal and a root-Chi-squared, and the one-sample, two-sample, and paired t-tests it powers: exact null distribution under Normality, Welch correction for unequal variances, and large-sample equivalence to the Wald z-test.


Why This Matters

The Student-t distribution is the exact sampling distribution of the standardized sample mean from an i.i.d. Normal sample with unknown variance. That single fact powers the most-used parametric test in statistics: the t-test. Four flavors of the t-test all rest on the same exact sampling-distribution result, with different choices of numerator and denominator:

  1. One-sample t-test. Compares a sample mean to a hypothesized value. The statistic is $(\bar X_n - \mu_0)/(S/\sqrt n)$, exactly $t_{n-1}$ under Normality and the null.
  2. Two-sample t-test, equal variances. Compares two independent sample means with a pooled variance estimate. The statistic is $(\bar X - \bar Y)/(S_p\sqrt{1/n_X + 1/n_Y})$, exactly $t_{n_X + n_Y - 2}$ under common-variance Normality and the null.
  3. Welch's t-test. Compares two independent sample means without the equal-variance assumption. The statistic is approximately $t_\nu$ with Welch-Satterthwaite degrees of freedom $\nu$.
  4. Paired t-test. Reduces a paired comparison to a one-sample t-test on the differences.

All four are exact (or near-exact) under Normality. The large-sample behavior is the Wald z-test: as degrees of freedom grow, $t_\nu\to\mathcal{N}(0,1)$, and the t-test merges into the asymptotic normal test that the central limit theorem produces.

The Student-t Distribution

Definition

Student-t Distribution

A random variable $X$ has a Student-t distribution with $\nu > 0$ degrees of freedom if its density is

$$f_X(x) = \frac{\Gamma((\nu+1)/2)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left(1+\frac{x^2}{\nu}\right)^{-(\nu+1)/2},\qquad x\in\mathbb{R}.$$

The distribution is symmetric about zero. For $\nu > 1$ the mean is zero. For $\nu > 2$ the variance is $\nu/(\nu - 2)$. The MGF is infinite for every nonzero $s$; the distribution is heavy-tailed with polynomial tail decay of order $x^{-(\nu+1)}$.

The parameter $\nu$ controls tail weight. Small $\nu$ gives very heavy tails (Cauchy at $\nu = 1$, infinite variance for $\nu\le 2$). As $\nu\to\infty$ the Student-t converges to the standard Normal; for $\nu > 30$ the two are nearly indistinguishable in the body of the distribution.

Student-t Construction

Theorem

Student-t as Ratio of Normal and Root Chi-squared

Statement

Let $Z\sim\mathcal{N}(0,1)$ and $V\sim\chi^2_\nu$ be independent. Then $$T = \frac{Z}{\sqrt{V/\nu}}\sim t_\nu.$$

Intuition

$Z$ is the source of unit-variance Normal noise. $V/\nu$ is an empirical estimate of unit variance (since $\mathbb{E}[V] = \nu$); it converges to $1$ as $\nu\to\infty$. Dividing $Z$ by a noisy estimate of the scale inflates the tails by a polynomial amount. The noisier the scale estimate (smaller $\nu$), the heavier the tails of $T$.

Proof Sketch

The joint density of $(Z, V)$ factors by independence: $f_{Z,V}(z, v) = \varphi(z)\, f_{\chi^2_\nu}(v)$. Change variables to $(T, V) = (Z/\sqrt{V/\nu}, V)$ with $z = t\sqrt{v/\nu}$ and Jacobian $\sqrt{v/\nu}$. After substitution, the joint density of $(T, V)$ is proportional to $v^{\nu/2 - 1}e^{-v(1+t^2/\nu)/2}\sqrt{v/\nu}$. Integrating over $v > 0$ uses the Gamma normalizing constant and yields the stated density.
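The ratio construction is easy to check by simulation. Below is a minimal standard-library sketch (no SciPy) that draws $Z$ and $V$ directly and compares the empirical moments of $T$ to the formulas above; the degrees of freedom, sample size, and seed are arbitrary choices for illustration:

```python
import math
import random

random.seed(0)
nu = 10            # degrees of freedom; variance should be nu/(nu-2) = 1.25
n_draws = 200_000

def draw_t(nu):
    """One draw of T = Z / sqrt(V/nu), with Z standard Normal and V chi-squared_nu."""
    z = random.gauss(0.0, 1.0)
    v = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(nu))  # chi2_nu as sum of squares
    return z / math.sqrt(v / nu)

samples = [draw_t(nu) for _ in range(n_draws)]
mean = sum(samples) / n_draws
var = sum((x - mean) ** 2 for x in samples) / n_draws

print(round(mean, 3))   # theoretical mean is 0 for nu > 1
print(round(var, 3))    # theoretical variance is nu/(nu-2) = 1.25 for nu = 10
```

The empirical mean and variance land close to $0$ and $\nu/(\nu-2)$, as the moment formulas in the definition predict.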

Why It Matters

For an i.i.d. Normal sample, the standardized sample mean has numerator $\sqrt n(\bar X - \mu)/\sigma$ that is standard Normal, and denominator $S/\sigma$ where $S^2/\sigma^2$ is $\chi^2_{n-1}/(n-1)$ (i.e., a Chi-squared divided by its degrees of freedom). Independence of $\bar X$ and $S^2$ for Normal samples (see normal distribution) is what makes the standardized statistic exactly $t_{n-1}$ rather than an arbitrary ratio.

Failure Mode

Independence of $Z$ and $V$ is essential. In the t-test the relevant $Z$ is $\sqrt n(\bar X - \mu)/\sigma$ and the relevant $V$ is $(n-1)S^2/\sigma^2$; their independence is a consequence of Cochran's theorem applied to Normal samples. For non-Normal samples, $\bar X$ and $S^2$ are asymptotically uncorrelated but not independent, so the t-distribution is exact only under Normality. Outside Normality, the test is asymptotic, and its accuracy in moderate samples depends on tail weight and skewness.

One-Sample t-Test

Theorem

One-Sample t-Test

Statement

To test $H_0: \mu = \mu_0$ against $H_1: \mu\ne\mu_0$, use $$T = \frac{\bar X_n - \mu_0}{S/\sqrt n},\qquad S^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i - \bar X_n)^2.$$ Under $H_0$, $T\sim t_{n-1}$ exactly. The two-sided test rejects at level $\alpha$ when $|T| > t_{n-1,1-\alpha/2}$, the $(1-\alpha/2)$ quantile of $t_{n-1}$. The one-sided tests against $\mu > \mu_0$ or $\mu < \mu_0$ use the corresponding tail.

Intuition

$T$ is a standardized sample mean. Under the null, the numerator has standard error $\sigma/\sqrt n$, so $\sqrt n(\bar X - \mu_0)/\sigma$ is standard Normal. The denominator divides by the sample estimate $S$ instead of the true $\sigma$. The ratio is Normal divided by the root of a normalized Chi-squared, which is exactly Student-t with $n - 1$ degrees of freedom.

Proof Sketch

Under Normality, $\sqrt n(\bar X - \mu)/\sigma\sim\mathcal{N}(0,1)$ and $(n-1)S^2/\sigma^2\sim\chi^2_{n-1}$, with the two independent (see normal distribution). The statistic is $$T = \frac{\sqrt n(\bar X - \mu_0)/\sigma}{\sqrt{(n-1)S^2/(\sigma^2(n-1))}} = \frac{Z}{\sqrt{V/(n-1)}},$$ which is $t_{n-1}$ by the construction theorem.

Why It Matters

The one-sample t-test is the basic parametric test for a sample mean: it is the test you reach for when you want to know whether a sample mean differs from a fixed reference value. The 95% confidence interval $\bar X_n\pm t_{n-1, 0.975}\cdot S/\sqrt n$ is the inverted test region. Both the test and the interval are exact under Normality and asymptotically valid (with size $\to\alpha$ and correct coverage) under any distribution with finite variance, by the central limit theorem combined with the convergence of $t_{n-1}$ to the standard Normal as $n\to\infty$.
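As a worked illustration of the one-sample recipe, here is a standard-library sketch with made-up data and a hard-coded t-table critical value ($t_{9,\,0.975} \approx 2.262$):

```python
import math
import statistics

# Hypothetical measurements, tested against a reference mean of 250.
data = [244, 256, 251, 248, 262, 247, 255, 250, 246, 259]
mu0 = 250.0
n = len(data)

xbar = statistics.mean(data)                 # 251.8
s = statistics.stdev(data)                   # sample SD, n-1 divisor
t_stat = (xbar - mu0) / (s / math.sqrt(n))   # approx 0.955

# Two-sided level-0.05 test with n-1 = 9 df: critical value t_{9, 0.975} = 2.262.
reject = abs(t_stat) > 2.262
print(round(t_stat, 3), reject)              # |T| < 2.262: fail to reject
```

Here $|T| \approx 0.96$ falls well inside the acceptance region, so the data are consistent with the reference mean.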

Failure Mode

The exact $t_{n-1}$ distribution requires Normal data. With heavy-tailed data, the t-statistic has heavier tails than $t_{n-1}$ predicts, and rejection rates are inflated above the nominal level. With skewed data and small samples, the test is biased in the direction of the longer tail. Permutation tests (see permutation tests) are the distribution-free alternative.

Two-Sample t-Test, Equal Variances

Theorem

Two-Sample t-Test with Common Variance

Statement

To test $H_0: \mu_X = \mu_Y$, define the pooled variance $$S_p^2 = \frac{(n_X - 1)S_X^2 + (n_Y - 1)S_Y^2}{n_X + n_Y - 2}$$ and the statistic $$T = \frac{\bar X - \bar Y}{S_p\sqrt{1/n_X + 1/n_Y}}.$$ Under $H_0$ and common variance, $T\sim t_{n_X + n_Y - 2}$ exactly.

Intuition

The pooled variance $S_p^2$ averages the two sample variances, weighted by their degrees of freedom. Under the common-variance assumption, $(n_X + n_Y - 2)S_p^2/\sigma^2\sim\chi^2_{n_X+n_Y-2}$, and the difference of sample means $\bar X - \bar Y$ is independent of $S_p^2$. The standardized difference is therefore Normal-over-root-Chi-squared with the pooled degrees of freedom.

Proof Sketch

Under common variance, $\bar X - \bar Y\sim\mathcal{N}(\mu_X - \mu_Y,\ \sigma^2(1/n_X + 1/n_Y))$. The two sample variances $S_X^2, S_Y^2$ scaled by $\sigma^2$ are independent Chi-squareds with $n_X - 1$ and $n_Y - 1$ degrees of freedom. Their sum scaled by $\sigma^2$ is $\chi^2_{n_X + n_Y - 2}$ by Chi-squared additivity. The numerator of $T$ standardized by $\sigma\sqrt{1/n_X + 1/n_Y}$ is standard Normal; the denominator is $S_p/\sigma$, the root of a normalized Chi-squared. Independence of mean and variance for Normal samples extends to the pooled estimate.

Why It Matters

This is the canonical parametric test for "did treatment A change the mean compared to treatment B" under the simplifying assumption that the two groups share a common variance. It rests on the distribution Gosset introduced under the "Student" pseudonym (Biometrika, 1908). The same procedure gives a confidence interval for $\mu_X - \mu_Y$ by inversion.
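The pooled computation is a few lines with the standard library. The group summaries below are hypothetical, and the critical value is a standard t-table entry ($t_{28,\,0.975} \approx 2.048$):

```python
import math

# Hypothetical group summaries (size, mean, sample SD), common variance assumed.
n_x, xbar, s_x = 14, 5.40, 1.10
n_y, ybar, s_y = 16, 4.85, 1.25

# Pooled variance: degrees-of-freedom-weighted average of the two sample variances.
sp2 = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)
t_stat = (xbar - ybar) / math.sqrt(sp2 * (1 / n_x + 1 / n_y))
df = n_x + n_y - 2

print(round(t_stat, 3), df)   # compare |T| to t_{28, 0.975} = 2.048
```

With these numbers $T \approx 1.27$ on 28 degrees of freedom, short of the two-sided 5% critical value.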

Failure Mode

The common-variance assumption matters. With unequal variances and unequal sample sizes, the pooled t-test has the wrong size: rejection rates can be much higher or lower than nominal, depending on which group has more variance and more data. The Welch t-test below is the right replacement and is the default in modern software.

Welch's t-Test

Theorem

Welch t-Test for Unequal Variances

Statement

The Welch statistic is $$T_W = \frac{\bar X - \bar Y}{\sqrt{S_X^2/n_X + S_Y^2/n_Y}},$$ and is approximately $t_\nu$ under the null with Welch-Satterthwaite degrees of freedom $$\nu = \frac{(S_X^2/n_X + S_Y^2/n_Y)^2}{(S_X^2/n_X)^2/(n_X-1) + (S_Y^2/n_Y)^2/(n_Y-1)}.$$

Intuition

The denominator uses each group's own variance estimate. The price for not pooling is that the squared denominator is not a multiple of a single Chi-squared, so the statistic is not exactly t-distributed. The Welch-Satterthwaite approximation matches the first two moments of the squared denominator to those of a scaled Chi-squared; the resulting $\nu$ is the moment-matched degrees of freedom. The approximation is accurate even at moderate sample sizes when the variance ratio is far from one.

Proof Sketch

The squared denominator $S_X^2/n_X + S_Y^2/n_Y$ is a linear combination of two independent scaled Chi-squareds, $\sigma_X^2/n_X\cdot\chi^2_{n_X-1}/(n_X-1)$ and $\sigma_Y^2/n_Y\cdot\chi^2_{n_Y-1}/(n_Y-1)$. Satterthwaite's approximation matches its first two moments to those of $c\,\chi^2_\nu/\nu$ for some $c, \nu$. The matching gives $c = \sigma_X^2/n_X + \sigma_Y^2/n_Y$ and the stated formula for $\nu$ (with sample variances substituted for population variances).

Why It Matters

Welch's test is the default two-sample t-test in R (t.test(...) without var.equal = TRUE), SciPy (stats.ttest_ind(..., equal_var=False)), and most modern statistical software. Use it unless you have a specific reason to believe the variances are equal. The cost over the equal-variance pooled test is a fractional reduction in degrees of freedom, which is negligible at moderate sample sizes.
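The Welch-Satterthwaite formula is easy to evaluate directly. A standard-library sketch with hypothetical summary statistics (chosen so the variances are clearly unequal):

```python
import math

# Hypothetical group summaries (size, mean, sample SD) with unequal variances.
n_x, xbar, s_x = 20, 102.0, 4.0
n_y, ybar, s_y = 12, 98.5, 9.0

vx, vy = s_x**2 / n_x, s_y**2 / n_y           # per-group squared standard errors
t_w = (xbar - ybar) / math.sqrt(vx + vy)      # Welch statistic

# Welch-Satterthwaite moment-matched degrees of freedom (generally non-integer).
nu = (vx + vy) ** 2 / (vx**2 / (n_x - 1) + vy**2 / (n_y - 1))

print(round(t_w, 3), round(nu, 1))
```

Note that $\nu \approx 13.7$ sits far below the pooled value $n_X + n_Y - 2 = 30$: the smaller, noisier group dominates the effective degrees of freedom.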

Failure Mode

Welch's test still assumes Normal data within each group, although the approximation degrades more gracefully under non-Normality than the exact pooled test. For heavy-tailed or strongly skewed data, prefer a permutation test or a rank-based test (Mann-Whitney). The Welch-Satterthwaite degrees of freedom is generally a non-integer; this poses no difficulty, since the t-distribution is defined for any real $\nu > 0$ and software evaluates its CDF directly.

Paired t-Test

Theorem

Paired t-Test

Statement

To test $H_0: \delta = 0$ for pairs $(X_i, Y_i)$ with $D_i = X_i - Y_i$, compute $$T = \frac{\bar D_n}{S_D/\sqrt n},\qquad S_D^2 = \frac{1}{n-1}\sum_{i=1}^n(D_i - \bar D_n)^2.$$ Under $H_0$ and Normality of the differences, $T\sim t_{n-1}$ exactly.

Intuition

A paired sample reduces to a one-sample t-test on the within-pair differences. The pairing eliminates between-subject variability and increases power compared to a two-sample test on the raw values, provided the pairs are genuinely linked (same subject before and after, matched pairs, twins).

Proof Sketch

The differences $D_1,\dots,D_n$ are i.i.d. Normal under the assumption. Apply the one-sample t-test theorem to the $D_i$ with null mean zero.

Why It Matters

Pre-versus-post designs, twin studies, matched-pair clinical trials, and within-subject crossover trials all use the paired t-test. The power gain over a two-sample test is substantial when the within-pair correlation is high; the test exploits that correlation by subtracting out the shared subject-level baseline. Ignoring pairing and using a two-sample test on the raw values gives a correct level (the test is still valid) but lower power.
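Because the paired test is just a one-sample test on the differences, the computation is short. A standard-library sketch with hypothetical within-pair differences and a hard-coded one-sided critical value ($t_{7,\,0.95} \approx 1.895$):

```python
import math
import statistics

# Hypothetical within-pair differences (after minus before) for n = 8 pairs.
diffs = [-1.2, 0.4, -0.8, -1.5, -0.3, -0.9, 0.1, -1.1]
n = len(diffs)

dbar = statistics.mean(diffs)                 # mean difference
s_d = statistics.stdev(diffs)                 # sample SD of differences, n-1 divisor
t_stat = dbar / (s_d / math.sqrt(n))

# One-sided test of "the treatment lowers the mean": reject if T < -t_{7, 0.95}.
reject = t_stat < -1.895
print(round(t_stat, 3), reject)
```

Here $T \approx -2.82$ falls below the one-sided critical value, so the decrease is significant at the 5% level under the Normal-differences assumption.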

Failure Mode

The Normality assumption on the differences is what is required, not Normality of each marginal. Two heavily skewed marginals can have nearly Normal differences if the pairing captures the right shared structure; conversely, two Normal marginals can have non-Normal differences if the pairing is loose. Plot the differences before assuming Normality; if heavy-tailed, use a sign test or Wilcoxon signed-rank test.

When to Use Each Test

| Setting | Test | Reference distribution | Degrees of freedom |
| --- | --- | --- | --- |
| One sample, fixed null mean | One-sample t-test | $t_\nu$ | $n - 1$ |
| Two independent samples, equal variances | Pooled two-sample t-test | $t_\nu$ | $n_X + n_Y - 2$ |
| Two independent samples, unequal variances | Welch t-test | $t_\nu$ approximately | Welch-Satterthwaite |
| Paired or matched samples | Paired t-test on differences | $t_\nu$ | $n - 1$ |
| Two samples, large $n$, no Normality assumption | Wald z-test | $\mathcal{N}(0,1)$ | none |
| Two samples, heavy-tailed or small $n$ | Permutation test | empirical | none |

The Wald z-test is the asymptotic version: replace the t-distribution with the standard Normal. For $n > 30$ or $\nu > 30$, the t and z critical values agree to within about 5%; for $\nu > 100$ they agree to within about 1%. See likelihood-ratio, Wald, and score tests.
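The convergence is easy to quantify with standard t-table values; the quantiles below are hard-coded 97.5% critical values, and $z = 1.960$ is their $\nu\to\infty$ limit:

```python
# Two-sided 5% critical values (97.5% quantiles) from a standard t table.
z = 1.960
t_crit = {5: 2.571, 30: 2.042, 100: 1.984}

for nu, t in sorted(t_crit.items()):
    excess_pct = 100 * (t - z) / z    # how much wider the t critical value is than z
    print(f"nu = {nu:>3}: t = {t:.3f}, excess over z = {excess_pct:.1f}%")
```

The excess drops from roughly 31% at $\nu = 5$ to about 4% at $\nu = 30$ and about 1% at $\nu = 100$, matching the rule of thumb in the text.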

Confidence Intervals

The one-sample t-test inverts to a $(1-\alpha)$ confidence interval: $$\bar X_n \pm t_{n-1,1-\alpha/2}\cdot\frac{S}{\sqrt n}.$$ The two-sample (pooled) and Welch intervals follow the same pattern: take the estimate plus or minus the t-quantile times the standard error. The standard error is $S_p\sqrt{1/n_X + 1/n_Y}$ in the pooled case and $\sqrt{S_X^2/n_X + S_Y^2/n_Y}$ in the Welch case. The paired interval uses $S_D/\sqrt n$ with $n - 1$ degrees of freedom.
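A minimal numeric sketch of the one-sample interval, with hypothetical data and the hard-coded quantile $t_{7,\,0.975} \approx 2.365$:

```python
import math
import statistics

# Hypothetical sample of n = 8 observations.
data = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5]
n = len(data)

xbar = statistics.mean(data)                    # 10.175
se = statistics.stdev(data) / math.sqrt(n)      # estimated standard error S/sqrt(n)

t_crit = 2.365                                  # t_{7, 0.975} from a standard t table
lo, hi = xbar - t_crit * se, xbar + t_crit * se
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
```

The interval is the set of null means $\mu_0$ the two-sided level-0.05 t-test would fail to reject on this sample.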

Common Confusions

Watch Out

The t-distribution is heavy-tailed even at modest degrees of freedom

For $\nu = 5$, the t-distribution has visibly heavier tails than the Normal: the 97.5% quantile is $2.57$ instead of $1.96$. The "t is approximately Normal" intuition holds only for $\nu > 30$ or so. With small samples, use the t-quantile, not the Normal quantile, even when the underlying data look Normal.

Watch Out

Welch versus equal-variance: prefer Welch by default

The cost of using Welch when variances are actually equal is small (a slight loss of power, typically less than 1 percentage point). The cost of using the equal-variance test when variances are unequal can be large: actual rejection rates can be 10% or 20% under a nominal 5% test, depending on the variance ratio and sample-size imbalance.

Watch Out

A high p-value is not evidence of equal means

Failing to reject $H_0: \mu_X = \mu_Y$ means the data are consistent with equal means. It does not mean the means are equal. The confidence interval is the right summary: a 95% interval centered near zero with narrow width indicates "we have ruled out large differences"; a wide interval indicates "we have ruled out very little".

Watch Out

The t-test assumes the population variance is unknown

If the population variance $\sigma^2$ is known (rare in practice, common in textbook problems), the appropriate test is the z-test with $\sigma$ in the denominator, not the t-test with $S$. The z-test rejects when $|Z| = \sqrt n\,|\bar X - \mu_0|/\sigma > 1.96$ at the 5% level. The t-test adds an extra source of variability (the sample variance), which is why its critical value is larger.

Exercises

ExerciseCore

Problem

A new manufacturing process produces parts with target length 50 mm. A sample of $n = 25$ parts gives $\bar X = 50.3$ mm and $S = 0.8$ mm. Test $H_0:\mu = 50$ against $H_1:\mu\ne 50$ at level 0.05.

ExerciseCore

Problem

Two independent samples have $n_X = 12$ with $\bar X = 10.2$ and $S_X = 1.5$, and $n_Y = 10$ with $\bar Y = 9.1$ and $S_Y = 2.0$. Use Welch's t-test to test $H_0:\mu_X = \mu_Y$ at level 0.05.

ExerciseAdvanced

Problem

A paired study measures cholesterol level for 8 patients before and after a diet. The differences (after minus before, in mg/dL) are $(-12, -8, -15, 3, -10, -7, -20, -5)$. Test whether the diet reduces cholesterol at level 0.05.

ExerciseAdvanced

Problem

Show that as $\nu\to\infty$, the Welch t-test converges to the Wald z-test, and that the difference between the t-quantile and the z-quantile is $O(1/\nu)$ for a fixed quantile level.

References

Canonical:

  • Casella and Berger, Statistical Inference (2002), Chapter 5 (Section 5.3 on the Student-t sampling distribution), Chapter 8 (Section 8.2 on the t-test).
  • Lehmann and Romano, Testing Statistical Hypotheses (2005), Chapter 5 (UMP-unbiased tests in the Normal family, including the t-test as a UMP-invariant test).
  • Bickel and Doksum, Mathematical Statistics, Volume I (2015), Chapter 4 (testing in the Normal model).

Historical:

  • Student (W. S. Gosset), "The probable error of a mean" (Biometrika, 1908), the original t-distribution paper.
  • Welch, "The generalization of Student's problem when several different population variances are involved" (Biometrika, 1947), Welch-Satterthwaite degrees of freedom.

Rank-based and resampling alternatives:

  • Wilcoxon, "Individual comparisons by ranking methods" (Biometrics Bulletin, 1945), the rank-based alternative.
  • Davison and Hinkley, Bootstrap Methods and Their Application (1997), Chapter 4 (bootstrap and permutation alternatives to the t-test).

Last reviewed: May 11, 2026
