Statistical Estimation
Student-t Distribution and t-Test
The Student-t distribution as a ratio of a standard Normal and a root-Chi-squared, and the one-sample, two-sample, and paired t-tests it powers: exact null distribution under Normality, Welch correction for unequal variances, and large-sample equivalence to the Wald z-test.
Prerequisites
Why This Matters
The Student-t distribution is the exact sampling distribution of the standardized sample mean from an i.i.d. Normal sample with unknown variance. That single fact powers the most-used parametric test in statistics: the t-test. Four flavors of the t-test all rest on the same exact sampling-distribution result, with different choices of numerator and denominator:
- One-sample t-test. Compares a sample mean to a hypothesized value. The statistic is $T = \dfrac{\bar X - \mu_0}{S/\sqrt{n}}$, exactly $t_{n-1}$ under Normality and the null.
- Two-sample t-test, equal variances. Compares two independent sample means with a pooled variance estimate. The statistic is $T = \dfrac{\bar X - \bar Y}{S_p\sqrt{1/n_1 + 1/n_2}}$, exactly $t_{n_1+n_2-2}$ under common-variance Normality and the null.
- Welch's t-test. Compares two independent sample means without the equal-variance assumption. The statistic is $T = \dfrac{\bar X - \bar Y}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$, approximately $t_{\hat\nu}$ with Welch-Satterthwaite degrees of freedom $\hat\nu$.
- Paired t-test. Reduces a paired comparison to a one-sample t-test on the differences.
All four are exact (or near-exact) under Normality. The large-sample behavior is the Wald z-test: as the degrees of freedom grow, $t_\nu \to N(0,1)$ in distribution, and the t-test merges into the asymptotic normal test that the central limit theorem produces.
The Student-t Distribution
Student-t Distribution
A random variable $T$ has a Student-t distribution with $\nu > 0$ degrees of freedom, written $T \sim t_\nu$, if its density is
\[
f_\nu(t) = \frac{\Gamma\!\big(\tfrac{\nu+1}{2}\big)}{\sqrt{\nu\pi}\,\Gamma\!\big(\tfrac{\nu}{2}\big)} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}, \qquad t \in \mathbb{R}.
\]
The distribution is symmetric about zero. For $\nu > 1$ the mean is zero. For $\nu > 2$ the variance is $\nu/(\nu - 2)$. The MGF is infinite for every nonzero argument; the distribution is heavy-tailed, with density decaying polynomially like $|t|^{-(\nu+1)}$.
The parameter $\nu$ controls tail weight. Small $\nu$ gives very heavy tails (Cauchy at $\nu = 1$, infinite variance for $\nu \le 2$). As $\nu \to \infty$ the Student-t converges to the standard Normal; for $\nu \gtrsim 30$ the two are nearly indistinguishable in the body of the distribution.
Student-t Construction
Student-t as Ratio of Normal and Root Chi-squared
Statement
Let $Z \sim N(0,1)$ and $V \sim \chi^2_\nu$ be independent. Then
\[
T = \frac{Z}{\sqrt{V/\nu}} \sim t_\nu.
\]
Intuition
$Z$ is the source of unit-variance Normal noise. $V/\nu$ is an empirical estimate of unit variance (since $\mathbb{E}[V/\nu] = 1$); it converges to $1$ as $\nu \to \infty$. Dividing $Z$ by a noisy estimate of the scale inflates the tails by a polynomial amount. The noisier the scale estimate (smaller $\nu$), the heavier the tails of $T$.
Proof Sketch
The joint density of $(Z, V)$ factors by independence: $f(z, v) = \phi(z)\, g_\nu(v)$, where $\phi$ is the standard Normal density and $g_\nu$ is the $\chi^2_\nu$ density. Change variables to $(t, v)$ with $t = z/\sqrt{v/\nu}$ and Jacobian $\sqrt{v/\nu}$. After substitution, the joint density of $(t, v)$ is proportional to $v^{(\nu+1)/2 - 1} \exp\!\big(-\tfrac{v}{2}(1 + t^2/\nu)\big)$. Integrating over $v$ uses the Gamma normalizing constant and yields the stated density.
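The construction lends itself to a quick Monte Carlo check (a sketch using NumPy and SciPy; the choice $\nu = 5$, the sample size, and the seed are arbitrary): build the ratio from independent draws and compare its empirical distribution to the exact $t_5$ CDF.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nu = 5
n = 100_000

# T = Z / sqrt(V / nu) from independent N(0,1) and chi-squared_nu draws.
z = rng.standard_normal(n)
v = rng.chisquare(nu, size=n)
t_samples = z / np.sqrt(v / nu)

# Kolmogorov-Smirnov distance to the exact t_nu CDF should be near zero.
ks = stats.kstest(t_samples, stats.t(df=nu).cdf)
print(f"KS distance: {ks.statistic:.4f}")
```

With this many draws the KS distance is on the order of a few thousandths, consistent with the ratio being exactly $t_\nu$-distributed.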
Why It Matters
The standardized sample mean of an i.i.d. Normal sample has numerator $Z = \sqrt{n}(\bar X - \mu)/\sigma$, which is standard Normal, and denominator $S/\sigma = \sqrt{V/(n-1)}$, where $V = (n-1)S^2/\sigma^2$ is $\chi^2_{n-1}$ (i.e., a Chi-squared divided by its degrees of freedom). Independence of $\bar X$ and $S^2$ for Normal samples (see normal distribution) is what makes the standardized statistic exactly $t_{n-1}$ rather than an arbitrary ratio.
Failure Mode
Independence of $Z$ and $V$ is essential. In the t-test the relevant $Z$ is $\sqrt{n}(\bar X - \mu_0)/\sigma$ and the relevant $V$ is $(n-1)S^2/\sigma^2$; their independence is a consequence of Cochran's theorem applied to Normal samples. For non-Normal samples, $\bar X$ and $S^2$ are asymptotically uncorrelated but not independent, so the t-distribution is exact only under Normality. Outside Normality, the test is asymptotic, and its accuracy in moderate samples depends on tail weight and skewness.
One-Sample t-Test
One-Sample t-Test
Statement
To test $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$, use
\[
T = \frac{\bar X - \mu_0}{S/\sqrt{n}}.
\]
Under $H_0$, $T \sim t_{n-1}$ exactly. The two-sided test rejects at level $\alpha$ when $|T| > t_{n-1,\,1-\alpha/2}$, the $1-\alpha/2$ quantile of $t_{n-1}$. The one-sided tests against $\mu > \mu_0$ or $\mu < \mu_0$ use the corresponding tail.
Intuition
$T$ is a standardized sample mean. Under the null, the numerator $\bar X - \mu_0$ has standard error $\sigma/\sqrt{n}$, so $\sqrt{n}(\bar X - \mu_0)/\sigma$ is standard Normal. The denominator divides by the sample estimate $S$ instead of the true $\sigma$. The ratio is Normal divided by the root of a normalized Chi-squared, which is exactly Student-t with $n - 1$ degrees of freedom.
Proof Sketch
Under Normality, $\sqrt{n}(\bar X - \mu_0)/\sigma \sim N(0,1)$ and $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$, with the two independent (see normal distribution). The statistic is
\[
T = \frac{\sqrt{n}(\bar X - \mu_0)/\sigma}{\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}},
\]
which is $t_{n-1}$ by the construction theorem.
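The statement can be verified numerically by computing the statistic by hand and comparing it against SciPy's implementation (a minimal sketch; the simulated data, sample size, and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=50.4, scale=1.2, size=20)  # hypothetical Normal sample
mu0 = 50.0

# Hand-rolled statistic: T = (xbar - mu0) / (S / sqrt(n)), referred to t_{n-1}.
n = len(x)
t_stat = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))
p_val = 2 * stats.t.sf(abs(t_stat), df=n - 1)

# SciPy's one-sample t-test should agree to rounding error.
res = stats.ttest_1samp(x, popmean=mu0)
print(t_stat, p_val)
```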
Why It Matters
The one-sample t-test is the basic parametric test for a sample mean. It is the test you reach for when you want to know whether a sample mean differs from a fixed reference value. The 95% confidence interval $\bar X \pm t_{n-1,\,0.975}\, S/\sqrt{n}$ is the inverted test region. Both the test and the interval are exact under Normality and asymptotically valid (with size $\alpha$ and correct coverage) under any distribution with finite variance, by the central limit theorem combined with the asymptotic equivalence of $t_{n-1}$ and the standard Normal as $n \to \infty$.
Failure Mode
The exact $t_{n-1}$ distribution requires Normal data. With heavy-tailed data, the t-statistic has heavier tails than $t_{n-1}$ predicts, and rejection rates are inflated above the nominal level. With skewed data and small samples, the test is biased in the direction of the longer tail. Permutation tests (see permutation tests) are the distribution-free alternative.
Two-Sample t-Test, Equal Variances
Two-Sample t-Test with Common Variance
Statement
To test $H_0: \mu_X = \mu_Y$, define the pooled variance
\[
S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}
\]
and the statistic
\[
T = \frac{\bar X - \bar Y}{S_p \sqrt{1/n_1 + 1/n_2}}.
\]
Under $H_0$ and common variance, $T \sim t_{n_1 + n_2 - 2}$ exactly.
Intuition
The pooled variance averages the two sample variances, weighted by their degrees of freedom. Under the common-variance assumption, $\mathrm{Var}(\bar X - \bar Y) = \sigma^2(1/n_1 + 1/n_2)$, and the difference of sample means is independent of $S_p^2$. The standardized difference is therefore Normal-over-root-Chi-squared with the pooled $n_1 + n_2 - 2$ degrees of freedom.
Proof Sketch
Under common variance and $H_0$, $\bar X - \bar Y \sim N\!\big(0, \sigma^2(1/n_1 + 1/n_2)\big)$. The two sample variances scaled by $\sigma^2$ are independent Chi-squareds with $n_1 - 1$ and $n_2 - 1$ degrees of freedom. Their df-weighted sum scaled by $\sigma^2$, namely $(n_1 + n_2 - 2)S_p^2/\sigma^2$, is $\chi^2_{n_1+n_2-2}$ by Chi-squared additivity. The numerator of $T$ standardized by $\sigma\sqrt{1/n_1 + 1/n_2}$ is standard Normal; the denominator is $S_p/\sigma$, a root of a normalized Chi-squared. Independence of mean and variance for Normal samples extends to the pooled estimate.
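The pooled construction can be checked against SciPy's equal-variance test (a sketch; the simulated groups, sizes, and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=12)
y = rng.normal(0.5, 1.0, size=15)
n1, n2 = len(x), len(y)

# Pooled variance: df-weighted average of the two sample variances.
sp2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)
t_stat = (x.mean() - y.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
p_val = 2 * stats.t.sf(abs(t_stat), df=n1 + n2 - 2)

# SciPy's pooled (equal_var=True) test should agree to rounding error.
res = stats.ttest_ind(x, y, equal_var=True)
```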
Why It Matters
This is the canonical parametric test for "did treatment A change the mean compared to treatment B" under the simplifying assumption that the two groups share a common variance. The t-distribution it rests on is the one Gosset introduced under the "Student" pseudonym (Biometrika, 1908). The same procedure gives a confidence interval for $\mu_X - \mu_Y$ by inversion.
Failure Mode
The common-variance assumption matters. With unequal variances and unequal sample sizes, the pooled t-test has the wrong size: rejection rates can be much higher or lower than nominal, depending on which group has more variance and more data. The Welch t-test below is the right replacement and is the default in modern software.
Welch's t-Test
Welch t-Test for Unequal Variances
Statement
The Welch statistic is
\[
T = \frac{\bar X - \bar Y}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}
\]
and is approximately $t_{\hat\nu}$ under the null with Welch-Satterthwaite degrees of freedom
\[
\hat\nu = \frac{\big(S_1^2/n_1 + S_2^2/n_2\big)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}.
\]
Intuition
The denominator uses each group's own variance estimate. The price for not pooling is that the squared denominator is not a multiple of a single Chi-squared, so the statistic is not exactly t-distributed. The Welch-Satterthwaite approximation matches the first two moments of the squared denominator to those of a scaled Chi-squared; the resulting $\hat\nu$ is the moment-matched degrees of freedom. The approximation is accurate even at moderate sample sizes when the variance ratio is far from one.
Proof Sketch
The denominator squared is a linear combination of two independent scaled Chi-squareds, since $(n_i - 1)S_i^2/\sigma_i^2 \sim \chi^2_{n_i - 1}$ for $i = 1, 2$. Satterthwaite's approximation matches its first two moments to those of $c\,\chi^2_\nu$ for some constant $c$. The matching gives
\[
\nu = \frac{\big(\sigma_1^2/n_1 + \sigma_2^2/n_2\big)^2}{\dfrac{(\sigma_1^2/n_1)^2}{n_1 - 1} + \dfrac{(\sigma_2^2/n_2)^2}{n_2 - 1}}
\]
and the stated formula for $\hat\nu$ (with sample variances substituted for population variances).
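The moment-matched degrees of freedom can be computed by hand and checked against SciPy's Welch test (a sketch; the group sizes and standard deviations are arbitrary, chosen so the variance ratio is far from one):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(0.0, 3.0, size=10)   # small group, large variance
y = rng.normal(0.0, 1.0, size=40)   # large group, small variance

v1, v2 = x.var(ddof=1) / len(x), y.var(ddof=1) / len(y)
t_stat = (x.mean() - y.mean()) / np.sqrt(v1 + v2)
# Welch-Satterthwaite degrees of freedom: generally a non-integer.
df = (v1 + v2) ** 2 / (v1**2 / (len(x) - 1) + v2**2 / (len(y) - 1))
p_val = 2 * stats.t.sf(abs(t_stat), df=df)

# SciPy's Welch (equal_var=False) test should agree to rounding error.
res = stats.ttest_ind(x, y, equal_var=False)
```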
Why It Matters
Welch's test is the default two-sample t-test in R (t.test(...) without var.equal = TRUE), SciPy (stats.ttest_ind(..., equal_var=False)), and most modern statistical software. Use it unless you have a specific reason to believe the variances are equal. The cost over the equal-variance pooled test is a fractional reduction in degrees of freedom, which is negligible at moderate sample sizes.
Failure Mode
Welch's test still assumes Normal data within each group, although the approximation degrades more gracefully under non-Normality than the exact pooled test. For heavy-tailed or strongly skewed data, prefer a permutation test or a rank-based test (Mann-Whitney). The Welch-Satterthwaite degrees of freedom $\hat\nu$ is generally a non-integer; this is no obstacle, since the t-distribution is defined for any real $\nu > 0$ and software evaluates its CDF directly.
Paired t-Test
Paired t-Test
Statement
To test $H_0: \mu_D = 0$ for pairs $(X_i, Y_i)$ with differences $D_i = X_i - Y_i$, compute
\[
T = \frac{\bar D}{S_D/\sqrt{n}}.
\]
Under $H_0$ and Normality of the differences, $T \sim t_{n-1}$ exactly.
Intuition
A paired sample reduces to a one-sample t-test on the within-pair differences. The pairing eliminates between-subject variability and increases power compared to a two-sample test on the raw values, provided the pairs are genuinely linked (same subject before and after, matched pairs, twins).
Proof Sketch
The differences $D_1, \dots, D_n$ are i.i.d. Normal under the assumption. Apply the one-sample t-test theorem to $D_1, \dots, D_n$ with null mean zero.
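The reduction is exact in software as well: a paired test and a one-sample test on the differences produce identical statistics (a sketch; the simulated before/after data and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
before = rng.normal(200.0, 30.0, size=8)        # hypothetical baselines
after = before - rng.normal(5.0, 8.0, size=8)   # shares the subject-level baseline

d = after - before
res_paired = stats.ttest_rel(after, before)
res_onesamp = stats.ttest_1samp(d, popmean=0.0)  # identical statistic and p-value
```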
Why It Matters
Pre-versus-post designs, twin studies, matched-pair clinical trials, and within-subject crossover trials all use the paired t-test. The power gain over a two-sample test is substantial when the within-pair correlation is high; the test exploits that correlation by subtracting out the shared subject-level baseline. Ignoring pairing and using a two-sample test on the raw values gives a correct level (the test is still valid) but lower power.
Failure Mode
The Normality assumption on the differences is what is required, not Normality of each marginal. Two heavily skewed marginals can have nearly Normal differences if the pairing captures the right shared structure; conversely, two Normal marginals can have non-Normal differences if the pairing is loose. Plot the differences before assuming Normality; if heavy-tailed, use a sign test or Wilcoxon signed-rank test.
When to Use Each Test
| Setting | Test | Reference distribution | Degrees of freedom |
|---|---|---|---|
| One sample, fixed null mean | One-sample t-test | $t_{n-1}$ | $n-1$ |
| Two independent samples, equal variances | Pooled two-sample t-test | $t_{n_1+n_2-2}$ | $n_1+n_2-2$ |
| Two independent samples, unequal variances | Welch t-test | approximately $t_{\hat\nu}$ | Welch-Satterthwaite $\hat\nu$ |
| Paired or matched samples | Paired t-test on differences | $t_{n-1}$ | $n-1$ |
| Two samples, large $n$, no Normality assumption | Wald z-test | $N(0,1)$ | none |
| Two samples, heavy-tailed or small $n$ | Permutation test | empirical | none |
The Wald z-test is the asymptotic version: replace the t-distribution with the standard Normal. For $\nu \ge 30$, the t and z critical values agree to within about 5%; for $\nu \ge 120$ they agree to within 1%. See likelihood-ratio, Wald, and score tests.
Confidence Intervals
The one-sample t-test inverts to a confidence interval:
\[
\bar X \pm t_{n-1,\,1-\alpha/2}\, \frac{S}{\sqrt{n}}.
\]
The two-sample (pooled) and Welch intervals follow the same pattern: take the estimate plus or minus the t-quantile times the standard error. The standard error is $S_p\sqrt{1/n_1 + 1/n_2}$ in the pooled case and $\sqrt{S_1^2/n_1 + S_2^2/n_2}$ in the Welch case. The paired interval uses $\bar D \pm t_{n-1,\,1-\alpha/2}\, S_D/\sqrt{n}$ with $n-1$ degrees of freedom.
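The interval can be built by hand from the t-quantile and checked against SciPy's interval helper (a sketch; the simulated sample and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(10.0, 2.0, size=25)

# 95% interval: estimate +/- t-quantile * standard error.
n = len(x)
se = x.std(ddof=1) / np.sqrt(n)
tq = stats.t.ppf(0.975, df=n - 1)
lo, hi = x.mean() - tq * se, x.mean() + tq * se

# The same interval straight from the t distribution object.
lo2, hi2 = stats.t.interval(0.95, df=n - 1, loc=x.mean(), scale=se)
```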
Common Confusions
The t-distribution is heavy-tailed even at modest degrees of freedom
For $\nu = 10$, the t-distribution has visibly heavier tails than the Normal: the 97.5% quantile is $2.23$ instead of $1.96$. The "t is approximately Normal" intuition holds only for $\nu \gtrsim 30$ or so. With small samples, use the t-quantile, not the Normal quantile, even when the underlying data look Normal.
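The gap between the two quantiles is easy to see directly (a short SciPy sketch; the grid of $\nu$ values is arbitrary):

```python
from scipy import stats

# Two-sided 5% critical values: t quantile vs. the Normal's 1.960.
z975 = stats.norm.ppf(0.975)
for nu in (5, 10, 30, 100):
    t975 = stats.t.ppf(0.975, df=nu)
    print(f"nu={nu:4d}  t={t975:.3f}  relative gap={t975 / z975 - 1:.1%}")
```

The relative gap shrinks from over 30% at $\nu = 5$ to about 4% at $\nu = 30$ and about 1% at $\nu = 100$.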
Welch versus equal-variance: prefer Welch by default
The cost of using Welch when variances are actually equal is small (a slight loss of power, typically less than 1 percentage point). The cost of using the equal-variance test when variances are unequal can be large: actual rejection rates can be 10% or 20% under a nominal 5% test, depending on the variance ratio and sample-size imbalance.
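The size distortion is easy to exhibit by simulation (a sketch; the sample sizes, standard deviations, and replication count are arbitrary choices that make the effect visible). Both groups share the same mean, so the null is true and every rejection is a false positive:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n1, n2 = 10, 40          # unequal sample sizes ...
sd1, sd2 = 3.0, 1.0      # ... and the small group has the large variance
reps = 20_000

# Simulate many null datasets at once and run both tests row-wise.
x = rng.normal(0.0, sd1, size=(reps, n1))
y = rng.normal(0.0, sd2, size=(reps, n2))

rate_pooled = (stats.ttest_ind(x, y, axis=1, equal_var=True).pvalue < 0.05).mean()
rate_welch = (stats.ttest_ind(x, y, axis=1, equal_var=False).pvalue < 0.05).mean()
print(f"pooled size: {rate_pooled:.3f}   Welch size: {rate_welch:.3f}")
```

In this configuration the pooled test rejects far more often than the nominal 5%, while Welch stays close to it.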
A high p-value is not evidence of equal means
Failing to reject means the data are consistent with equal means. It does not mean the means are equal. The confidence interval is the right summary: a 95% interval centered near zero with narrow width indicates "we have ruled out large differences"; a wide interval indicates "we have ruled out very little".
The t-test assumes the population variance is unknown
If the population variance $\sigma^2$ is known (rare in practice, common in textbook problems), the appropriate test is the z-test with $\sigma$ in the denominator, not the t-test with $S$. The z-test rejects when $|Z| > 1.96$ at the 5% level. The t-test adds an extra source of variability (the sample variance), which is why its critical value is larger.
Exercises
Problem
A new manufacturing process produces parts with target length 50 mm. A sample of $n$ parts gives sample mean $\bar x$ mm and sample standard deviation $s$ mm. Test $H_0: \mu = 50$ against $H_1: \mu \neq 50$ at level 0.05.
Problem
Two independent samples have sizes $n_1$ and $n_2$: the first with mean $\bar x$ and sample variance $s_1^2$, the second with mean $\bar y$ and sample variance $s_2^2$. Use Welch's t-test to test $H_0: \mu_X = \mu_Y$ at level 0.05.
Problem
A paired study measures cholesterol level for 8 patients before and after a diet. The differences (after minus before, in mg/dL) are $d_1, \dots, d_8$. Test whether the diet reduces cholesterol at level 0.05.
Problem
Show that as $\nu \to \infty$, the Welch t-test converges to the Wald z-test, and that the difference between the t-quantile and the z-quantile is $O(1/\nu)$ for fixed quantile level.
References
Canonical:
- Casella and Berger, Statistical Inference (2002), Chapter 5 (Section 5.3 on the Student-t sampling distribution), Chapter 8 (Section 8.2 on the t-test).
- Lehmann and Romano, Testing Statistical Hypotheses (2005), Chapter 5 (UMP-unbiased tests in the Normal family, including the t-test as a UMP-invariant test).
- Bickel and Doksum, Mathematical Statistics, Volume I (2015), Chapter 4 (testing in the Normal model).
Historical:
- Student (W. S. Gosset), "The probable error of a mean" (Biometrika, 1908), the original t-distribution paper.
- Welch, "The generalization of Student's problem when several different population variances are involved" (Biometrika, 1947), Welch-Satterthwaite degrees of freedom.
Rank-based and resampling alternatives:
- Wilcoxon, "Individual comparisons by ranking methods" (Biometrics Bulletin, 1945), the rank-based alternative.
- Davison and Hinkley, Bootstrap Methods and Their Application (1997), Chapter 4 (bootstrap and permutation alternatives to the t-test).
Last reviewed: May 11, 2026
Canonical graph
Required before and derived from this topic
These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.
Required prerequisites
- Distributions Atlas (layer 0A · tier 1)
- Normal Distribution (layer 0A · tier 1)
- Central Limit Theorem (layer 0B · tier 1)
- Chi-Squared Distribution and Tests (layer 1 · tier 1)
- Hypothesis Testing for ML (layer 2 · tier 2)
Derived topics
- F-Distribution and ANOVA (layer 1 · tier 1)