Statistical Estimation
F-Distribution and ANOVA
The F-distribution as a ratio of two scaled Chi-squareds, and the one-way analysis of variance F-test built on it: between-group versus within-group variance decomposition, exact null distribution under Normality and equal variances, and the link to two-sample t-tests.
Prerequisites
Why This Matters
Analysis of variance, ANOVA, is the standard procedure for comparing the means of three or more groups simultaneously. Running pairwise t-tests for $g$ groups requires $\binom{g}{2}$ comparisons and inflates the family-wise error rate; ANOVA replaces them with a single F-test whose null distribution is exact under Normality and equal variances. The F-statistic compares the variance of the group means (the "between" variance, which is large when the group means differ) to the variance within groups (the "within" variance, which estimates the common population variance under the null).
The F-distribution itself is the ratio of two scaled Chi-squareds. Every variance-ratio test in classical statistics is an F-test in disguise: variance comparison between two samples, ANOVA F-test, regression overall F-test, nested-model F-test. The reference distribution is the same; the choice of numerator and denominator degrees of freedom changes with the test.
The F-test for $g = 2$ groups is equivalent to the squared two-sample t-test: $F = t^2$, with $F_{1,\,N-2}$ identical in distribution to the square of $t_{N-2}$ under the null. ANOVA generalizes this to arbitrary $g$.
The F-Distribution
F-Distribution
A random variable has an F-distribution with $d_1$ numerator and $d_2$ denominator degrees of freedom if it can be written as
$$F = \frac{U/d_1}{V/d_2}$$
with $U \sim \chi^2_{d_1}$ and $V \sim \chi^2_{d_2}$ independent. The density is
$$f(x) = \frac{\Gamma\!\left(\tfrac{d_1+d_2}{2}\right)}{\Gamma\!\left(\tfrac{d_1}{2}\right)\Gamma\!\left(\tfrac{d_2}{2}\right)} \left(\frac{d_1}{d_2}\right)^{d_1/2} \frac{x^{d_1/2-1}}{\left(1 + \tfrac{d_1}{d_2}x\right)^{(d_1+d_2)/2}}, \qquad x > 0.$$
The mean is $\frac{d_2}{d_2-2}$ for $d_2 > 2$, and the variance is $\frac{2d_2^2(d_1+d_2-2)}{d_1(d_2-2)^2(d_2-4)}$ for $d_2 > 4$.
The distribution is right-skewed and supported on the positive reals. Both degrees-of-freedom parameters matter: the numerator controls the spread of the between-variance estimate, and the denominator controls the spread of the within-variance estimate. Larger denominator degrees of freedom gives a more concentrated reference distribution.
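The closed-form moments above can be checked against a library implementation. A minimal sketch using scipy, with arbitrary illustrative degrees of freedom $d_1 = 5$, $d_2 = 10$:

```python
from scipy import stats

# Illustrative degrees of freedom (arbitrary choice for the check).
d1, d2 = 5, 10

mean_formula = d2 / (d2 - 2)  # valid for d2 > 2
var_formula = 2 * d2**2 * (d1 + d2 - 2) / (d1 * (d2 - 2) ** 2 * (d2 - 4))  # d2 > 4

# scipy's F distribution reports the same mean and variance.
mean_scipy, var_scipy = stats.f.stats(d1, d2, moments="mv")
print(mean_formula, float(mean_scipy))  # both print 1.25
print(var_formula, float(var_scipy))
```

Note that the mean depends only on $d_2$, consistent with the formula.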
F as a Ratio of Chi-Squareds
F as a Ratio of Independent Chi-Squareds
Statement
Let $U \sim \chi^2_{d_1}$ and $V \sim \chi^2_{d_2}$ be independent. Then $\frac{U/d_1}{V/d_2} \sim F_{d_1,\,d_2}$. Moreover, $1/F \sim F_{d_2,\,d_1}$, and the $\alpha$ quantile of $F_{d_1,\,d_2}$ equals the reciprocal of the $1-\alpha$ quantile of $F_{d_2,\,d_1}$.
Intuition
Each Chi-squared is normalized by its own degrees of freedom so that $\chi^2_d/d \to 1$ in probability as $d \to \infty$. The ratio approaches one under the null, with spread that shrinks as both degrees of freedom grow. The reciprocal-symmetry result lets you read lower-tail quantiles from upper-tail quantiles of the swapped distribution.
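The reciprocal-symmetry trick can be verified numerically; a quick sketch with scipy (degrees of freedom and level chosen arbitrarily):

```python
from scipy import stats

d1, d2, alpha = 3, 12, 0.05

q_lower = stats.f.ppf(alpha, d1, d2)              # alpha quantile of F(d1, d2)
q_upper_swapped = stats.f.ppf(1 - alpha, d2, d1)  # 1-alpha quantile of F(d2, d1)

print(q_lower, 1 / q_upper_swapped)  # the two values agree
```

This is how printed F-tables, which list only upper-tail quantiles, are used to read lower-tail critical values.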
Proof Sketch
The joint density of $(U, V)$ factors by independence. Change variables to $(F, V)$ with $F = \frac{U/d_1}{V/d_2}$, so that $u = \frac{d_1}{d_2} f v$ and the Jacobian is $\frac{d_1}{d_2} v$. Substitute and integrate out $v$ using the Gamma normalizing constant. The result is the F-density above. The reciprocal claim follows because if $U$ and $V$ are interchanged, $F$ is replaced by $1/F$.
Why It Matters
Every classical F-test has the form (numerator-variance estimate) / (denominator-variance estimate), where both estimates are scaled Chi-squareds when the underlying data are Normal. ANOVA, variance-comparison tests, and regression F-tests all fit this pattern. The asymptotic Chi-squared distribution of the LRT in regular models, multiplied by an appropriate factor, also gives a small-sample F approximation.
Failure Mode
The construction requires independent Chi-squareds. ANOVA, regression, and variance-comparison tests use specific quadratic forms whose independence is guaranteed by Cochran's theorem applied to Normal samples. For non-Normal samples, the components are only approximately Chi-squared and only approximately independent, so the F-test is asymptotic and its accuracy depends on tail weight.
The ANOVA Decomposition
One-Way ANOVA Variance Decomposition
Statement
Let $y_{ij}$, $i = 1, \dots, n_j$, $j = 1, \dots, g$, be the observations, $\bar{y}_j$ the mean of group $j$, and $\bar{y}$ the grand mean of all $N = \sum_j n_j$ observations. Then
$$\underbrace{\sum_{j=1}^{g}\sum_{i=1}^{n_j} (y_{ij} - \bar{y})^2}_{\text{SST}} = \underbrace{\sum_{j=1}^{g}\sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2}_{\text{SSW}} + \underbrace{\sum_{j=1}^{g} n_j (\bar{y}_j - \bar{y})^2}_{\text{SSB}}.$$
The decomposition is algebraic and holds for every dataset, regardless of the model.
Intuition
Total sum of squares is the squared distance from each point to the grand mean. Decompose each deviation as $y_{ij} - \bar{y} = (y_{ij} - \bar{y}_j) + (\bar{y}_j - \bar{y})$. The cross term vanishes after summing within each group (because deviations from a group mean sum to zero within the group), leaving SST = SSW + SSB.
Proof Sketch
Algebraic identity: square both sides of $y_{ij} - \bar{y} = (y_{ij} - \bar{y}_j) + (\bar{y}_j - \bar{y})$ and sum over $i$ and $j$. The cross-term sum is $2\sum_j (\bar{y}_j - \bar{y}) \sum_i (y_{ij} - \bar{y}_j) = 0$ because the inner sum is zero by the definition of the group mean. The remaining two sums of squares are SSW and SSB.
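Because the identity is purely algebraic, it can be checked on any dataset. A minimal sketch on a small made-up sample with three unequal-sized groups:

```python
import numpy as np

# Arbitrary synthetic data: three groups of unequal size.
groups = [np.array([2.0, 3.0, 5.0]),
          np.array([1.0, 4.0]),
          np.array([6.0, 7.0, 8.0, 9.0])]

all_y = np.concatenate(groups)
grand = all_y.mean()

sst = ((all_y - grand) ** 2).sum()                           # total
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)       # within
ssb = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)  # between

print(sst, ssw + ssb)  # identical up to floating-point rounding
```

No distributional assumption is used anywhere in this check.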
Why It Matters
The decomposition is the algebraic foundation for the F-test. Under the model $y_{ij} = \mu_j + \varepsilon_{ij}$ with $\varepsilon_{ij}$ i.i.d. Normal$(0, \sigma^2)$, the cross-product structure makes the between and within sums of squares independent. Each becomes a Chi-squared after dividing by $\sigma^2$, and their ratio (with appropriate degrees-of-freedom normalization) is the F-statistic.
Failure Mode
The decomposition is purely algebraic and always holds, but the inferential interpretation of SSB and SSW as variance estimates of $\sigma^2$ requires the Normal-i.i.d.-equal-variance assumption. Heteroscedasticity (group-specific variances) breaks the F-test's exactness; non-Normal errors break it more mildly through the central limit theorem. For the heteroscedastic case, the Welch-style ANOVA replaces the pooled within-variance.
One-Way ANOVA F-Test
One-Way ANOVA F-Test
Statement
To test $H_0: \mu_1 = \cdots = \mu_g$ against $H_1$: at least two group means differ, compute the F-statistic
$$F = \frac{\text{MSB}}{\text{MSW}} = \frac{\text{SSB}/(g-1)}{\text{SSW}/(N-g)},$$
where MSB and MSW are the mean squares between and within. Under $H_0$ and the Normal-equal-variance assumption, $F \sim F_{g-1,\,N-g}$ exactly. The test rejects at level $\alpha$ when $F$ exceeds the $1-\alpha$ quantile of $F_{g-1,\,N-g}$.
Intuition
Under the null, all group means equal a common $\mu$. SSB measures variability of the sample group means around the grand mean. SSW measures variability within groups around their own means. Both mean squares estimate $\sigma^2$ under the null: $\text{SSB}/(g-1)$ and $\text{SSW}/(N-g)$ are both unbiased estimators of $\sigma^2$. The ratio MSB/MSW concentrates near one under the null and is inflated when group means differ.
Proof Sketch
Under the Normal-equal-variance model, write each $y_{ij} = \mu_j + \varepsilon_{ij}$ with $\varepsilon_{ij}$ i.i.d. Normal$(0, \sigma^2)$; under $H_0$, $\mu_1 = \cdots = \mu_g$. Apply Cochran's theorem to the orthogonal decomposition of the residual space: $\text{SSW}/\sigma^2 \sim \chi^2_{N-g}$, and under the null, $\text{SSB}/\sigma^2 \sim \chi^2_{g-1}$. The two are independent (orthogonal projections). The ratio $\frac{\text{SSB}/(g-1)}{\text{SSW}/(N-g)}$ is a ratio of two independent normalized Chi-squareds, hence $F_{g-1,\,N-g}$ by the F construction theorem.
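The whole pipeline, sums of squares to mean squares to F to p-value, fits in a few lines. A sketch on made-up samples, cross-checked against scipy's `f_oneway`:

```python
import numpy as np
from scipy import stats

# Made-up data: three groups of four observations each.
groups = [np.array([4.1, 5.0, 5.9, 4.8]),
          np.array([6.2, 6.8, 5.9, 7.1]),
          np.array([5.1, 4.4, 5.5, 4.9])]

N = sum(len(x) for x in groups)
g = len(groups)
grand = np.concatenate(groups).mean()

ssb = sum(len(x) * (x.mean() - grand) ** 2 for x in groups)  # between
ssw = sum(((x - x.mean()) ** 2).sum() for x in groups)       # within

msb, msw = ssb / (g - 1), ssw / (N - g)
F = msb / msw
p = stats.f.sf(F, g - 1, N - g)  # right-tail p-value

F_ref, p_ref = stats.f_oneway(*groups)
print(F, F_ref)  # the manual and library values agree
print(p, p_ref)
```

The `sf` call (survival function) gives the upper-tail area directly, matching the one-sided rejection rule.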
Why It Matters
ANOVA is the standard test for comparing three or more group means in agricultural, industrial, clinical, and behavioral research. The test has high power against any deviation from equality among group means, not just specific contrast directions. When the null is rejected, follow up with post-hoc pairwise comparisons (Tukey's HSD, Scheffé, Bonferroni) to localize which means differ; see p-hacking and multiple testing for the multiple-comparison correction discussion.
Failure Mode
The exact F distribution requires Normality, equal variances, and independence. Modest violations of Normality are tolerable in large samples by the central limit theorem; unequal variances are more damaging. Welch's ANOVA generalizes Welch's t-test to multiple groups and is the right alternative when equal variances are implausible. Independence violations (e.g., repeated measures on the same subject) require a mixed-effects model or a randomization-test approach.
Variance-Ratio Test
The F-distribution also drives the variance-ratio test for comparing two variances.
F-Test for Equal Variances
Statement
To test $H_0: \sigma_1^2 = \sigma_2^2$, compute the ratio of sample variances
$$F = \frac{s_1^2}{s_2^2}.$$
Under $H_0$ and the Normal assumption, $F \sim F_{n_1-1,\,n_2-1}$ exactly. A two-sided test at level $\alpha$ rejects when $F$ exceeds the $1-\alpha/2$ quantile or falls below the $\alpha/2$ quantile of $F_{n_1-1,\,n_2-1}$.
Intuition
Both $(n_1-1)s_1^2/\sigma_1^2$ and $(n_2-1)s_2^2/\sigma_2^2$ are Chi-squared with their respective degrees of freedom. Their ratio, normalized by degrees of freedom, is F-distributed. Under the null of equal variances the common $\sigma^2$ cancels and the statistic $s_1^2/s_2^2$ is exactly $F_{n_1-1,\,n_2-1}$.
Proof Sketch
$(n_1-1)s_1^2/\sigma_1^2 \sim \chi^2_{n_1-1}$ and $(n_2-1)s_2^2/\sigma_2^2 \sim \chi^2_{n_2-1}$ independently. Dividing each by its degrees of freedom and taking the ratio gives
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}.$$
Under the null $\sigma_1^2 = \sigma_2^2$, this equals $s_1^2/s_2^2$.
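A minimal sketch of the variance-ratio test on two made-up samples; the two-sided p-value is taken as twice the smaller tail probability:

```python
import numpy as np
from scipy import stats

# Made-up samples (values are arbitrary).
x = np.array([10.1, 9.7, 10.4, 10.0, 9.5, 10.3])
y = np.array([9.2, 11.0, 8.7, 10.9, 9.9])

F = x.var(ddof=1) / y.var(ddof=1)   # ratio of sample variances
d1, d2 = len(x) - 1, len(y) - 1

# Two-sided p-value: double the smaller of the two tail areas.
p_two_sided = 2 * min(stats.f.cdf(F, d1, d2), stats.f.sf(F, d1, d2))
print(F, p_two_sided)
```

Here `ddof=1` gives the unbiased sample variance with the $n-1$ divisor, matching the degrees of freedom of the reference distribution.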
Why It Matters
The variance-ratio F-test is used as a preliminary check before deciding whether to use the equal-variance pooled t-test or the Welch t-test. In practice, the test has low power for small samples and high sensitivity to non-Normality; Levene's test (which uses absolute deviations from group medians) is the standard alternative under non-Normality and is preferred in modern statistical practice.
Failure Mode
The F-test for equal variances is notoriously sensitive to non-Normality, more so than the t-test for equal means. Even modest skewness or kurtosis substantially inflates the Type I error. Do not use this test as a "screening" decision for whether to pool variances; just use Welch's t-test directly, which avoids needing the equality test in the first place.
ANOVA versus Two-Sample t-Test
When $g = 2$, the one-way ANOVA F-test is equivalent to the two-sample equal-variance t-test in the following sense:
$$F = t^2,$$
that is, the F-statistic at $g = 2$ is the square of the pooled t-statistic, and the F-test rejection region $F > F_{1,\,N-2,\,1-\alpha}$ is exactly the t-test rejection region $|t| > t_{N-2,\,1-\alpha/2}$. The two tests are algebraically identical for two groups.
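The $F = t^2$ identity can be confirmed numerically; a quick sketch on two made-up samples:

```python
import numpy as np
from scipy import stats

# Arbitrary two-group data.
a = np.array([3.2, 4.1, 3.8, 4.5, 3.9])
b = np.array([5.0, 4.6, 5.4, 4.9])

t, _ = stats.ttest_ind(a, b, equal_var=True)  # pooled two-sample t-test
F, _ = stats.f_oneway(a, b)                   # one-way ANOVA with g = 2

print(t**2, F)  # identical up to floating-point rounding
```

The p-values also coincide, because the upper tail of $F_{1,\,N-2}$ beyond $t^2$ equals the two-sided tail of $t_{N-2}$ beyond $|t|$.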
For $g > 2$, the ANOVA F-test handles all pairwise comparisons simultaneously with a single Type I error budget. Running $\binom{g}{2}$ pairwise t-tests at level $\alpha$ inflates the family-wise error rate to roughly $1 - (1-\alpha)^{\binom{g}{2}}$, which is much larger than $\alpha$ even for modest $g$. ANOVA avoids this by collapsing the comparison into a single test; multiple-comparison correction enters at the follow-up stage if the global test rejects.
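The size of the inflation is easy to tabulate. A sketch under the simplifying assumption that the pairwise tests are independent (they are not, since they share samples, so this is only a rough guide):

```python
from math import comb

alpha = 0.05
for g in (3, 5, 10):
    m = comb(g, 2)                  # number of pairwise comparisons
    fwer = 1 - (1 - alpha) ** m     # approximate family-wise error rate
    print(g, m, round(fwer, 3))
```

Even at $g = 5$ the approximate family-wise error rate is already around 0.40, eight times the nominal level.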
ANOVA Table Layout
A standard one-way ANOVA report follows the schema:
| Source | Sum of squares | Degrees of freedom | Mean square | F |
|---|---|---|---|---|
| Between groups | SSB | $g-1$ | MSB = SSB/($g-1$) | MSB/MSW |
| Within groups (error) | SSW | $N-g$ | MSW = SSW/($N-g$) | -- |
| Total | SST | $N-1$ | -- | -- |
The "Total" row is included for completeness and serves as a check: SST = SSB + SSW by the variance decomposition theorem. The p-value is the area of $F_{g-1,\,N-g}$ to the right of the observed F.
Common Confusions
ANOVA tests means, not variances
Despite the name "analysis of variance", the F-test tests equality of means under the assumption of equal variances. The variances are used as nuisance parameters in the test, not as the quantity under test. The test for equal variances is the variance-ratio F-test, a separate procedure.
A significant F does not say which groups differ
ANOVA rejects the null when at least two means differ, but it does not say which pair. Post-hoc tests (Tukey, Scheffé, Bonferroni-corrected pairwise t-tests) localize the differences with corrected error rates. The order of operations is: first test global equality with ANOVA; if rejected, perform corrected pairwise comparisons.
F-test is one-sided by construction
The F-statistic is always nonnegative; the test rejects in the upper tail only. Two-sided p-values for the F-test conflate the right-tail probability and a meaningless left-tail probability. Software reports the right-tail p-value; do not double it.
Welch ANOVA exists and is often preferable
The standard F-test assumes equal variances. Welch's ANOVA (also called Brown-Forsythe in a related form) uses a denominator that does not assume equal variances and is the right default when the equal-variance assumption is unsupported. R's oneway.test() defaults to Welch's ANOVA.
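A minimal sketch of Welch's statistic in its standard precision-weighted form (following Welch, 1951); the data are made up, and for real analyses a vetted implementation (e.g. statsmodels' `anova_oneway` with `use_var="unequal"`) is preferable:

```python
import numpy as np
from scipy import stats

# Made-up groups with visibly different spreads and sizes.
groups = [np.array([4.1, 5.0, 5.9, 4.8]),
          np.array([6.2, 6.8, 5.9, 7.1, 6.5]),
          np.array([5.1, 4.4, 5.5, 4.9])]

g = len(groups)
n = np.array([len(x) for x in groups])
m = np.array([x.mean() for x in groups])
v = np.array([x.var(ddof=1) for x in groups])

w = n / v                        # precision weights, no pooling of variances
W = w.sum()
mw = (w * m).sum() / W           # precision-weighted grand mean

A = (w * (m - mw) ** 2).sum() / (g - 1)
lam = ((1 - w / W) ** 2 / (n - 1)).sum()
B = 1 + 2 * (g - 2) / (g ** 2 - 1) * lam

F_welch = A / B
df2 = (g ** 2 - 1) / (3 * lam)   # fractional denominator degrees of freedom
p = stats.f.sf(F_welch, g - 1, df2)
print(F_welch, df2, p)
```

The denominator degrees of freedom are estimated from the data and are generally non-integer, mirroring the Welch-Satterthwaite correction in the two-sample t-test.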
Exercises
Problem
Three teaching methods are compared with $n$ students per group. The group means $\bar{y}_1, \bar{y}_2, \bar{y}_3$ and within-group sample variances $s_1^2, s_2^2, s_3^2$ are given. Compute the F-statistic and conduct the test at level 0.05.
Problem
Two samples of sizes $n_1$ and $n_2$ have sample variances $s_1^2$ and $s_2^2$. Test $H_0: \sigma_1^2 = \sigma_2^2$ at level 0.05.
Problem
Show that the F-statistic in a one-way ANOVA with $g = 2$ groups, sample sizes $n_1$ and $n_2$, equals the square of the pooled two-sample t-statistic.
Problem
For a one-way ANOVA with $g$ groups and $N$ total observations, derive the noncentrality parameter $\lambda = \sum_j n_j (\mu_j - \bar{\mu})^2 / \sigma^2$ of the F-statistic under the alternative, where $\bar{\mu} = \sum_j n_j \mu_j / N$, and explain how power increases with the noncentrality.
References
Canonical:
- Casella and Berger, Statistical Inference (2002), Chapter 5 (sampling distribution of the F), Chapter 11 (linear models and the F-test as a likelihood-ratio test).
- Lehmann and Romano, Testing Statistical Hypotheses (2005), Chapter 7 (UMP-invariant tests in the linear model).
- Scheffé, The Analysis of Variance (1959), the classical reference for ANOVA theory and post-hoc methods.
Practical:
- Box, Hunter, and Hunter, Statistics for Experimenters (2005), Chapter 5 (one-way ANOVA in industrial experiments).
- Cochran and Cox, Experimental Designs (1957), Chapter 4 (ANOVA in randomized designs).
Alternatives without the equal-variance assumption:
- Welch, "On the comparison of several mean values: An alternative approach" (Biometrika, 1951), Welch's ANOVA for unequal variances.
- Brown and Forsythe, "The small sample behavior of some statistics which test the equality of several means" (Technometrics, 1974), ANOVA variants that drop the equal-variance assumption.
Last reviewed: May 11, 2026
Required prerequisites
- Distributions Atlas (layer 0A · tier 1)
- Chi-Squared Distribution and Tests (layer 1 · tier 1)
- Student-t Distribution and t-Test (layer 1 · tier 1)
- Hypothesis Testing for ML (layer 2 · tier 2)