Statistical Estimation
F-Distribution and ANOVA
The F-distribution as a ratio of two scaled Chi-squareds, and the one-way analysis of variance F-test built on it: between-group versus within-group variance decomposition, exact null distribution under Normality and equal variances, and the link to two-sample t-tests.
Prerequisites
Why This Matters
Analysis of variance, ANOVA, is the standard procedure for comparing the means of three or more groups simultaneously. Running pairwise t-tests for $g$ groups requires $\binom{g}{2}$ comparisons and inflates the family-wise error rate; ANOVA replaces them with a single F-test whose null distribution is exact under Normality and equal variances. The F-statistic compares the variance of the group means (the "between" variance, which is large when the group means differ) to the variance within groups (the "within" variance, which estimates the common population variance under the null).
The F-distribution itself is the ratio of two scaled Chi-squareds. Every variance-ratio test in classical statistics is an F-test in disguise: variance comparison between two samples, ANOVA F-test, regression overall F-test, nested-model F-test. The reference distribution is the same; the choice of numerator and denominator degrees of freedom changes with the test.
The F-test for $g = 2$ groups is equivalent to the squared two-sample t-test: $F = t^2$, with $F_{1,\,N-2}$ identical in distribution to the square of $t_{N-2}$ under the null. ANOVA generalizes this to arbitrary $g$.
The F-Distribution
F-Distribution
A random variable has an F-distribution with $d_1$ numerator and $d_2$ denominator degrees of freedom if it can be written as
$$F = \frac{U/d_1}{V/d_2}$$
with $U \sim \chi^2_{d_1}$ and $V \sim \chi^2_{d_2}$ independent. The density is
$$f(x) = \frac{\Gamma\!\left(\tfrac{d_1+d_2}{2}\right)}{\Gamma\!\left(\tfrac{d_1}{2}\right)\Gamma\!\left(\tfrac{d_2}{2}\right)} \left(\frac{d_1}{d_2}\right)^{d_1/2} \frac{x^{d_1/2-1}}{\left(1 + \tfrac{d_1}{d_2}x\right)^{(d_1+d_2)/2}}, \qquad x > 0.$$
The mean is $\frac{d_2}{d_2-2}$ for $d_2 > 2$, and the variance is $\frac{2d_2^2(d_1+d_2-2)}{d_1(d_2-2)^2(d_2-4)}$ for $d_2 > 4$.
The distribution is right-skewed and supported on the positive reals. Both degrees-of-freedom parameters matter: the numerator controls the spread of the between-variance estimate, and the denominator controls the spread of the within-variance estimate. Larger denominator degrees of freedom gives a more concentrated reference distribution.
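The closed-form moments above can be checked against a library implementation. A minimal sketch using scipy, with arbitrary illustrative degrees of freedom $d_1 = 5$, $d_2 = 10$:

```python
from scipy import stats

# Illustrative degrees of freedom (arbitrary choice for the check).
d1, d2 = 5, 10

mean_formula = d2 / (d2 - 2)  # valid for d2 > 2
var_formula = 2 * d2**2 * (d1 + d2 - 2) / (d1 * (d2 - 2) ** 2 * (d2 - 4))  # d2 > 4

# scipy's F distribution reports the same mean and variance.
mean_scipy, var_scipy = stats.f.stats(d1, d2, moments="mv")
print(mean_formula, float(mean_scipy))  # both print 1.25
print(var_formula, float(var_scipy))
```

Note that the mean depends only on $d_2$, consistent with the formula.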
F as a Ratio of Chi-Squareds
F as a Ratio of Independent Chi-Squareds
Statement
Let $U \sim \chi^2_{d_1}$ and $V \sim \chi^2_{d_2}$ be independent. Then $\frac{U/d_1}{V/d_2} \sim F_{d_1,\,d_2}$. Moreover, $1/F \sim F_{d_2,\,d_1}$, and the $\alpha$ quantile of $F_{d_1,\,d_2}$ equals the reciprocal of the $1-\alpha$ quantile of $F_{d_2,\,d_1}$.
Intuition
Each Chi-squared is normalized by its own degrees of freedom so that $\chi^2_d/d \to 1$ in probability as $d \to \infty$. The ratio approaches one under the null, with spread that shrinks as both degrees of freedom grow. The reciprocal-symmetry result lets you read lower-tail quantiles from upper-tail quantiles of the swapped distribution.
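The reciprocal-symmetry trick can be verified numerically; a quick sketch with scipy (degrees of freedom and level chosen arbitrarily):

```python
from scipy import stats

d1, d2, alpha = 3, 12, 0.05

q_lower = stats.f.ppf(alpha, d1, d2)              # alpha quantile of F(d1, d2)
q_upper_swapped = stats.f.ppf(1 - alpha, d2, d1)  # 1-alpha quantile of F(d2, d1)

print(q_lower, 1 / q_upper_swapped)  # the two values agree
```

This is how printed F-tables, which list only upper-tail quantiles, are used to read lower-tail critical values.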
Proof Sketch
The joint density of $(U, V)$ factors by independence. Change variables to $(F, V)$ with $F = \frac{U/d_1}{V/d_2}$, so that $u = \frac{d_1}{d_2} f v$ and the Jacobian is $\frac{d_1}{d_2} v$. Substitute and integrate out $v$ using the Gamma normalizing constant. The result is the F-density above. The reciprocal claim follows because if $U$ and $V$ are interchanged, $F$ is replaced by $1/F$.
Why It Matters
Every classical F-test has the form (numerator-variance estimate) / (denominator-variance estimate), where both estimates are scaled Chi-squareds when the underlying data are Normal. ANOVA, variance-comparison tests, and regression F-tests all fit this pattern. The asymptotic Chi-squared distribution of the LRT in regular models, multiplied by an appropriate factor, also gives a small-sample F approximation.
Failure Mode
The construction requires independent Chi-squareds. ANOVA, regression, and variance-comparison tests use specific quadratic forms whose independence is guaranteed by Cochran's theorem applied to Normal samples. For non-Normal samples, the components are only approximately Chi-squared and only approximately independent, so the F-test is asymptotic and its accuracy depends on tail weight.
The ANOVA Decomposition
One-Way ANOVA Variance Decomposition
Statement
Let $y_{ij}$, $i = 1, \dots, n_j$, $j = 1, \dots, g$, be the observations, $\bar{y}_j$ the mean of group $j$, and $\bar{y}$ the grand mean of all $N = \sum_j n_j$ observations. Then
$$\underbrace{\sum_{j=1}^{g}\sum_{i=1}^{n_j} (y_{ij} - \bar{y})^2}_{\text{SST}} = \underbrace{\sum_{j=1}^{g}\sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2}_{\text{SSW}} + \underbrace{\sum_{j=1}^{g} n_j (\bar{y}_j - \bar{y})^2}_{\text{SSB}}.$$
The decomposition is algebraic and holds for every dataset, regardless of the model.
Intuition
Total sum of squares is the squared distance from each point to the grand mean. Decompose each deviation as $y_{ij} - \bar{y} = (y_{ij} - \bar{y}_j) + (\bar{y}_j - \bar{y})$. The cross term vanishes after summing within each group (because deviations from a group mean sum to zero within the group), leaving SST = SSW + SSB.
Proof Sketch
Algebraic identity: square both sides of $y_{ij} - \bar{y} = (y_{ij} - \bar{y}_j) + (\bar{y}_j - \bar{y})$ and sum over $i$ and $j$. The cross-term sum is $2\sum_j (\bar{y}_j - \bar{y}) \sum_i (y_{ij} - \bar{y}_j) = 0$ because the inner sum is zero by the definition of the group mean. The remaining two sums of squares are SSW and SSB.
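Because the identity is purely algebraic, it can be checked on any dataset. A minimal sketch on a small made-up sample with three unequal-sized groups:

```python
import numpy as np

# Arbitrary synthetic data: three groups of unequal size.
groups = [np.array([2.0, 3.0, 5.0]),
          np.array([1.0, 4.0]),
          np.array([6.0, 7.0, 8.0, 9.0])]

all_y = np.concatenate(groups)
grand = all_y.mean()

sst = ((all_y - grand) ** 2).sum()                           # total
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)       # within
ssb = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)  # between

print(sst, ssw + ssb)  # identical up to floating-point rounding
```

No distributional assumption is used anywhere in this check.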
Why It Matters
The decomposition is the algebraic foundation for the F-test. Under the model $y_{ij} = \mu_j + \varepsilon_{ij}$ with $\varepsilon_{ij}$ i.i.d. Normal$(0, \sigma^2)$, the cross-product structure makes the between and within sums of squares independent. Each becomes a Chi-squared after dividing by $\sigma^2$, and their ratio (with appropriate degrees-of-freedom normalization) is the F-statistic.
Failure Mode
The decomposition is purely algebraic and always holds, but the inferential interpretation of SSB and SSW as variance estimates of $\sigma^2$ requires the Normal-i.i.d.-equal-variance assumption. Heteroscedasticity (group-specific variances) breaks the F-test's exactness; non-Normal errors break it more mildly through the central limit theorem. For the heteroscedastic case, the Welch-style ANOVA replaces the pooled within-variance.
One-Way ANOVA F-Test
One-Way ANOVA F-Test
Statement
To test $H_0: \mu_1 = \cdots = \mu_g$ against $H_1$: at least two group means differ, compute the F-statistic
$$F = \frac{\text{MSB}}{\text{MSW}} = \frac{\text{SSB}/(g-1)}{\text{SSW}/(N-g)},$$
where MSB and MSW are the mean squares between and within. Under $H_0$ and the Normal-equal-variance assumption, $F \sim F_{g-1,\,N-g}$ exactly. The test rejects at level $\alpha$ when $F$ exceeds the $1-\alpha$ quantile of $F_{g-1,\,N-g}$.
Intuition
Under the null, all group means equal a common $\mu$. SSB measures variability of the sample group means around the grand mean. SSW measures variability within groups around their own means. Both mean squares estimate $\sigma^2$ under the null: $\text{SSB}/(g-1)$ and $\text{SSW}/(N-g)$ are both unbiased estimators of $\sigma^2$. The ratio MSB/MSW concentrates near one under the null and is inflated when group means differ.
Proof Sketch
Under the Normal-equal-variance model, write each $y_{ij} = \mu_j + \varepsilon_{ij}$ with $\varepsilon_{ij}$ i.i.d. Normal$(0, \sigma^2)$; under $H_0$, $\mu_1 = \cdots = \mu_g$. Apply Cochran's theorem to the orthogonal decomposition of the residual space: $\text{SSW}/\sigma^2 \sim \chi^2_{N-g}$, and under the null, $\text{SSB}/\sigma^2 \sim \chi^2_{g-1}$. The two are independent (orthogonal projections). The ratio $\frac{\text{SSB}/(g-1)}{\text{SSW}/(N-g)}$ is a ratio of two independent normalized Chi-squareds, hence $F_{g-1,\,N-g}$ by the F construction theorem.
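The whole pipeline, sums of squares to mean squares to F to p-value, fits in a few lines. A sketch on made-up samples, cross-checked against scipy's `f_oneway`:

```python
import numpy as np
from scipy import stats

# Made-up data: three groups of four observations each.
groups = [np.array([4.1, 5.0, 5.9, 4.8]),
          np.array([6.2, 6.8, 5.9, 7.1]),
          np.array([5.1, 4.4, 5.5, 4.9])]

N = sum(len(x) for x in groups)
g = len(groups)
grand = np.concatenate(groups).mean()

ssb = sum(len(x) * (x.mean() - grand) ** 2 for x in groups)  # between
ssw = sum(((x - x.mean()) ** 2).sum() for x in groups)       # within

msb, msw = ssb / (g - 1), ssw / (N - g)
F = msb / msw
p = stats.f.sf(F, g - 1, N - g)  # right-tail p-value

F_ref, p_ref = stats.f_oneway(*groups)
print(F, F_ref)  # the manual and library values agree
print(p, p_ref)
```

The `sf` call (survival function) gives the upper-tail area directly, matching the one-sided rejection rule.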
Why It Matters
ANOVA is the standard test for comparing three or more group means in agricultural, industrial, clinical, and behavioral research. The test has high power against any deviation from equality among group means, not just specific contrast directions. When the null is rejected, follow up with post-hoc pairwise comparisons (Tukey's HSD, Scheffé, Bonferroni) to localize which means differ; see p-hacking and multiple testing for the multiple-comparison correction discussion.
Failure Mode
The exact F distribution requires Normality, equal variances, and independence. Modest violations of Normality are tolerable in large samples by the central limit theorem; unequal variances are more damaging. Welch's ANOVA generalizes Welch's t-test to multiple groups and is the right alternative when equal variances are implausible. Independence violations (e.g., repeated measures on the same subject) require a mixed-effects model or a randomization-test approach.
Variance-Ratio Test
The F-distribution also drives the variance-ratio test for comparing two variances.
F-Test for Equal Variances
Statement
To test $H_0: \sigma_1^2 = \sigma_2^2$, compute the ratio of sample variances
$$F = \frac{s_1^2}{s_2^2}.$$
Under $H_0$ and the Normal assumption, $F \sim F_{n_1-1,\,n_2-1}$ exactly. A two-sided test at level $\alpha$ rejects when $F$ exceeds the $1-\alpha/2$ quantile or falls below the $\alpha/2$ quantile of $F_{n_1-1,\,n_2-1}$.
Intuition
Both $(n_1-1)s_1^2/\sigma_1^2$ and $(n_2-1)s_2^2/\sigma_2^2$ are Chi-squared with their respective degrees of freedom. Their ratio, normalized by degrees of freedom, is F-distributed. Under the null of equal variances the common $\sigma^2$ cancels and the statistic $s_1^2/s_2^2$ is exactly $F_{n_1-1,\,n_2-1}$.
Proof Sketch
$(n_1-1)s_1^2/\sigma_1^2 \sim \chi^2_{n_1-1}$ and $(n_2-1)s_2^2/\sigma_2^2 \sim \chi^2_{n_2-1}$ independently. Dividing each by its degrees of freedom and taking the ratio gives
$$\frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}.$$
Under the null $\sigma_1^2 = \sigma_2^2$, this equals $s_1^2/s_2^2$.
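A minimal sketch of the variance-ratio test on two made-up samples; the two-sided p-value is taken as twice the smaller tail probability:

```python
import numpy as np
from scipy import stats

# Made-up samples (values are arbitrary).
x = np.array([10.1, 9.7, 10.4, 10.0, 9.5, 10.3])
y = np.array([9.2, 11.0, 8.7, 10.9, 9.9])

F = x.var(ddof=1) / y.var(ddof=1)   # ratio of sample variances
d1, d2 = len(x) - 1, len(y) - 1

# Two-sided p-value: double the smaller of the two tail areas.
p_two_sided = 2 * min(stats.f.cdf(F, d1, d2), stats.f.sf(F, d1, d2))
print(F, p_two_sided)
```

Here `ddof=1` gives the unbiased sample variance with the $n-1$ divisor, matching the degrees of freedom of the reference distribution.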
Why It Matters
The variance-ratio F-test is used as a preliminary check before deciding whether to use the equal-variance pooled t-test or the Welch t-test. In practice, the test has low power for small samples and high sensitivity to non-Normality; Levene's test (which uses absolute deviations from group medians) is the standard alternative under non-Normality and is preferred in modern statistical practice.
Failure Mode
The F-test for equal variances is notoriously sensitive to non-Normality, more so than the t-test for equal means. Even modest skewness or kurtosis substantially inflates the Type I error. Do not use this test as a "screening" decision for whether to pool variances; just use Welch's t-test directly, which avoids needing the equality test in the first place.
ANOVA versus Two-Sample t-Test
When $g = 2$, the one-way ANOVA F-test is equivalent to the two-sample equal-variance t-test in the following sense:
$$F = t^2,$$
that is, the F-statistic at $g = 2$ is the square of the pooled t-statistic, and the F-test rejection region $F > F_{1,\,N-2,\,1-\alpha}$ is exactly the t-test rejection region $|t| > t_{N-2,\,1-\alpha/2}$. The two tests are algebraically identical for two groups.
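The $F = t^2$ identity can be confirmed numerically; a quick sketch on two made-up samples:

```python
import numpy as np
from scipy import stats

# Arbitrary two-group data.
a = np.array([3.2, 4.1, 3.8, 4.5, 3.9])
b = np.array([5.0, 4.6, 5.4, 4.9])

t, _ = stats.ttest_ind(a, b, equal_var=True)  # pooled two-sample t-test
F, _ = stats.f_oneway(a, b)                   # one-way ANOVA with g = 2

print(t**2, F)  # identical up to floating-point rounding
```

The p-values also coincide, because the upper tail of $F_{1,\,N-2}$ beyond $t^2$ equals the two-sided tail of $t_{N-2}$ beyond $|t|$.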
For $g > 2$, the ANOVA F-test handles all pairwise comparisons simultaneously with a single Type I error budget. Running $\binom{g}{2}$ pairwise t-tests at level $\alpha$ inflates the family-wise error rate to roughly $1 - (1-\alpha)^{\binom{g}{2}}$, which is much larger than $\alpha$ even for modest $g$. ANOVA avoids this by collapsing the comparison into a single test; multiple-comparison correction enters at the follow-up stage if the global test rejects.
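The size of the inflation is easy to tabulate. A sketch under the simplifying assumption that the pairwise tests are independent (they are not, since they share samples, so this is only a rough guide):

```python
from math import comb

alpha = 0.05
for g in (3, 5, 10):
    m = comb(g, 2)                  # number of pairwise comparisons
    fwer = 1 - (1 - alpha) ** m     # approximate family-wise error rate
    print(g, m, round(fwer, 3))
```

Even at $g = 5$ the approximate family-wise error rate is already around 0.40, eight times the nominal level.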
ANOVA Table Layout
A standard one-way ANOVA report follows the schema:
| Source | Sum of squares | Degrees of freedom | Mean square | F |
|---|---|---|---|---|
| Between groups | SSB | $g-1$ | MSB = SSB/($g-1$) | MSB/MSW |
| Within groups (error) | SSW | $N-g$ | MSW = SSW/($N-g$) | -- |
| Total | SST | $N-1$ | -- | -- |
The "Total" row is included for completeness and serves as a check: SST = SSB + SSW by the variance decomposition theorem. The p-value is the area of $F_{g-1,\,N-g}$ to the right of the observed F.
Common Confusions
ANOVA tests means, not variances
Despite the name "analysis of variance", the F-test tests equality of means under the assumption of equal variances. The variances are used as nuisance parameters in the test, not as the quantity under test. The test for equal variances is the variance-ratio F-test, a separate procedure.
A significant F does not say which groups differ
ANOVA rejects the null when at least two means differ, but it does not say which pair. Post-hoc tests (Tukey, Scheffé, Bonferroni-corrected pairwise t-tests) localize the differences with corrected error rates. The order of operations is: first test global equality with ANOVA; if rejected, perform corrected pairwise comparisons.
F-test is one-sided by construction
The F-statistic is always nonnegative; the test rejects in the upper tail only. Two-sided p-values for the F-test conflate the right-tail probability and a meaningless left-tail probability. Software reports the right-tail p-value; do not double it.
Welch ANOVA exists and is often preferable
The standard F-test assumes equal variances. Welch's ANOVA (also called Brown-Forsythe in a related form) uses a denominator that does not assume equal variances and is the right default when the equal-variance assumption is unsupported. R's oneway.test() defaults to Welch's ANOVA.
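A minimal sketch of Welch's statistic in its standard precision-weighted form (following Welch, 1951); the data are made up, and for real analyses a vetted implementation (e.g. statsmodels' `anova_oneway` with `use_var="unequal"`) is preferable:

```python
import numpy as np
from scipy import stats

# Made-up groups with visibly different spreads and sizes.
groups = [np.array([4.1, 5.0, 5.9, 4.8]),
          np.array([6.2, 6.8, 5.9, 7.1, 6.5]),
          np.array([5.1, 4.4, 5.5, 4.9])]

g = len(groups)
n = np.array([len(x) for x in groups])
m = np.array([x.mean() for x in groups])
v = np.array([x.var(ddof=1) for x in groups])

w = n / v                        # precision weights, no pooling of variances
W = w.sum()
mw = (w * m).sum() / W           # precision-weighted grand mean

A = (w * (m - mw) ** 2).sum() / (g - 1)
lam = ((1 - w / W) ** 2 / (n - 1)).sum()
B = 1 + 2 * (g - 2) / (g ** 2 - 1) * lam

F_welch = A / B
df2 = (g ** 2 - 1) / (3 * lam)   # fractional denominator degrees of freedom
p = stats.f.sf(F_welch, g - 1, df2)
print(F_welch, df2, p)
```

The denominator degrees of freedom are estimated from the data and are generally non-integer, mirroring the Welch-Satterthwaite correction in the two-sample t-test.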
Exercises
Problem
Three teaching methods are compared with $n$ students per group. The group means $\bar{y}_1, \bar{y}_2, \bar{y}_3$ and within-group sample variances $s_1^2, s_2^2, s_3^2$ are given. Compute the F-statistic and conduct the test at level 0.05.
Problem
Two samples of sizes $n_1$ and $n_2$ have sample variances $s_1^2$ and $s_2^2$. Test $H_0: \sigma_1^2 = \sigma_2^2$ at level 0.05.
Problem
Show that the F-statistic in a one-way ANOVA with $g = 2$ groups, sample sizes $n_1$ and $n_2$, equals the square of the pooled two-sample t-statistic.
Problem
For a one-way ANOVA with $g$ groups and $N$ total observations, derive the noncentrality parameter $\lambda = \sum_j n_j (\mu_j - \bar{\mu})^2 / \sigma^2$ of the F-statistic under the alternative, where $\bar{\mu} = \sum_j n_j \mu_j / N$, and explain how power increases with the noncentrality.
References
Canonical:
- Casella and Berger, Statistical Inference (2002), Chapter 5 (sampling distribution of the F), Chapter 11 (linear models and the F-test as a likelihood-ratio test).
- Lehmann and Romano, Testing Statistical Hypotheses (2005), Chapter 7 (UMP-invariant tests in the linear model).
- Scheffé, The Analysis of Variance (1959), the classical reference for ANOVA theory and post-hoc methods.
Practical:
- Box, Hunter, and Hunter, Statistics for Experimenters (2005), Chapter 5 (one-way ANOVA in industrial experiments).
- Cochran and Cox, Experimental Designs (1957), Chapter 4 (ANOVA in randomized designs).
Alternatives without the equal-variance assumption:
- Welch, "On the comparison of several mean values: An alternative approach" (Biometrika, 1951), Welch's ANOVA for unequal variances.
- Brown and Forsythe, "The small sample behavior of some statistics which test the equality of several means" (Technometrics, 1974), ANOVA variants that drop the equal-variance assumption.
Last reviewed: May 11, 2026
Required prerequisites
- Distributions Atlas (layer 0A · tier 1)
- Chi-Squared Distribution and Tests (layer 1 · tier 1)
- Student-t Distribution and t-Test (layer 1 · tier 1)
- Hypothesis Testing for ML (layer 2 · tier 2)