Statistical Estimation

F-Distribution and ANOVA

The F-distribution as a ratio of two scaled Chi-squareds, and the one-way analysis of variance F-test built on it: between-group versus within-group variance decomposition, exact null distribution under Normality and equal variances, and the link to two-sample t-tests.


Why This Matters

Analysis of variance, ANOVA, is the standard procedure for comparing the means of three or more groups simultaneously. Running pairwise t-tests for g groups requires \binom{g}{2} comparisons and inflates the family-wise error rate; ANOVA replaces them with a single F-test whose null distribution is exact under Normality and equal variances. The F-statistic compares the variance of the group means (the "between" variance, which is large when the group means differ) to the variance within groups (the "within" variance, which estimates the common population variance under the null).

The F-distribution itself is the ratio of two scaled Chi-squareds. Every variance-ratio test in classical statistics is an F-test in disguise: variance comparison between two samples, ANOVA F-test, regression overall F-test, nested-model F-test. The reference distribution is the same; the choice of numerator and denominator degrees of freedom changes with the test.

The F-test for g = 2 groups is equivalent to the squared two-sample t-test: F = T^2, with F_{1, n_X+n_Y-2} identical in distribution to the square of t_{n_X+n_Y-2} under the null. ANOVA generalizes this to arbitrary g.

The F-Distribution

Definition

F-Distribution

A random variable X has an F-distribution with d_1 numerator and d_2 denominator degrees of freedom if it can be written as

X = \frac{V_1/d_1}{V_2/d_2}

with V_1\sim\chi^2_{d_1} and V_2\sim\chi^2_{d_2} independent. The density is

f_X(x) = \frac{1}{B(d_1/2, d_2/2)}\left(\frac{d_1}{d_2}\right)^{d_1/2}\frac{x^{d_1/2 - 1}}{(1 + d_1 x/d_2)^{(d_1+d_2)/2}},\qquad x > 0.

The mean is d_2/(d_2 - 2) for d_2 > 2, and the variance is 2 d_2^2(d_1 + d_2 - 2)/[d_1(d_2 - 2)^2(d_2 - 4)] for d_2 > 4.

The distribution is right-skewed and supported on the positive reals. Both degrees-of-freedom parameters matter: the numerator d_1 controls the spread of the between-variance estimate, and the denominator d_2 controls the spread of the within-variance estimate. Larger denominator degrees of freedom give a more concentrated reference distribution.
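As a quick numerical check (a sketch using scipy; the degrees of freedom 5 and 20 are illustrative), simulating the ratio-of-Chi-squareds construction reproduces the stated mean d_2/(d_2 - 2):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d1, d2 = 5, 20          # numerator / denominator degrees of freedom
n = 200_000

# Draw independent Chi-squareds and form the normalized ratio.
v1 = stats.chi2.rvs(d1, size=n, random_state=rng)
v2 = stats.chi2.rvs(d2, size=n, random_state=rng)
f_samples = (v1 / d1) / (v2 / d2)

# The sample mean should sit near d2/(d2 - 2) = 20/18,
# matching scipy's closed-form F mean.
print(f_samples.mean(), d2 / (d2 - 2), stats.f.mean(d1, d2))
```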

F as a Ratio of Chi-Squareds

Theorem

F as a Ratio of Independent Chi-Squareds

Statement

Let V_1\sim\chi^2_{d_1} and V_2\sim\chi^2_{d_2} be independent. Then F = (V_1/d_1)/(V_2/d_2)\sim F_{d_1,d_2}. Moreover, 1/F\sim F_{d_2,d_1}, and the \alpha quantile of F_{d_1,d_2} equals the reciprocal of the 1 - \alpha quantile of F_{d_2,d_1}.

Intuition

Each Chi-squared is normalized by its own degrees of freedom so that V_i/d_i\to 1 in probability as d_i\to\infty. The ratio concentrates near one, with spread that shrinks as both degrees of freedom grow. The reciprocal-symmetry result lets you read lower-tail quantiles from upper-tail quantiles of the swapped distribution.
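The reciprocal-symmetry identity is easy to confirm with scipy's F quantile function (the degrees of freedom and level here are illustrative):

```python
from scipy import stats

d1, d2, alpha = 5, 20, 0.05

# alpha quantile of F_{d1,d2} equals the reciprocal of the
# (1 - alpha) quantile of F_{d2,d1}.
lower = stats.f.ppf(alpha, d1, d2)
upper_swapped = stats.f.ppf(1 - alpha, d2, d1)
print(lower, 1 / upper_swapped)  # the two values agree
```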

Proof Sketch

The joint density of (V_1, V_2) factors by independence. Change variables to (F, V_2) = ((V_1/d_1)/(V_2/d_2), V_2), with Jacobian V_2 d_1/d_2. Substitute and integrate out V_2 using the Gamma normalizing constant. The result is the F-density above. The reciprocal claim follows because interchanging V_1 and V_2 replaces F by 1/F.

Why It Matters

Every classical F-test has the form (numerator-variance estimate) / (denominator-variance estimate), where both estimates are scaled Chi-squareds when the underlying data are Normal. ANOVA, variance-comparison tests, and regression F-tests all fit this pattern. The asymptotic Chi-squared distribution of the LRT in regular models, multiplied by an appropriate factor, also gives a small-sample F approximation.

Failure Mode

The construction requires independent Chi-squareds. ANOVA, regression, and variance-comparison tests use specific quadratic forms whose independence is guaranteed by Cochran's theorem applied to Normal samples. For non-Normal samples, the components are only approximately Chi-squared and only approximately independent, so the F-test is asymptotic and its accuracy depends on tail weight.

The ANOVA Decomposition

Theorem

One-Way ANOVA Variance Decomposition

Statement

Let \bar Y_{i\cdot} be the mean of group i and \bar Y_{\cdot\cdot} the grand mean. Then

\underbrace{\sum_{i=1}^g\sum_{j=1}^{n_i}(Y_{ij} - \bar Y_{\cdot\cdot})^2}_{\text{SST (total)}} = \underbrace{\sum_{i=1}^g n_i(\bar Y_{i\cdot} - \bar Y_{\cdot\cdot})^2}_{\text{SSB (between)}} + \underbrace{\sum_{i=1}^g\sum_{j=1}^{n_i}(Y_{ij} - \bar Y_{i\cdot})^2}_{\text{SSW (within)}}.

The decomposition is algebraic and holds for every dataset, regardless of the model.

Intuition

Total sum of squares is the squared distance from each point to the grand mean. Decompose each deviation Y_{ij} - \bar Y_{\cdot\cdot} as (Y_{ij} - \bar Y_{i\cdot}) + (\bar Y_{i\cdot} - \bar Y_{\cdot\cdot}). The cross term vanishes after summing within each group (because deviations from a group mean sum to zero within the group), leaving SST = SSW + SSB.

Proof Sketch

Algebraic identity: Y_{ij} - \bar Y_{\cdot\cdot} = (Y_{ij} - \bar Y_{i\cdot}) + (\bar Y_{i\cdot} - \bar Y_{\cdot\cdot}). Square both sides and sum over i, j. The cross-term sum is 2\sum_i\sum_j(Y_{ij} - \bar Y_{i\cdot})(\bar Y_{i\cdot} - \bar Y_{\cdot\cdot}) = 2\sum_i(\bar Y_{i\cdot} - \bar Y_{\cdot\cdot})\sum_j(Y_{ij} - \bar Y_{i\cdot}) = 0, because the inner sum is zero by the definition of the group mean. The remaining two sums of squares are SSW and \sum_i n_i(\bar Y_{i\cdot} - \bar Y_{\cdot\cdot})^2 = \text{SSB}.
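Because the identity is purely algebraic, it holds for any dataset, which a few lines of numpy can confirm (the group means, scale, and unbalanced sizes below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical unbalanced groups -- the identity does not depend on the model.
groups = [rng.normal(m, 2.0, size=n) for m, n in [(0, 8), (1, 10), (3, 7)]]

grand = np.concatenate(groups).mean()
sst = sum(((x - grand) ** 2).sum() for x in groups)
ssb = sum(len(x) * (x.mean() - grand) ** 2 for x in groups)
ssw = sum(((x - x.mean()) ** 2).sum() for x in groups)
print(sst, ssb + ssw)  # equal up to floating-point rounding
```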

Why It Matters

The decomposition is the algebraic foundation for the F-test. Under the model Y_{ij} = \mu + \tau_i + \epsilon_{ij} with \epsilon_{ij} i.i.d. Normal and \sum n_i\tau_i = 0, the cross-product structure makes the between and within sums of squares independent. Each becomes a Chi-squared after dividing by \sigma^2, and their ratio (with appropriate degrees-of-freedom normalization) is the F-statistic.

Failure Mode

The decomposition is purely algebraic and always holds, but the inferential interpretation of SSB and SSW as variance estimates of \sigma^2 requires the Normal-i.i.d.-equal-variance assumption. Heteroscedasticity (group-specific variances) breaks the F-test's exactness; non-Normal errors break it more mildly through the central limit theorem. For the heteroscedastic case, the Welch-style ANOVA replaces the pooled within-variance.

One-Way ANOVA F-Test

Theorem

One-Way ANOVA F-Test

Statement

To test H_0: \mu_1 = \cdots = \mu_g against H_1: at least two group means differ, compute the F-statistic F = \frac{\text{SSB}/(g-1)}{\text{SSW}/(N-g)} = \frac{\text{MSB}}{\text{MSW}}, where MSB and MSW are the mean squares between and within. Under H_0 and the Normal-equal-variance assumption, F\sim F_{g-1, N-g} exactly. The test rejects at level \alpha when F exceeds the 1 - \alpha quantile of F_{g-1, N-g}.

Intuition

Under the null, all group means equal a common \mu. SSB measures variability of the sample group means around the grand mean. SSW measures variability within groups around their own means. Both estimate \sigma^2 under the null: SSB/(g-1) and SSW/(N-g) are both unbiased estimators of \sigma^2. The ratio MSB/MSW concentrates near one under the null and is inflated when group means differ.
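A minimal sketch of the F-statistic computed from the mean squares, checked against scipy's `f_oneway` (the three simulated groups are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
groups = [rng.normal(m, 1.0, size=12) for m in (0.0, 0.5, 1.0)]
g = len(groups)
N = sum(len(x) for x in groups)

grand = np.concatenate(groups).mean()
ssb = sum(len(x) * (x.mean() - grand) ** 2 for x in groups)
ssw = sum(((x - x.mean()) ** 2).sum() for x in groups)
F = (ssb / (g - 1)) / (ssw / (N - g))   # MSB / MSW
p = stats.f.sf(F, g - 1, N - g)         # right-tail p-value

F_ref, p_ref = stats.f_oneway(*groups)  # scipy's one-way ANOVA
print(F, F_ref, p, p_ref)               # manual and library values match
```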

Proof Sketch

Under the Normal-equal-variance model, write each Y_{ij} = \mu + \tau_i + \epsilon_{ij} with the constraint \sum n_i\tau_i = 0. Apply Cochran's theorem to the orthogonal decomposition of the residual space: \text{SSW}/\sigma^2\sim\chi^2_{N-g}, and under the null, \text{SSB}/\sigma^2\sim\chi^2_{g-1}. The two are independent (orthogonal projections). The ratio (\text{SSB}/(g-1))/(\text{SSW}/(N-g)) is a ratio of two independent normalized Chi-squareds, hence F_{g-1, N-g} by the F construction theorem.

Why It Matters

ANOVA is the standard test for comparing three or more group means in agricultural, industrial, clinical, and behavioral research. The test has high power against any deviation from equality among group means, not just specific contrast directions. When the null is rejected, follow up with post-hoc pairwise comparisons (Tukey's HSD, Scheffe, Bonferroni) to localize which means differ; see p-hacking and multiple testing for the multiple-comparison correction discussion.

Failure Mode

The exact F distribution requires Normality, equal variances, and independence. Modest violations of Normality are tolerable in large samples by the central limit theorem; unequal variances are more damaging. Welch's ANOVA generalizes Welch's t-test to multiple groups and is the right alternative when equal variances are implausible. Independence violations (e.g., repeated measures on the same subject) require a mixed-effects model or a randomization-test approach.

Variance-Ratio Test

The F-distribution also drives the variance-ratio test for comparing two variances.

Theorem

F-Test for Equal Variances

Statement

Given independent Normal samples of sizes m and n, to test H_0:\sigma_X^2 = \sigma_Y^2, compute the ratio of sample variances F = S_X^2/S_Y^2. Under H_0 and the Normal assumption, F\sim F_{m-1, n-1} exactly. A two-sided test at level \alpha rejects when F < F_{m-1, n-1, \alpha/2} or F > F_{m-1, n-1, 1-\alpha/2}.

Intuition

Both (m-1)S_X^2/\sigma_X^2 and (n-1)S_Y^2/\sigma_Y^2 are Chi-squared with their respective degrees of freedom. Their ratio, normalized by degrees of freedom, is F-distributed. Under the null of equal variances the \sigma^2 cancels and the statistic is exactly F_{m-1, n-1}.

Proof Sketch

(m-1)S_X^2/\sigma_X^2\sim\chi^2_{m-1} and (n-1)S_Y^2/\sigma_Y^2\sim\chi^2_{n-1} independently. The ratio is \frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2} = \frac{[(m-1)S_X^2/\sigma_X^2]/(m-1)}{[(n-1)S_Y^2/\sigma_Y^2]/(n-1)}\sim F_{m-1, n-1}. Under the null \sigma_X^2 = \sigma_Y^2, this equals S_X^2/S_Y^2.
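The two-sided rejection rule can be sketched as follows (sample sizes, seed, and level are illustrative; data are drawn under the null here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
m, n = 10, 12                        # illustrative sample sizes
x = rng.normal(0.0, 1.0, size=m)
y = rng.normal(0.0, 1.0, size=n)     # same variance: null is true

F = x.var(ddof=1) / y.var(ddof=1)    # ratio of sample variances
alpha = 0.05
lo = stats.f.ppf(alpha / 2, m - 1, n - 1)
hi = stats.f.ppf(1 - alpha / 2, m - 1, n - 1)
reject = F < lo or F > hi
print(F, (lo, hi), reject)
```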

Why It Matters

The variance-ratio F-test is used as a preliminary check before deciding whether to use the equal-variance pooled t-test or the Welch t-test. In practice, the test has low power for small samples and high sensitivity to non-Normality; Levene's test (which uses absolute deviations from group medians) is the standard alternative under non-Normality and is preferred in modern statistical practice.

Failure Mode

The F-test for equal variances is notoriously sensitive to non-Normality, more so than the t-test for equal means. Even modest skewness or kurtosis substantially inflates the Type I error. Do not use this test as a "screening" decision for whether to pool variances; just use Welch's t-test directly, which avoids needing the equality test in the first place.

ANOVA versus Two-Sample t-Test

When g = 2, the one-way ANOVA F-test is equivalent to the two-sample equal-variance t-test in the following sense:

F_{1, N-2} = T_{N-2}^2,

that is, the F-statistic at g = 2 is the square of the pooled t-statistic, and the F-test rejection region \{F > F_{1, N-2, 1-\alpha}\} is exactly the t-test rejection region \{|T| > t_{N-2, 1-\alpha/2}\}. The two tests are algebraically identical for two groups.
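The equivalence can be verified directly with scipy (the two simulated samples are illustrative): the ANOVA F-statistic equals the squared pooled t-statistic, and the p-values coincide.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, size=9)
y = rng.normal(0.5, 1.0, size=11)

t, p_t = stats.ttest_ind(x, y, equal_var=True)  # pooled two-sample t
F, p_F = stats.f_oneway(x, y)                   # one-way ANOVA with g = 2
print(F, t ** 2, p_F, p_t)                      # F = t^2, identical p-values
```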

For g \ge 3, the ANOVA F-test handles all pairwise comparisons simultaneously with a single Type I error budget. Running \binom{g}{2} pairwise t-tests at level \alpha inflates the family-wise error rate to roughly 1 - (1-\alpha)^{\binom{g}{2}}, which is much larger than \alpha even for modest g. ANOVA avoids this by collapsing the comparison into a single test; multiple-comparison correction enters at the follow-up stage if the global test rejects.
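A few lines of arithmetic show how quickly the family-wise error rate grows under the independence approximation:

```python
from math import comb

alpha = 0.05
for g in (3, 5, 8):
    c = comb(g, 2)                   # number of pairwise tests
    fwer = 1 - (1 - alpha) ** c      # independence approximation
    print(f"g={g}: {c} tests, FWER ~ {fwer:.3f}")
```

Even at g = 3 the approximate family-wise rate is already near 0.14, nearly triple the nominal 0.05.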

ANOVA Table Layout

A standard one-way ANOVA report follows the schema:

| Source                 | Sum of squares | Degrees of freedom | Mean square       | F       |
|------------------------|----------------|--------------------|-------------------|---------|
| Between groups         | SSB            | g - 1              | MSB = SSB/(g - 1) | MSB/MSW |
| Within groups (error)  | SSW            | N - g              | MSW = SSW/(N - g) | --      |
| Total                  | SST            | N - 1              | --                | --      |

The "Total" row is included for completeness and serves as a check: SST = SSB + SSW by the variance decomposition theorem. The p-value is the area of F_{g-1, N-g} to the right of the observed F.
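The right-tail p-value is one line in scipy (the table values g = 4, N = 40, and observed F = 3.2 below are hypothetical):

```python
from scipy import stats

g, N = 4, 40
F_obs = 3.2
p = stats.f.sf(F_obs, g - 1, N - g)  # right-tail area of F_{3,36}
print(p)
```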

Common Confusions

Watch Out

ANOVA tests means, not variances

Despite the name "analysis of variance", the F-test tests equality of means under the assumption of equal variances. The variances are used as nuisance parameters in the test, not as the quantity under test. The test for equal variances is the variance-ratio F-test, a separate procedure.

Watch Out

A significant F does not say which groups differ

ANOVA rejects the null when at least two means differ, but it does not say which pair. Post-hoc tests (Tukey, Scheffe, Bonferroni-corrected pairwise t-tests) localize the differences with corrected error rates. The order of operations is: first test global equality with ANOVA; if rejected, perform corrected pairwise comparisons.

Watch Out

F-test is one-sided by construction

The F-statistic is always nonnegative; the test rejects in the upper tail only. Two-sided p-values for the F-test conflate the right-tail probability and a meaningless left-tail probability. Software reports the right-tail p-value; do not double it.

Watch Out

Welch ANOVA exists and is often preferable

The standard F-test assumes equal variances. Welch's ANOVA (also called Brown-Forsythe in a related form) uses a denominator that does not assume equal variances and is the right default when the equal-variance assumption is unsupported. R's oneway.test() defaults to Welch's ANOVA.

Exercises

ExerciseCore

Problem

Three teaching methods are compared with n_1 = n_2 = n_3 = 5 students per group. Group means are (80, 85, 78) and within-group sample variances are (16, 12, 14). Compute the F-statistic and conduct the test at level 0.05.

ExerciseCore

Problem

Two samples have S_X^2 = 4.5 with n_X = 10 and S_Y^2 = 1.5 with n_Y = 12. Test H_0:\sigma_X^2 = \sigma_Y^2 at level 0.05.

ExerciseAdvanced

Problem

Show that the F-statistic in a one-way ANOVA with g = 2 groups, sample sizes n_X and n_Y, equals the square of the pooled two-sample t-statistic.

ExerciseAdvanced

Problem

For a one-way ANOVA with g groups and N total observations, derive the noncentrality parameter of the F-statistic under the alternative H_1: \mu_i = \mu + \tau_i with \sum n_i\tau_i = 0, and explain how power increases with the noncentrality.

References

Canonical:

  • Casella and Berger, Statistical Inference (2002), Chapter 5 (sampling distribution of the F), Chapter 11 (linear models and the F-test as a likelihood-ratio test).
  • Lehmann and Romano, Testing Statistical Hypotheses (2005), Chapter 7 (UMP-invariant tests in the linear model).
  • Scheffé, The Analysis of Variance (1959), the classical reference for ANOVA theory and post-hoc methods.

Practical:

  • Box, Hunter, and Hunter, Statistics for Experimenters (2005), Chapter 5 (one-way ANOVA in industrial experiments).
  • Cochran and Cox, Experimental Designs (1957), Chapter 4 (ANOVA in randomized designs).

Alternatives without the equal-variance assumption:

  • Welch, "On the comparison of several mean values: An alternative approach" (Biometrika, 1951), Welch's ANOVA for unequal variances.
  • Brown and Forsythe, "The small sample behavior of some statistics which test the equality of several means" (Technometrics, 1974), ANOVA variants that drop the equal-variance assumption.

Last reviewed: May 11, 2026
