

Prasad-Rao MSE Correction

Why the naive Fay-Herriot MSE is too small once the area variance is estimated, and how the classical Prasad-Rao decomposition adds the missing second-order term.

Advanced · Tier 2 · Stable · ~45 min

Why This Matters

In the basic small area estimation story, the Fay-Herriot predictor looks clean:

\hat{\theta}_i = \gamma_i y_i + (1-\gamma_i) x_i^\top \hat{\beta}.

If the variance component A were known, this predictor's mean squared error would have a simple closed form. But in practice A is not known: it is estimated from the same data.

That one fact breaks the naive uncertainty formula. The predictor is now an EBLUP rather than a BLUP, and the missing uncertainty from estimating A is often large enough to matter. The classical Prasad-Rao result is the first standard correction that repairs this.

Mental Model

The EBLUP has three distinct sources of error:

  1. noise in the direct estimate itself
  2. error from estimating the regression coefficients
  3. error from estimating the variance component that controls shrinkage

The naive Fay-Herriot MSE keeps the first piece and usually part of the second. Prasad-Rao adds the third piece at the right asymptotic order.

Formal Setup

Definition

Fay-Herriot Model

The area-level Fay-Herriot model is

y_i = x_i^\top \beta + v_i + e_i,

with

v_i \sim N(0, A), \qquad e_i \sim N(0, D_i),

independent across areas, where D_i is treated as known. The target is

\theta_i = x_i^\top \beta + v_i.

Definition

EBLUP

If A were known, the BLUP of \theta_i would use the shrinkage factor

\gamma_i = \frac{A}{A + D_i}.

The empirical BLUP replaces A by an estimator \hat{A}:

\hat{\theta}_i^{\mathrm{EBLUP}} = \hat{\gamma}_i y_i + (1-\hat{\gamma}_i) x_i^\top \hat{\beta}, \qquad \hat{\gamma}_i = \frac{\hat{A}}{\hat{A} + D_i}.
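The EBLUP is short to compute once an estimate of A is in hand. A minimal NumPy sketch (the function name `eblup` and the array layout are my choices; \hat{\beta} is the GLS estimate under the plugged-in \hat{A}):

```python
import numpy as np

def eblup(y, X, D, A_hat):
    """Fay-Herriot EBLUP: plug an estimate of A into the BLUP formula.

    y : (m,) direct estimates; X : (m, p) covariates;
    D : (m,) known sampling variances; A_hat : estimated variance component.
    """
    w = 1.0 / (A_hat + D)                      # GLS weights 1/(A_hat + D_i)
    # GLS estimate of beta under the estimated variance component
    beta_hat = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    gamma = A_hat / (A_hat + D)                # estimated shrinkage factors
    return gamma * y + (1 - gamma) * (X @ beta_hat)
```

Each area estimate is a convex combination of the direct estimate y_i and the synthetic regression prediction, with weight \hat{\gamma}_i on the data.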

Definition

Prasad-Rao Moment Estimator

The classical Prasad-Rao estimator of A is

\hat{A}_{\mathrm{PR}} = \frac{y^\top (I - P_X) y - \operatorname{tr}((I - P_X) D)}{m-p},

where P_X = X(X^\top X)^{-1}X^\top, D = \operatorname{diag}(D_1,\ldots,D_m), m is the number of areas, and p is the number of regression coefficients.
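The moment estimator translates directly into code. A sketch under the formula above (the function name is mine; the truncation at zero is the standard practical convention, since the quadratic-form estimator can go negative):

```python
import numpy as np

def prasad_rao_A(y, X, D):
    """Prasad-Rao moment estimator of the variance component A.

    y : (m,) direct estimates; X : (m, p) covariates;
    D : (m,) known sampling variances D_i.
    """
    m, p = X.shape
    # M = I - P_X projects onto the orthogonal complement of col(X)
    M = np.eye(m) - X @ np.linalg.solve(X.T @ X, X.T)
    A_hat = (y @ M @ y - np.trace(M @ np.diag(D))) / (m - p)
    return max(A_hat, 0.0)  # truncate at zero, as is standard in practice
```

With many areas the estimator concentrates around the true A, but its sampling variability is exactly what the g_3 term below accounts for.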

Main Theorem

Theorem

Prasad-Rao Second-Order MSE Decomposition

Statement

For the Fay-Herriot EBLUP based on \hat{A}_{\mathrm{PR}},

\operatorname{MSE}\!\left(\hat{\theta}_i^{\mathrm{EBLUP}}\right) = g_{1i}(A) + g_{2i}(A) + g_{3i,\mathrm{PR}}(A) + o(m^{-1}),

where

g_{1i}(A) = \frac{A D_i}{A + D_i},

g_{2i}(A) = \left(\frac{D_i}{A + D_i}\right)^2 x_i^\top \left[\sum_{u=1}^m \frac{x_u x_u^\top}{A + D_u}\right]^{-1} x_i,

and

g_{3i,\mathrm{PR}}(A) = \frac{2 D_i^2}{(A + D_i)^3} \left[\frac{1}{m^2}\sum_{u=1}^m (A + D_u)^2\right].

The corresponding estimator

\widehat{\operatorname{MSE}}_{i,\mathrm{PR}} = g_{1i}(\hat{A}_{\mathrm{PR}}) + g_{2i}(\hat{A}_{\mathrm{PR}}) + 2 g_{3i,\mathrm{PR}}(\hat{A}_{\mathrm{PR}})

is second-order unbiased. The factor 2 on g_{3i,\mathrm{PR}} appears because plugging \hat{A}_{\mathrm{PR}} into g_{1i} introduces a downward bias that is itself of order g_{3i,\mathrm{PR}}; the extra copy cancels it.

Intuition

g_1 is the irreducible sampling part, g_2 is the penalty for estimating \beta, and g_3 is the price of not knowing A. The classical undercoverage problem comes from pretending g_3 is zero.
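The three g-terms translate line by line into code. A minimal sketch of the estimated MSE for one area (the function name `pr_mse` and NumPy layout are mine; it implements g_1 + g_2 + 2 g_3 evaluated at \hat{A}, the second-order unbiased estimator):

```python
import numpy as np

def pr_mse(A_hat, X, D, i):
    """Prasad-Rao second-order MSE estimate g1 + g2 + 2*g3 for area i.

    X : (m, p) covariates; D : (m,) known sampling variances;
    A_hat : estimated variance component (moment estimator).
    """
    m = X.shape[0]
    # g1: irreducible sampling part after shrinkage
    g1 = A_hat * D[i] / (A_hat + D[i])
    # g2: penalty for estimating beta by GLS
    V = sum(np.outer(X[u], X[u]) / (A_hat + D[u]) for u in range(m))
    g2 = (D[i] / (A_hat + D[i])) ** 2 * float(X[i] @ np.linalg.solve(V, X[i]))
    # g3: cost of estimating A with the moment estimator;
    # note (1/m^2) * sum(...) == mean(...) / m
    g3 = 2 * D[i] ** 2 / (A_hat + D[i]) ** 3 * np.mean((A_hat + D) ** 2) / m
    return g1 + g2 + 2 * g3
```

g_1 is O(1), while g_2 and g_3 are O(m^{-1}); the correction matters most when m is modest.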

Proof Sketch

Expand the EBLUP around the oracle BLUP that knows A. The first two terms of the expansion reproduce the familiar BLUP MSE pieces. A second-order Taylor expansion in \hat{A} - A, together with moment calculations for the Prasad-Rao estimator, yields the extra g_3 term. Plugging in \hat{A}_{\mathrm{PR}} preserves unbiasedness up to order m^{-1}.

Why It Matters

This theorem is why serious SAE work does not report the oracle BLUP variance after fitting an EBLUP. Published intervals should acknowledge that the shrinkage parameter itself was estimated.

Failure Mode

The theorem is asymptotic and model-dependent. It assumes the Fay-Herriot structure is correct and that the D_i are treated as known. It is also tied to the classical moment estimator of A; if A is estimated by ML or REML, the third correction term changes and should not be mislabeled as Prasad-Rao.

What the Three Terms Mean

| Term | What it measures | Vanishes if |
| --- | --- | --- |
| g_1 | Sampling noise after shrinkage | Never |
| g_2 | Uncertainty from estimating \beta | Regression part is known exactly |
| g_3 | Extra error from estimating A | A were known in advance |

That table is the page in one glance. The third row is the whole reason this result exists.

Canonical Example

Example

Why the naive interval is too narrow

Suppose a county poverty model uses a Fay-Herriot EBLUP with only a modest number of counties and a variance estimate \hat{A} that is itself somewhat unstable. The naive interval computes uncertainty as though that estimated \hat{A} were the true population variance. This ignores the fact that a slightly different estimate of A would change the shrinkage factor \gamma_i, and therefore move the area estimate itself.

Prasad-Rao corrects that omission. The point estimate does not change; the reported MSE does. That distinction matters operationally because official statistics release both estimates and measures of precision.
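To see how large the omission can be, one can plug toy numbers into the g-formulas (the numbers below are assumed for illustration, not taken from any real poverty model):

```python
import numpy as np

# Assumed toy setting: m = 15 areas, true A = 1,
# sampling variances D_i spread between 0.5 and 3.
m = 15
A = 1.0
D = np.linspace(0.5, 3.0, m)
X = np.ones((m, 1))  # intercept-only regression

V = sum(np.outer(X[u], X[u]) / (A + D[u]) for u in range(m))
for i in (0, m - 1):
    g1 = A * D[i] / (A + D[i])
    g2 = (D[i] / (A + D[i])) ** 2 * float(X[i] @ np.linalg.solve(V, X[i]))
    g3 = 2 * D[i] ** 2 / (A + D[i]) ** 3 * np.mean((A + D) ** 2) / m
    naive = g1 + g2            # pretends A is known
    corrected = g1 + g2 + 2 * g3
    print(f"D_i={D[i]:.2f}: naive={naive:.3f}, corrected={corrected:.3f}, "
          f"understatement={100 * (corrected / naive - 1):.1f}%")
```

The understatement is largest for areas with big D_i, where the shrinkage factor, and hence the sensitivity to \hat{A}, is most consequential.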

Relation to Later Corrections

The classical Prasad-Rao formula is not the end of the story.

  • If A is estimated by ML or REML, later work such as Datta-Lahiri modifies the third correction term.
  • If the model is semi-parametric or non-normal, more recent papers study how much of the second-order unbiasedness survives.
  • If benchmarking or other post-processing constraints are imposed, the MSE has to be corrected again.

So this page is the base case: the classical area-level correction, not the whole modern literature.

Common Confusions

Watch Out

Prasad-Rao changes the MSE, not the point estimate

The EBLUP itself is still the same plug-in predictor. The correction applies to the estimated uncertainty, not to the area estimate you publish.

Watch Out

Prasad-Rao is not exact finite-sample truth

It is a second-order approximation. That is much better than the naive formula, but it still depends on asymptotics and on the model being approximately right.

Watch Out

Do not call every EBLUP MSE formula Prasad-Rao

The classical formula is attached to a specific estimator of A. REML-based or ML-based corrections are related but not identical. Naming them carefully matters because the third term is estimator-specific.

Summary

  • The oracle BLUP MSE is too small once A is estimated
  • Prasad-Rao decomposes EBLUP MSE into g_1, g_2, and g_3
  • The extra g_3 term is the cost of estimating the shrinkage variance
  • The result is second-order, not exact
  • ML and REML versions require related but different corrections

Exercises

ExerciseCore

Problem

Why does an uncertainty formula that treats A as known usually understate the true EBLUP uncertainty?

ExerciseAdvanced

Problem

A paper estimates A by REML but reports g_1(\hat{A}) + g_2(\hat{A}) + 2 g_{3,\mathrm{PR}}(\hat{A}) and calls it a Prasad-Rao correction. What is the methodological problem?

References

Canonical:

  • Prasad and Rao, "The Estimation of the Mean Squared Error of Small-Area Estimators" (1990), JASA 85(409), 163-171. Original second-order correction.
  • Rao and Molina, Small Area Estimation, 2nd ed. (2015), Chapter 5. Standard book treatment of Fay-Herriot EBLUP and MSE approximation.
  • Datta and Lahiri, "A Unified Measure of Uncertainty of Estimated Best Linear Unbiased Predictors in Small Area Estimation Problems" (2000), Statistica Sinica 10, 613-627. ML and REML extensions of the correction.
  • Jiang and Lahiri, "Mixed Model Prediction and Small Area Estimation" (2006), TEST 15(1), 1-96. Review of prediction error corrections in mixed models.

Current / practice:

  • Chen, Lahiri, Rao, "Mean Squared Prediction Error Estimators of the Empirical Best Linear Unbiased Predictor of a Small Area Mean Under a Semi-Parametric Fay-Herriot Model" (2025), Survey Methodology. Modern robustness extension.
  • Chambers, Chandra, Tzavidis, "On Bias-Robust Mean Squared Error Estimation for Pseudo-Linear Small Area Estimators" (2011), Survey Methodology. Broader MSE-estimation perspective beyond the simplest linear case.


Last reviewed: April 18, 2026
