Delta Method
Asymptotic distribution of a smooth function of an estimator. If $\sqrt{n}\,(T_n - \mu) \xrightarrow{d} N(0, \sigma^2)$, then $\sqrt{n}\,(g(T_n) - g(\mu)) \xrightarrow{d} N(0, [g'(\mu)]^2 \sigma^2)$. The multivariate version uses the Jacobian; the second-order version handles vanishing derivatives. The page derives the result, works three canonical examples (variance of a log proportion, variance of a ratio of means, asymptotic variance of the sample correlation), and ties the construction to variance-stabilizing transformations.
Prerequisites
- Expectation, Variance, Covariance, and Moments
- Central Limit Theorem
- Modes of Convergence of Random Variables
- Asymptotic Statistics: M-Estimators, Delta Method, LAN
Why This Matters
Every standard error of a smooth statistic comes from the delta method. The variance of $\log \hat{p}$, the variance of a ratio of means $\bar{X}_n/\bar{Y}_n$, the asymptotic variance of the sample correlation, the standard error of a fitted odds ratio: each is one Taylor expansion away from a central-limit-theorem statement.
The result is short. If $\sqrt{n}\,(T_n - \mu) \xrightarrow{d} N(0, \sigma^2)$ and $g$ is differentiable at $\mu$ with $g'(\mu) \neq 0$, then $\sqrt{n}\,(g(T_n) - g(\mu)) \xrightarrow{d} N(0, [g'(\mu)]^2 \sigma^2)$. The proof is one Taylor expansion plus Slutsky's theorem. The applications are everywhere.
Univariate Statement
Delta Method (univariate)
Statement
Let $T_n$ be a sequence of random variables with $$\sqrt{n}\,(T_n - \mu) \xrightarrow{d} N(0, \sigma^2).$$ If $g$ is differentiable at $\mu$ and $g'(\mu) \neq 0$, then $$\sqrt{n}\,(g(T_n) - g(\mu)) \xrightarrow{d} N(0, [g'(\mu)]^2 \sigma^2).$$
Intuition
Near $\mu$, the smooth function $g$ is approximately linear with slope $g'(\mu)$. A linear transform of an approximately normal random variable is approximately normal, with variance multiplied by the square of the slope.
Proof Sketch
Write $g(T_n) = g(\mu) + g'(\mu)(T_n - \mu) + R_n$ with $R_n = o(|T_n - \mu|)$ by differentiability. Multiply by $\sqrt{n}$: $$\sqrt{n}\,(g(T_n) - g(\mu)) = g'(\mu)\,\sqrt{n}\,(T_n - \mu) + \sqrt{n}\,R_n.$$ The first term converges in distribution to $N(0, [g'(\mu)]^2 \sigma^2)$ by the continuous mapping theorem applied to multiplication by the constant $g'(\mu)$. For the remainder, $T_n - \mu = O_P(n^{-1/2})$, so $R_n = o_P(n^{-1/2})$ and $\sqrt{n}\,R_n = o_P(1)$. Slutsky's theorem absorbs the remainder.
Why It Matters
This single statement gives the standard error of any plug-in estimator that is a smooth function of a CLT-rate estimator. The pattern is: write the estimator as $g$ applied to a sample mean, identify $\mu$ and $\sigma^2$, compute $g'(\mu)$, and read off the asymptotic variance $[g'(\mu)]^2 \sigma^2 / n$.
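The pattern is mechanical enough to script. Below is a minimal numpy sketch (the helper name delta_se and the choice $g(x) = e^x$ are illustrative, not from any library): it computes the plug-in delta-method standard error of $\exp(\bar{X}_n)$ and checks it against a Monte Carlo estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

def delta_se(g_prime_at_mu_hat, sigma_hat, n):
    """First-order delta-method standard error: |g'(mu_hat)| * sigma_hat / sqrt(n)."""
    return abs(g_prime_at_mu_hat) * sigma_hat / np.sqrt(n)

n = 1_000
mu, sigma = 1.0, 0.5                      # population mean and sd of the X_i
x = rng.normal(mu, sigma, size=n)

# Estimator: g(X_bar) = exp(X_bar), so g'(x) = exp(x).
mu_hat, sigma_hat = x.mean(), x.std(ddof=1)
se_plugin = delta_se(np.exp(mu_hat), sigma_hat, n)

# Monte Carlo check: standard deviation of exp(X_bar) over many replications.
reps = np.exp(rng.normal(mu, sigma, size=(4_000, n)).mean(axis=1))
print(f"delta-method SE: {se_plugin:.5f}")
print(f"Monte Carlo  SE: {reps.std(ddof=1):.5f}")
```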
Failure Mode
The delta method fails or needs adjustment when $g'(\mu) = 0$ (use the second-order version below), when $g$ is not differentiable at $\mu$ (the limit may not be normal; e.g., $g(x) = |x|$ at $\mu = 0$ gives a folded normal), or when $T_n$ converges at a rate other than $\sqrt{n}$ (the same expansion holds but with that rate replacing $\sqrt{n}$).
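A quick simulation illustrates the non-differentiable failure mode (a sketch with normal data; all parameter choices are illustrative): $\sqrt{n}\,|\bar{X}_n|$ matches the folded normal $|N(0, \sigma^2)|$, not a normal.

```python
import numpy as np
from math import erf, sqrt, pi

rng = np.random.default_rng(1)

# g(x) = |x| is not differentiable at mu = 0, so the first-order delta method
# does not apply: sqrt(n)*|X_bar| converges to |N(0, sigma^2)|, a folded normal.
n, sigma, reps = 500, 1.0, 20_000
xbar = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)
scaled = np.sqrt(n) * np.abs(xbar)

print("mean of sqrt(n)|X_bar|   :", scaled.mean())
print("folded-normal mean       :", sigma * sqrt(2 / pi))   # ~0.798, not 0
print("empirical P(scaled <= 1) :", (scaled <= 1.0).mean())
print("P(|N(0,1)| <= 1)         :", erf(1 / sqrt(2)))       # ~0.683
```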
Multivariate Statement
Delta Method (multivariate)
Statement
Let $T_n \in \mathbb{R}^k$ satisfy $$\sqrt{n}\,(T_n - \mu) \xrightarrow{d} N_k(0, \Sigma),$$ and let $g : \mathbb{R}^k \to \mathbb{R}^m$ be differentiable at $\mu$ with Jacobian $J = \nabla g(\mu) \in \mathbb{R}^{m \times k}$. Then $$\sqrt{n}\,(g(T_n) - g(\mu)) \xrightarrow{d} N_m(0, J \Sigma J^\top).$$
Intuition
The Jacobian $J$ is the multivariate analog of $g'(\mu)$. The push-forward of a Gaussian $N(0, \Sigma)$ through a linear map $x \mapsto Jx$ is again Gaussian, with covariance $J \Sigma J^\top$. The delta method says: replace the nonlinear $g$ by its linearization $g(\mu) + J(x - \mu)$, then apply the push-forward rule.
Proof Sketch
Vector Taylor expansion: $g(T_n) = g(\mu) + J(T_n - \mu) + R_n$ with $\|R_n\| = o(\|T_n - \mu\|)$. Multiply by $\sqrt{n}$: $$\sqrt{n}\,(g(T_n) - g(\mu)) = J\,\sqrt{n}\,(T_n - \mu) + \sqrt{n}\,R_n.$$ The first term converges to $N_m(0, J \Sigma J^\top)$ because $J$ is a deterministic matrix. The remainder is $o_P(1)$ by the same $\sqrt{n}$-rate argument as in the univariate case. Slutsky finishes.
Why It Matters
The multivariate version is what makes the delta method useful in practice. Most interesting statistics are functions of multiple sample moments: a sample correlation is a function of three sample averages, a ratio is a function of two, a likelihood ratio is a function of many. Compute the Jacobian, sandwich the covariance, and you have the asymptotic variance.
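A minimal numerical sketch of the Jacobian-sandwich computation (numpy only; the finite-difference helper numerical_jacobian and the example $g(x, y) = x/y$ are illustrative, not from any library):

```python
import numpy as np

def numerical_jacobian(g, mu, eps=1e-6):
    """Forward-difference Jacobian of g: R^k -> R^m, evaluated at mu."""
    mu = np.asarray(mu, dtype=float)
    g0 = np.atleast_1d(g(mu))
    J = np.empty((g0.size, mu.size))
    for j in range(mu.size):
        step = np.zeros_like(mu)
        step[j] = eps
        J[:, j] = (np.atleast_1d(g(mu + step)) - g0) / eps
    return J

# Example: g(x, y) = x / y, the ratio of two means.
g = lambda t: t[0] / t[1]
mu = np.array([2.0, 4.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])

J = numerical_jacobian(g, mu)        # analytically (1/mu_y, -mu_x/mu_y^2) = (0.25, -0.125)
avar = (J @ Sigma @ J.T).item()      # asymptotic variance of sqrt(n)(g(T_n) - g(mu))
print("Jacobian:", J)
print("asymptotic variance:", avar)
```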
Failure Mode
If $J$ has a zero row, the corresponding component of $g(T_n)$ converges at a rate faster than $\sqrt{n}$ and its limit must be analyzed separately (second-order). If $J$ is rank-deficient, the limit normal is degenerate on a lower-dimensional subspace; the result still holds, but interpret with care.
Second-Order Version
Delta Method (second-order)
Statement
If $\sqrt{n}\,(T_n - \mu) \xrightarrow{d} N(0, \sigma^2)$, $g$ is twice differentiable at $\mu$, $g'(\mu) = 0$, and $g''(\mu) \neq 0$, then $$n\,(g(T_n) - g(\mu)) \xrightarrow{d} \tfrac{1}{2}\,\sigma^2\,g''(\mu)\,\chi^2_1.$$ The convergence rate is $n$, not $\sqrt{n}$, and the limit is a scaled chi-squared with one degree of freedom, not a normal.
Intuition
When the gradient vanishes, the linear term in the Taylor expansion is zero and the leading behavior is quadratic. Squaring a centered normal produces a $\chi^2_1$, and the rate doubles from $\sqrt{n}$ to $n$ because the squared deviation $(T_n - \mu)^2$ is of order $1/n$ rather than $1/\sqrt{n}$.
Proof Sketch
Taylor: $g(T_n) - g(\mu) = \tfrac{1}{2}\,g''(\mu)(T_n - \mu)^2 + o_P((T_n - \mu)^2)$, since $g'(\mu) = 0$. Multiply by $n$: $$n\,(g(T_n) - g(\mu)) = \tfrac{1}{2}\,g''(\mu)\,[\sqrt{n}\,(T_n - \mu)]^2 + o_P(1).$$ By the continuous mapping theorem applied to $x \mapsto x^2$, $[\sqrt{n}\,(T_n - \mu)]^2 \xrightarrow{d} \sigma^2 \chi^2_1$. Slutsky absorbs the $o_P(1)$.
Why It Matters
The second-order version is the right tool whenever the parameter sits at a critical point of the function being studied. The canonical example is variance estimation at the boundary: if $\bar{X}_n$ estimates $\mu$ and you study $g(\bar{X}_n) = \bar{X}_n^2$ at $\mu = 0$, the linear term vanishes and the limit is $n\,\bar{X}_n^2 \xrightarrow{d} \sigma^2 \chi^2_1$.
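A simulation sketch of the $\bar{X}_n^2$ example (normal data and $\sigma = 2$ are illustrative choices; the target moments follow from $\sigma^2 \chi^2_1$ having mean $\sigma^2$ and variance $2\sigma^4$):

```python
import numpy as np

rng = np.random.default_rng(2)

# g(x) = x^2 at mu = 0: g'(0) = 0, g''(0) = 2, so
# n * X_bar^2 -> (1/2) * sigma^2 * g''(0) * chi2_1 = sigma^2 * chi2_1.
n, sigma, reps = 500, 2.0, 20_000
xbar = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)
scaled = n * xbar**2

print("empirical mean of n*X_bar^2:", scaled.mean())   # chi2_1 mean 1 -> expect sigma^2 = 4
print("empirical var  of n*X_bar^2:", scaled.var())    # chi2_1 var  2 -> expect 2*sigma^4 = 32
```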
Failure Mode
If both $g'(\mu)$ and $g''(\mu)$ vanish, the rate accelerates further and the limit involves higher derivatives. If $g$ is only once differentiable at $\mu$, the second-order expansion does not exist and a different argument is needed.
Worked Example 1: Variance of a Log Sample Proportion
Let $\hat{p} = \frac{1}{n}\sum_{i=1}^n X_i$ where $X_i \sim \mathrm{Bernoulli}(p)$ independently with $0 < p < 1$. The CLT gives $$\sqrt{n}\,(\hat{p} - p) \xrightarrow{d} N(0, p(1-p)).$$ Take $g(x) = \log x$. Then $g'(p) = 1/p$. The univariate delta method gives $$\sqrt{n}\,(\log \hat{p} - \log p) \xrightarrow{d} N\!\left(0, \frac{p(1-p)}{p^2}\right) = N\!\left(0, \frac{1-p}{p}\right).$$ The asymptotic standard error of $\log \hat{p}$ is therefore $\sqrt{(1-p)/(np)}$. Notice that the variance is unbounded as $p \to 0$: estimating $\log p$ is unstable for rare events, which is exactly the regime where this expression is most often used.
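A Monte Carlo check of this standard error (a sketch; $p = 0.1$ and $n = 2000$ are illustrative choices for which $P(\hat{p} = 0)$ is negligible):

```python
import numpy as np

rng = np.random.default_rng(3)

p, n, reps = 0.1, 2_000, 50_000
phat = rng.binomial(n, p, size=reps) / n      # reps independent sample proportions
log_phat = np.log(phat)                       # P(phat = 0) = 0.9^2000, negligible here

se_delta = np.sqrt((1 - p) / (n * p))
print(f"delta-method SE of log(phat): {se_delta:.5f}")
print(f"Monte Carlo  SE of log(phat): {log_phat.std(ddof=1):.5f}")
```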
Worked Example 2: Ratio of Two Means
Suppose $(X_i, Y_i)$ are iid with $E[X_i] = \mu_X$, $E[Y_i] = \mu_Y \neq 0$, $\mathrm{Var}(X_i) = \sigma_X^2$, $\mathrm{Var}(Y_i) = \sigma_Y^2$, $\mathrm{Cov}(X_i, Y_i) = \sigma_{XY}$. The bivariate CLT gives $$\sqrt{n}\left(\begin{pmatrix}\bar{X}_n \\ \bar{Y}_n\end{pmatrix} - \begin{pmatrix}\mu_X \\ \mu_Y\end{pmatrix}\right) \xrightarrow{d} N_2\!\left(0, \begin{pmatrix}\sigma_X^2 & \sigma_{XY} \\ \sigma_{XY} & \sigma_Y^2\end{pmatrix}\right).$$ Let $g(x, y) = x/y$, so $R = g(\mu_X, \mu_Y) = \mu_X/\mu_Y$ and the gradient is $$\nabla g(\mu_X, \mu_Y) = \left(\frac{1}{\mu_Y},\ -\frac{\mu_X}{\mu_Y^2}\right).$$ The multivariate delta method gives $\sqrt{n}\,(\bar{X}_n/\bar{Y}_n - R) \xrightarrow{d} N(0, v)$, where $$v = R^2\left(\frac{\sigma_X^2}{\mu_X^2} - \frac{2\,\sigma_{XY}}{\mu_X \mu_Y} + \frac{\sigma_Y^2}{\mu_Y^2}\right).$$ The second factor is the squared-coefficient-of-variation combination for the ratio. This is the standard ratio-estimator variance used in survey sampling.
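A simulation sketch checking the formula for $v$ (bivariate normal data and the specific moment values are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)

mu = np.array([3.0, 5.0])                     # (mu_X, mu_Y)
Sigma = np.array([[1.0, 0.6],                 # [[sigma_X^2, sigma_XY],
                  [0.6, 2.0]])                #  [sigma_XY, sigma_Y^2]]
n, reps = 500, 4_000

R = mu[0] / mu[1]
v = R**2 * (Sigma[0, 0] / mu[0]**2
            - 2 * Sigma[0, 1] / (mu[0] * mu[1])
            + Sigma[1, 1] / mu[1]**2)

pairs = rng.multivariate_normal(mu, Sigma, size=(reps, n))      # shape (reps, n, 2)
ratios = pairs[..., 0].mean(axis=1) / pairs[..., 1].mean(axis=1)

print(f"delta-method asymptotic variance v: {v:.5f}")
print(f"Monte Carlo n*Var(ratio)          : {n * ratios.var(ddof=1):.5f}")
```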
Worked Example 3: Asymptotic Variance of the Sample Correlation
Let $(X_i, Y_i)$ be iid bivariate with finite fourth moments, $E[X_i] = E[Y_i] = 0$ (without loss of generality after centering), variances $\sigma_X^2, \sigma_Y^2$, and correlation $\rho$. The sample correlation is $$r_n = \frac{\frac{1}{n}\sum X_i Y_i}{\sqrt{\frac{1}{n}\sum X_i^2}\,\sqrt{\frac{1}{n}\sum Y_i^2}},$$ ignoring the asymptotically negligible centering terms. Define the vector of sample moments $T_n = \big(\frac{1}{n}\sum X_i Y_i,\ \frac{1}{n}\sum X_i^2,\ \frac{1}{n}\sum Y_i^2\big)$ with mean $\mu = (\sigma_{XY}, \sigma_X^2, \sigma_Y^2)$, where $\sigma_{XY} = \rho\,\sigma_X \sigma_Y$. Write $g(a, b, c) = a/\sqrt{bc}$, so $r_n = g(T_n)$ and $\rho = g(\mu)$.
Computing the partial derivatives at $\mu$ and applying the multivariate delta method to the joint CLT for $T_n$, under the standard assumption that $X_i$ and $Y_i$ are jointly normal, one finds $$\sqrt{n}\,(r_n - \rho) \xrightarrow{d} N(0, (1 - \rho^2)^2).$$ This explains Fisher's $z$-transformation: $z = g(\rho) = \tfrac{1}{2}\log\frac{1+\rho}{1-\rho} = \operatorname{artanh}(\rho)$ has $g'(\rho) = 1/(1 - \rho^2)$, which cancels the $(1 - \rho^2)$ factor in the asymptotic standard error and yields a limit variance of $1$. The Fisher transformation is the variance-stabilizing transformation for the sample correlation under bivariate normality. See variance-stabilizing transformations for the construction.
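A simulation sketch of both limits (bivariate normal data; the sample size and the $\rho$ grid are illustrative): $n\,\mathrm{Var}(r_n)$ should track $(1 - \rho^2)^2$, while $n\,\mathrm{Var}(\operatorname{artanh} r_n)$ should stay near $1$ for every $\rho$.

```python
import numpy as np

rng = np.random.default_rng(5)

def sample_corrs(rho, n, reps, rng):
    """Sample correlations of `reps` bivariate-normal samples of size n."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    xy = rng.multivariate_normal([0.0, 0.0], cov, size=(reps, n))
    xc = xy[..., 0] - xy[..., 0].mean(axis=1, keepdims=True)
    yc = xy[..., 1] - xy[..., 1].mean(axis=1, keepdims=True)
    return (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))

n, reps = 500, 4_000
for rho in (0.0, 0.5, 0.9):
    r = sample_corrs(rho, n, reps, rng)
    print(f"rho={rho:.1f}: n*Var(r)={n * r.var(ddof=1):.3f} "
          f"(target {(1 - rho**2)**2:.3f}), "
          f"n*Var(arctanh r)={n * np.arctanh(r).var(ddof=1):.3f} (target 1)")
```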
Tie to Variance-Stabilizing Transformations
The delta method gives $\sqrt{n}\,(g(T_n) - g(\theta)) \xrightarrow{d} N(0, [g'(\theta)]^2 \sigma^2(\theta))$. If $\sigma^2(\theta)$ depends on $\theta$ (as it does for Poisson, Binomial proportions, and many other one-parameter families), the variance of the raw statistic varies with $\theta$. Picking $g$ so that $[g'(\theta)]^2 \sigma^2(\theta)$ is constant in $\theta$ removes the dependence. Solving the ODE $g'(\theta) = c/\sigma(\theta)$ gives $g(\theta) = c \int \frac{d\theta}{\sigma(\theta)}$. The Poisson square-root transform, the binomial arcsin-square-root transform, and the Fisher correlation transform all come from this construction.
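A quick check of the Poisson case (a sketch; the $\lambda$ grid is illustrative): $\mathrm{Var}(X)$ grows like $\lambda$, while $\mathrm{Var}(\sqrt{X})$ settles near the delta-method value $1/4$.

```python
import numpy as np

rng = np.random.default_rng(6)

# For X ~ Poisson(lam): sigma^2(lam) = lam, so the ODE g'(lam) = c / sqrt(lam)
# gives g(lam) = sqrt(lam) (taking c = 1/2), and Var(sqrt(X)) ~ 1/4 for large lam.
for lam in (5, 20, 100, 500):
    x = rng.poisson(lam, size=200_000)
    print(f"lam={lam:4d}: Var(X)={x.var():9.2f}   Var(sqrt(X))={np.sqrt(x).var():.4f}")
```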
Common Confusions
The delta method is about variance, not bias
The first-order delta method gives the asymptotic distribution of $g(T_n)$ around $g(\mu)$, not the exact mean of $g(T_n)$. By Jensen's inequality, $E[g(T_n)] \neq g(E[T_n])$ in general; the gap is of order $1/n$ and shows up in second-order bias corrections (Edgeworth expansion territory), not in the leading-order normality statement.
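A simulation sketch of the bias order ($g = \exp$ with normal data is an illustrative choice where the $1/n$ prediction $g''(\mu)\sigma^2/(2n)$ can be checked directly; $\bar{X}_n$ is sampled from its exact law to keep the code short):

```python
import numpy as np

rng = np.random.default_rng(7)

# Bias of g(X_bar) for g = exp: E[exp(X_bar)] - exp(mu) is O(1/n),
# approximately g''(mu) * sigma^2 / (2n) = exp(mu) * sigma^2 / (2n).
mu, sigma, reps = 0.0, 1.0, 400_000
for n in (25, 100, 400):
    xbar = rng.normal(mu, sigma / np.sqrt(n), size=reps)   # exact law of X_bar for normal data
    bias = np.exp(xbar).mean() - np.exp(mu)
    pred = np.exp(mu) * sigma**2 / (2 * n)
    print(f"n={n:4d}: empirical bias={bias:.5f}   1/n prediction={pred:.5f}")
```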
Vanishing derivative changes the rate, not just the variance
If $g'(\mu) = 0$, do not just set the asymptotic variance to zero and call it done. The leading behavior becomes quadratic, the rate is $n$ instead of $\sqrt{n}$, and the limit is chi-squared, not normal. Apply the second-order version.
The CLT rate is not always sqrt n
Some estimators converge faster (e.g., the MLE of the endpoint of a $\mathrm{Uniform}(0, \theta)$ distribution converges at rate $n$) or slower (e.g., nonparametric density estimation at a point converges at rate $n^{2/5}$ under standard smoothness assumptions). Use whatever rate the underlying limit theorem gives, not a reflex $\sqrt{n}$.
Plug-in standard errors use g prime of mu hat, not g prime of mu
In practice $\mu$ is unknown and the asymptotic variance is estimated by $[g'(\hat{\mu})]^2 \hat{\sigma}^2$. This is consistent under continuity of $g'$ at $\mu$ and the convergences $\hat{\mu} \xrightarrow{P} \mu$ and $\hat{\sigma}^2 \xrightarrow{P} \sigma^2$, both standard. The substitution is valid because Slutsky lets you replace consistent estimators inside convergence-in-distribution statements.
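A sketch of the plug-in recipe on one simulated dataset (the numbers are illustrative): compute $g'(\hat{\mu})$, form the standard error, and build a Wald interval.

```python
import numpy as np

rng = np.random.default_rng(8)

n, p_true = 400, 0.3
x = rng.binomial(1, p_true, size=n)           # one observed Bernoulli sample
p_hat = x.mean()

# g(p) = log p, g'(p) = 1/p; plug p_hat in for p everywhere.
se_hat = np.sqrt((1 - p_hat) / (n * p_hat))
lo, hi = np.log(p_hat) - 1.96 * se_hat, np.log(p_hat) + 1.96 * se_hat
print(f"log(p_hat) = {np.log(p_hat):.4f}, plug-in SE = {se_hat:.4f}")
print(f"95% Wald CI for log p: ({lo:.4f}, {hi:.4f})   [true log p = {np.log(p_true):.4f}]")
```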
Exercises
Problem
Let independently. The MLE of is . Compute the asymptotic distribution of using the delta method.
Problem
Let $\hat{p}$ be the sample proportion in $n$ independent Bernoulli($p$) trials. Find the asymptotic distribution of $\hat{p}(1 - \hat{p})$, the plug-in estimator of the Bernoulli variance.
Problem
Derive the asymptotic distribution of for iid pairs with mean , , and joint covariance matrix .
References
Canonical:
- Casella and Berger, Statistical Inference (2002), 2nd edition, Sections 5.5.4 and 10.1.6
- van der Vaart, Asymptotic Statistics (1998), Chapter 3
- Lehmann and Romano, Testing Statistical Hypotheses (2005), 3rd edition, Sections 11.2 and 14.1
Applications and variance-stabilizing transformations:
- Bickel and Doksum, Mathematical Statistics: Basic Ideas and Selected Topics, Volume I (2015), 2nd edition, Section 5.3
- Cox and Hinkley, Theoretical Statistics (1974), Section 9.2
- Efron and Tibshirani, An Introduction to the Bootstrap (1993), Chapter 4 (parametric standard errors and the bootstrap alternative)
Next Topics
- Variance-stabilizing transformations: how the delta-method ODE gives the Poisson square-root, binomial arcsin, and Fisher correlation transforms.
- Maximum likelihood estimation: standard errors of MLEs are delta-method consequences of the score-function CLT.
- Bootstrap methods: an alternative when analytic delta-method variances are intractable.
Last reviewed: May 12, 2026