Thin-Plate Splines
Smoothing splines in two and higher dimensions. Penalize integrated squared second-derivative magnitude across the surface; the minimizer is a sum of radial basis functions plus a low-degree polynomial. Green-Silverman 1994 and Wahba 1990 are the canonical references.
Why This Matters
Smoothing splines in one dimension penalize $\int f''(x)^2\,dx$ and have a clean natural-cubic-spline minimizer. Generalizing to dimension $d \ge 2$ runs into a problem: the "second derivative" of a function on $\mathbb{R}^d$ is a $d \times d$ matrix (the Hessian), and there is no single natural scalar to put inside the integral.
Thin-plate splines pick a specific scalar: the squared Frobenius norm of the Hessian, integrated over the input space,
$$J(f) = \int_{\mathbb{R}^d} \|\nabla^2 f(x)\|_F^2\,dx.$$
For $d = 2$ this is the "bending energy" of a thin metal plate deflecting under loads at the data points, hence the name. Duchon (1977) showed that the minimizer of $\sum_{i=1}^n (y_i - f(x_i))^2 + \lambda J(f)$ over an appropriate Beppo Levi space is a finite sum of radial basis functions plus a low-degree polynomial. The representation has $n$ free parameters once the side conditions are imposed, just like the univariate smoothing spline.
ESL 2nd ed. §5.7 (pp. 162-167) introduces thin-plate splines as the canonical multidimensional smoother. Green and Silverman (1994) Ch 7 develops the full theory; Wahba (1990) Ch 2 gives the RKHS view.
Quick Version
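| Object | Form |
|---|---|
| Penalty in $\mathbb{R}^2$ | $J(f) = \iint \left( f_{x_1 x_1}^2 + 2 f_{x_1 x_2}^2 + f_{x_2 x_2}^2 \right) dx_1\,dx_2$ |
| Radial basis function | $\eta(r) = r^2 \log r$ for $d = 2$, $m = 2$; $r^{2m-d}$ or $r^{2m-d} \log r$ for higher $d$ with appropriate $m$ |
| Solution | $f(x) = \beta_0 + \beta^\top x + \sum_{i=1}^n \alpha_i\,\eta(\lVert x - x_i \rVert)$ |
| Side conditions | $\sum_{i=1}^n \alpha_i\,p(x_i) = 0$ for each polynomial $p$ in the null space |
| Linear system size | $(n + 3) \times (n + 3)$ |
| Null space ($d = 2$, $m = 2$) | constants, $x_1$, $x_2$ (dimension 3) |
| Optimal $\lambda$ | $\lambda \asymp n^{-2m/(2m+d)}$ when the residual sum of squares is scaled by $1/n$; rate $n^{-2m/(2m+d)}$ in MSE |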
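The function $\eta(r) = r^2 \log r$ is, up to a constant factor, the fundamental solution of the biharmonic equation in $\mathbb{R}^2$ (Green's function for $\Delta^2$). The basis function $\eta(\|x - x_i\|)$ at each $x_i$ is the response of a thin plate to a point load at $x_i$.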
Formal Setup
Bending Energy in 2D
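For a function $f : \mathbb{R}^2 \to \mathbb{R}$ with square-integrable second derivatives, the bending energy is
$$J(f) = \iint_{\mathbb{R}^2} \left[ \left(\frac{\partial^2 f}{\partial x_1^2}\right)^2 + 2\left(\frac{\partial^2 f}{\partial x_1\,\partial x_2}\right)^2 + \left(\frac{\partial^2 f}{\partial x_2^2}\right)^2 \right] dx_1\,dx_2.$$
$J$ is invariant under rotations and translations of the coordinate system: rotating the input by $R$ replaces the Hessian $H$ with $R^\top H R$, which leaves the Frobenius norm unchanged. $J(f) = 0$ if and only if $f$ is affine, namely $f(x) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$.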
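The invariance claim can be spot-checked numerically. The sketch below is illustrative only: it draws an arbitrary symmetric matrix to stand in for a Hessian (the matrix, the rotation angle, and all variable names are assumptions, not part of the text) and compares Frobenius norms before and after the change of variables $x \mapsto Rx$, under which the Hessian becomes $R^\top H R$.

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary symmetric 2x2 matrix standing in for a Hessian at one point.
B = rng.normal(size=(2, 2))
H = B + B.T

# A rotation of the input plane by an arbitrary angle.
theta = 0.73
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Hessian of g(x) = f(R x) at the matching point.
H_rot = R.T @ H @ R

frob = np.linalg.norm(H, "fro")
frob_rot = np.linalg.norm(H_rot, "fro")

# The integrand of the bending energy is the squared Frobenius norm.
integrand = H[0, 0] ** 2 + 2 * H[0, 1] ** 2 + H[1, 1] ** 2

print(frob, frob_rot, integrand, frob ** 2)
```

The two norms agree to rounding error, and the three-term integrand equals the squared Frobenius norm, which is why the penalty does not depend on the orientation of the axes.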
Thin-Plate Spline
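Given data $\{(x_i, y_i)\}_{i=1}^n$ with $x_i \in \mathbb{R}^2$, $y_i \in \mathbb{R}$, and a smoothing parameter $\lambda > 0$, the thin-plate spline is the minimizer of
$$\sum_{i=1}^n (y_i - f(x_i))^2 + \lambda J(f)$$
over functions in the Beppo Levi space modulo the affine null space.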
The Beppo Levi space is the natural domain: it identifies functions that differ by an affine function, since $J$ does not see affine perturbations. The penalty alone determines the minimizer only up to the affine null space; the data-fit term pins the affine part down, provided the $x_i$ are not all collinear.
The Representer Theorem
Thin-Plate Spline Representation (Duchon 1977)
Statement
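The minimizer of $\sum_{i=1}^n (y_i - f(x_i))^2 + \lambda J_m(f)$ over the appropriate Beppo Levi space, where $J_m$ penalizes all $m$-th order partial derivatives ($m = 2$ gives the bending energy) and $2m > d$, has the form
$$f(x) = \sum_{j=1}^{M} \beta_j\,\phi_j(x) + \sum_{i=1}^{n} \alpha_i\,\eta(\|x - x_i\|),$$
where $\{\phi_j\}_{j=1}^{M}$ is a basis for the null space of $J_m$ (polynomials of degree $< m$) and $\eta$ is, up to a constant factor, the fundamental solution of the iterated Laplacian $\Delta^m$: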
- $d = 2$, $m = 2$: $\eta(r) = r^2 \log r$
- $d = 3$, $m = 2$: $\eta(r) = r$
- $d = 1$, $m = 2$: $\eta(r) = r^3$, recovering the univariate cubic smoothing spline (with sign adjustments)
- general $d$ and $m$ with $2m > d$ and $2m - d$ not even: $\eta(r) = r^{2m-d}$
- general $d$ and $m$ with $2m - d$ even and nonnegative: $\eta(r) = r^{2m-d} \log r$.
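The coefficients $(\alpha, \beta)$ solve a linear system of size $(n + M) \times (n + M)$, where $M$ is the dimension of the null space. The side conditions $\sum_{i=1}^n \alpha_i\,\phi_j(x_i) = 0$ for each $\phi_j$ in the null-space basis ensure that the radial-basis part has no polynomial component to absorb.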
Intuition
The penalty has a null space (the polynomial part) and a positive part (everything else). On the positive part, the penalty defines an RKHS norm, and the kernel is the Green's function of the differential operator $\Delta^m$ (the biharmonic operator $\Delta^2$ when $m = 2$). The representer theorem in this RKHS then gives the radial basis expansion. The null-space polynomials add separately because the penalty does not penalize them.
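The function $\eta(r) = r^2 \log r$ in 2D is the response of a thin plate to a point load: it vanishes at the load point, its second derivatives have a logarithmic singularity there, and far from the load it grows like $r^2 \log r$, slightly faster than quadratically. In the smoothing-spline solution, the coefficient $\alpha_i$ at each data point is the strength of the load needed to produce the fitted deflection.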
Why It Matters
This is the natural generalization of the univariate cubic smoothing spline that respects rotation invariance and gives a tractable finite-dimensional solution. The representation extends to higher dimensions and to higher penalty orders ($m > 2$) without any new ideas. The main practical issue is that the linear system is dense (no banding), so the naive cost is $O(n^3)$. Low-rank approximations bring this down to $O(nk^2)$ for some moderate rank $k \ll n$.
Failure Mode
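Three failure modes. (i) $2m \le d$: the penalty has insufficient smoothness to produce a well-defined function, because point evaluation is no longer continuous. For $d = 4$ you need $m \ge 3$. (ii) Repeated $x_i$: the interpolation system ($\lambda = 0$) is exactly singular, and the smoothing system is badly conditioned for small $\lambda$. Pre-merge replicates. (iii) Inputs near a low-dimensional manifold: the radial basis matrix becomes near-singular and the coefficients blow up; in the extreme case of exactly collinear inputs in 2D the polynomial block loses rank and the system is singular for every $\lambda$. The numerical cure is either reduced-rank thin plates (Wood, 2003) or explicit regularization on $\alpha$.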
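Failure modes (ii) and (iii) can be seen directly in the rank of the bordered matrix $\begin{pmatrix} K + \lambda I & T \\ T^\top & 0 \end{pmatrix}$. The sketch below is a minimal illustration for $d = 2$, $m = 2$; the helper name `tps_system`, the random sites, and the convention that any kernel normalization is absorbed into `lam` are assumptions made for the example.

```python
import numpy as np

def tps_system(X, lam):
    """Bordered (n+3) x (n+3) matrix for the 2D thin-plate spline (hypothetical helper)."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    safe = np.where(D > 0, D, 1.0)       # log(1) = 0, so eta(0) = 0
    K = safe ** 2 * np.log(safe)         # eta(r) = r^2 log r
    T = np.hstack([np.ones((n, 1)), X])  # affine null-space basis: 1, x1, x2
    return np.block([[K + lam * np.eye(n), T],
                     [T.T, np.zeros((3, 3))]])

rng = np.random.default_rng(0)
X = rng.uniform(size=(12, 2))                      # distinct, non-collinear sites
X_dup = np.vstack([X, X[:1]])                      # first site repeated
t = np.linspace(0.0, 1.0, 12)
X_line = np.column_stack([t, 2.0 * t])             # exactly collinear sites

rank_ok = np.linalg.matrix_rank(tps_system(X, 0.0))          # full rank expected
rank_dup_interp = np.linalg.matrix_rank(tps_system(X_dup, 0.0))
rank_dup_smooth = np.linalg.matrix_rank(tps_system(X_dup, 0.1))
rank_line = np.linalg.matrix_rank(tps_system(X_line, 0.1))

print(rank_ok, rank_dup_interp, rank_dup_smooth, rank_line)
```

With a repeated site the matrix has two identical rows at $\lambda = 0$ and loses rank; a positive $\lambda$ restores full rank. With collinear sites the columns of $T$ are linearly dependent, so no value of $\lambda$ helps.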
Optional Proof: Why $r^2 \log r$ is the right radial basis in 2D
Green and Silverman (1994) Ch 7 and Wahba (1990) Ch 2 work this out.
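The biharmonic operator $\Delta^2 = \left(\partial_{x_1}^2 + \partial_{x_2}^2\right)^2$ in 2D acts on smooth functions. Its fundamental solution $E$ satisfies $\Delta^2 E = \delta_0$ (the Dirac mass at the origin) in the distributional sense. By rotation invariance, $E(x) = \eta(r)$ for $r = \|x\|$, and for $r > 0$ the equation reduces to an ODE:
$$\left(\frac{d^2}{dr^2} + \frac{1}{r}\frac{d}{dr}\right)^2 \eta(r) = 0.$$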
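Solving with the ansatz $\eta(r) = r^k$ gives $k^2 (k-2)^2 = 0$, so $k = 0$ and $k = 2$ are double roots and the homogeneous solutions are $1$, $\log r$, $r^2$, and $r^2 \log r$. The terms $1$ and $r^2$ are biharmonic on the whole plane and contribute no source, while $\Delta^2 \log r = 2\pi\,\Delta\delta_0$ is more singular than a point mass. Discarding these and matching the delta-function source, using $\Delta(r^2 \log r) = 4 \log r + 4$ and $\Delta \log r = 2\pi\delta_0$, gives $\eta(r) = \frac{1}{8\pi} r^2 \log r$ as the fundamental solution. The normalization constant gets absorbed into the coefficients in the representer theorem; the functional form is the structural point.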
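The statement that $r^2 \log r$ is biharmonic away from the origin can be spot-checked with finite differences. The sketch below uses a five-point stencil at one arbitrary point; the function names, the evaluation point, and the step size are choices made for the illustration, and it relies on the identity $\Delta(r^2 \log r) = 4 \log r + 4$, which is a short hand computation.

```python
import math

def eta(x, y):
    """eta = r^2 log r, written as 0.5 * r^2 * log(r^2) to avoid a square root."""
    r2 = x * x + y * y
    return 0.5 * r2 * math.log(r2)

def lap_eta_exact(x, y):
    """Closed form of the Laplacian of r^2 log r for r > 0."""
    return 4.0 * math.log(math.hypot(x, y)) + 4.0

def laplacian_fd(g, x, y, h=1e-3):
    """Five-point finite-difference Laplacian of g at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4.0 * g(x, y)) / (h * h)

x0, y0 = 0.7, -0.4                                   # any point away from the origin
lap_numeric = laplacian_fd(eta, x0, y0)              # should match lap_eta_exact
bilap_numeric = laplacian_fd(lap_eta_exact, x0, y0)  # should be close to zero

print(lap_numeric, lap_eta_exact(x0, y0), bilap_numeric)
```

The check says nothing about the behavior at the origin, where the point mass lives; that part needs the distributional argument.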
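The penalty can be written as a quadratic form in the coefficients $\alpha$ (after using the representer theorem and integrating by parts):
$$J(f) = c\,\alpha^\top K \alpha,$$
where $K_{ij} = \eta(\|x_i - x_j\|)$, $\alpha$ satisfies the side conditions, and $c$ is the normalization constant of the fundamental solution ($c = 8\pi$ for $\eta(r) = r^2 \log r$ in 2D). This is the basis-matrix-as-Gram-matrix identity that makes the representer theorem operational.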
Implementation Notes
The straightforward implementation:
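- Build the $n \times n$ matrix $K$ with $K_{ij} = \eta(\|x_i - x_j\|)$.
- Build the $n \times M$ null-space matrix $T$ with rows $(\phi_1(x_i), \dots, \phi_M(x_i))$, where $\{\phi_j\}$ is a basis for $\operatorname{null}(J)$.
- Solve the saddle-point system $\begin{pmatrix} K + \lambda I & T \\ T^\top & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} y \\ 0 \end{pmatrix}$, with the normalization constant of $\eta$ absorbed into $\lambda$.
The cost is $O(n^3)$ for the dense linear solve.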
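The three steps can be sketched in NumPy for the case $d = 2$, $m = 2$. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `tps_kernel`, `fit_tps`, and `predict_tps` are hypothetical, the kernel is taken as $r^2 \log r$ with its normalizing constant absorbed into `lam`, and the dense solve is only sensible for small $n$.

```python
import numpy as np

def tps_kernel(r):
    """eta(r) = r^2 log r, with eta(0) = 0 by continuity."""
    safe = np.where(r > 0, r, 1.0)   # log(1) = 0 handles r = 0
    return safe ** 2 * np.log(safe)

def pairwise_dist(A, B):
    """Euclidean distances between rows of A and rows of B."""
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def fit_tps(X, y, lam):
    """Solve the (n+3) x (n+3) saddle-point system for (alpha, beta)."""
    n = X.shape[0]
    K = tps_kernel(pairwise_dist(X, X))
    T = np.hstack([np.ones((n, 1)), X])            # rows (1, x_i1, x_i2)
    A = np.block([[K + lam * np.eye(n), T],
                  [T.T, np.zeros((3, 3))]])
    rhs = np.concatenate([y, np.zeros(3)])         # side conditions T^T alpha = 0
    sol = np.linalg.solve(A, rhs)
    return sol[:n], sol[n:]

def predict_tps(X_new, X, alpha, beta):
    """Evaluate the radial part plus the affine part at new inputs."""
    return tps_kernel(pairwise_dist(X_new, X)) @ alpha + beta[0] + X_new @ beta[1:]

rng = np.random.default_rng(0)
X = rng.uniform(size=(30, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
alpha, beta = fit_tps(X, y, lam=1e-3)
print(predict_tps(np.array([[0.5, 0.5]]), X, alpha, beta))
```

Two sanity checks follow from the theory: data that are exactly affine should be reproduced with $\alpha = 0$ for any $\lambda$, and $\lambda = 0$ should interpolate distinct, non-collinear sites.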
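Low-rank approximation (Wood, 2003). Replace the radial-basis matrix $K$ with its rank-$k$ approximation $K_k = U_k D_k U_k^\top$ built from the $k$ eigenvectors whose eigenvalues are largest in absolute value. The resulting "thin-plate regression spline" has $k$ parameters instead of $n$ and solves in $O(nk^2)$ once the truncated eigendecomposition is available. This is the default in `mgcv::s(x1, x2, bs = "tp")` in R. For typical applications a moderate $k \ll n$ gives accuracy indistinguishable from the full thin-plate solution.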
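Smoothing parameter. GCV from smoothing splines applies directly; the smoother matrix $S_\lambda$ is dense rather than banded, but its trace can be computed from the eigenvalues $\mu_j$ of $Z^\top K Z$, where $K$ is the radial-basis matrix and the columns of $Z$ form an orthonormal basis for the vectors satisfying the side conditions: $\operatorname{tr}(S_\lambda) = M + \sum_j \mu_j / (\mu_j + \lambda)$, with $M$ the null-space dimension.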
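One way to organize the GCV computation is to eigendecompose the projected kernel matrix once and reuse it for every candidate $\lambda$. The sketch below assumes $d = 2$, $m = 2$, the system $(K + \lambda I)\alpha + T\beta = y$, $T^\top \alpha = 0$, and the usual score $\mathrm{GCV}(\lambda) = n\,\mathrm{RSS} / (n - \operatorname{tr} S_\lambda)^2$; the function name `gcv_curve` and the synthetic data are hypothetical.

```python
import numpy as np

def gcv_curve(X, y, lams):
    """Return (lam, effective degrees of freedom, GCV score) for each lam > 0."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    safe = np.where(D > 0, D, 1.0)
    K = safe ** 2 * np.log(safe)                 # eta(r) = r^2 log r
    T = np.hstack([np.ones((n, 1)), X])

    # Columns of Z span the vectors alpha with T^T alpha = 0.
    Q, _ = np.linalg.qr(T, mode="complete")
    Z = Q[:, 3:]

    # One eigendecomposition serves every lam.
    mu, V = np.linalg.eigh(Z.T @ K @ Z)
    w = V.T @ (Z.T @ y)

    results = []
    for lam in lams:
        shrink = lam / (mu + lam)                # residual factor per eigen-direction
        edf = n - shrink.sum()                   # equals 3 + sum(mu / (mu + lam))
        rss = np.sum((shrink * w) ** 2)
        results.append((lam, edf, n * rss / (n - edf) ** 2))
    return results

rng = np.random.default_rng(1)
X = rng.uniform(size=(25, 2))
y = np.sin(4 * X[:, 0]) + 0.1 * rng.normal(size=25)
curve = gcv_curve(X, y, [1e-4, 1e-2, 1.0, 1e6])
best_lam = min(curve, key=lambda row: row[2])[0]
print(curve, best_lam)
```

After the initial $O(n^3)$ factorization, each additional $\lambda$ costs $O(n)$, so a fine grid search is cheap. The grid should exclude $\lambda = 0$, where both the residual and $n - \operatorname{tr} S_\lambda$ vanish.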
Higher Dimensions
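For $d = 3$ with $m = 2$, the fundamental solution of $\Delta^2$ is proportional to $r$ (linear in the radius). For $d = 4$ it is proportional to $\log r$, and for $d = 5$ to $r^{-1}$; both are singular at the origin, which is why $m = 2$ is not admissible there and $2m > d$ is required. The general formula for $2m > d$ is
$$\eta_{m,d}(r) \propto \begin{cases} r^{2m-d} \log r, & d \text{ even}, \\ r^{2m-d}, & d \text{ odd}, \end{cases}$$
with appropriate sign conventions. The null space is polynomials of degree $< m$ in $d$ variables, dimension $M = \binom{m+d-1}{d}$.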
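For high $d$ the requirement $2m > d$ forces $m$ to grow with $d$, so the null-space dimension $\binom{m+d-1}{d}$ grows rapidly, and the estimator inherits the curse of dimensionality: for fixed smoothness $m$ the MSE rate $n^{-2m/(2m+d)}$ degrades sharply with $d$. In more than a handful of dimensions the estimator is largely useless without further structure (additivity, sparsity, low intrinsic dimension).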
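The bookkeeping for general $(m, d)$ is small enough to write down. The sketch below is illustrative: the helper names are hypothetical, the kernel is returned without its sign and normalizing constant, and the choice $m = \lfloor d/2 \rfloor + 1$ is simply the smallest order satisfying $2m > d$.

```python
import math

def null_space_dim(m, d):
    """Number of monomials of total degree < m in d variables."""
    return math.comb(m + d - 1, d)

def tps_radial(r, m, d):
    """Unnormalized radial basis: r^(2m-d) log r for even d, r^(2m-d) for odd d."""
    if 2 * m <= d:
        raise ValueError("need 2m > d for a well-defined thin-plate spline")
    if r == 0:
        return 0.0
    power = r ** (2 * m - d)
    return power * math.log(r) if d % 2 == 0 else power

def smallest_admissible_m(d):
    """Smallest penalty order with 2m > d."""
    return d // 2 + 1

for d in range(1, 11):
    m = smallest_admissible_m(d)
    print(d, m, null_space_dim(m, d))
```

The loop shows how quickly the unpenalized polynomial part grows once $m$ is tied to $d$, which is one concrete face of the curse of dimensionality described above.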
Canonical Example
A geological surface from sparse measurements
Imagine elevation measurements $y_i$ at irregularly placed survey points $x_i \in \mathbb{R}^2$ across a region. Fit a thin-plate spline with $m = 2$ to produce a smooth surface.
| $\lambda$ | Visual outcome |
|---|---|
| very small | interpolation; passes through every survey point, wild oscillation in between |
| GCV-optimal | smooth surface; survey points slightly off the surface but visibly the right shape |
| very large | best-fit affine plane; loses topography |
The GCV-optimal fit recovers the main ridge structure cleanly. The fitted surface has a small bending energy $J(\hat f)$ when measured in the physical units of the survey, which is interpretable as "the surface is mostly flat with some moderate curvature near the ridge". The same data fit by ordinary kriging with a Matérn covariance gives a visually similar surface; the thin-plate spline is the limit of kriging with an improper "intrinsically stationary" prior.
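The two extremes of the table can be reproduced on synthetic data. The sketch below assumes SciPy 1.7 or later, whose `scipy.interpolate.RBFInterpolator` accepts `kernel="thin_plate_spline"` and a `smoothing` argument that plays the role of $\lambda$; the "survey" is simulated (a single ridge plus noise on the unit square), and the particular smoothing values are arbitrary choices, not GCV-selected.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(0)

# Simulated survey: 60 irregular sites on the unit square, one ridge along x1 = 0.5.
X = rng.uniform(size=(60, 2))
z = (np.exp(-((X[:, 0] - 0.5) ** 2) / 0.02) + 0.3 * X[:, 1]
     + 0.05 * rng.normal(size=60))

interp = RBFInterpolator(X, z, kernel="thin_plate_spline", smoothing=0.0)
smooth = RBFInterpolator(X, z, kernel="thin_plate_spline", smoothing=1e-3)
plane = RBFInterpolator(X, z, kernel="thin_plate_spline", smoothing=1e6)

# Ordinary least-squares affine plane for comparison with the large-smoothing fit.
T = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(T, z, rcond=None)
plane_ls = T @ coef

rss = [float(np.sum((model(X) - z) ** 2)) for model in (interp, smooth, plane)]
print(rss, float(np.max(np.abs(plane(X) - plane_ls))))
```

The residual sum of squares increases with the smoothing value, from essentially zero at interpolation to the least-squares plane residual at the other end. In practice the middle value would be chosen by GCV or a similar criterion rather than fixed by hand.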
Common Confusions
Thin-plate splines are not kernel ridge regression with a fixed kernel
The radial basis $\eta(r) = r^2 \log r$ is the Green's function of the differential operator, not a Mercer kernel. It is conditionally positive definite but not positive definite outright: the matrix $K_{ij} = \eta(\|x_i - x_j\|)$ has negative eigenvalues, and only its restriction to coefficient vectors satisfying the side conditions is positive definite. The representer theorem still applies because the null-space part is added separately; the result is not a clean kernel ridge regression but the structure is analogous. ESL 2nd ed. p. 165 makes this distinction.
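Conditional positive definiteness is easy to observe numerically. The sketch below is a small illustration on random 2D sites (the sites and variable names are assumptions): it compares the spectrum of the raw matrix $K$ with the spectrum of $Z^\top K Z$, where the columns of $Z$ span the coefficient vectors that are orthogonal to the affine functions evaluated at the sites.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 2))
n = X.shape[0]

# Raw thin-plate kernel matrix; the diagonal is zero because eta(0) = 0.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
safe = np.where(D > 0, D, 1.0)
K = safe ** 2 * np.log(safe)

# Affine null-space matrix and an orthonormal basis Z for {alpha : T^T alpha = 0}.
T = np.hstack([np.ones((n, 1)), X])
Q, _ = np.linalg.qr(T, mode="complete")
Z = Q[:, 3:]

eig_K = np.linalg.eigvalsh(K)               # mixed signs: trace(K) = 0
eig_proj = np.linalg.eigvalsh(Z.T @ K @ Z)  # all positive on the constrained subspace

print(eig_K.min(), eig_K.max(), eig_proj.min())
```

Because the trace of $K$ is zero, its eigenvalues must have mixed signs, so $K$ cannot serve as a Gram matrix on its own. The projected matrix is positive definite for distinct, non-collinear sites, which is what makes the constrained problem well posed.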
The bending-energy interpretation is for d = 2 specifically
"Thin-plate" is the $d = 2$, $m = 2$ case where the penalty equals the elastic energy of a deflected metal plate. The generalization to other $(d, m)$ is the same machinery but the physical interpretation breaks down. Use "thin-plate spline" loosely for any radial basis built on a Green's function of $\Delta^m$ in $\mathbb{R}^d$; the canonical case is $d = 2$, $m = 2$.
Tensor-product splines are a different choice in higher dimensions
Thin-plate splines are isotropic: rotation-invariant. Tensor-product splines build a basis as a product of one-dimensional B-spline bases along each coordinate. They are anisotropic and have $q^d$ basis functions for $q$ knots per dimension. Thin-plate is the right choice when the data has no preferred direction; tensor-product is the right choice when the coordinates are heterogeneous (one is time, another is a spatial dimension, say) or when you want per-coordinate degrees of freedom.
Exercises
Problem
Verify that $J(f) = 0$ for $f(x) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$. Hence confirm that the null space of $J$ in 2D has dimension 3.
Problem
Show that the side condition $\sum_{i=1}^n \alpha_i\,p(x_i) = 0$ for each polynomial $p$ in the null space follows from the requirement that the radial-basis part of $f$ has no polynomial component to absorb. Equivalently, show that the representer theorem's representation is unique modulo the null space.
Problem
For $d = 2$, $m = 2$ with $n = N^2$ measurements on a regular grid of side $N$, the linear system has a condition number that grows with $N$. Estimate the rate and propose a preconditioner.
References
Canonical:
- Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models. Chapman and Hall. Ch 7 "Thin Plate Splines in Two Dimensions". The textbook treatment with full derivations.
- Wahba, G. (1990). Spline Models for Observational Data. SIAM. Ch 2 "More General Reproducing Kernel Hilbert Spaces", Ch 3 "Equivalence and Perpendicularity, or, What's So Special About Splines?" The RKHS / Bayesian view.
- Hastie, Tibshirani, Friedman. The Elements of Statistical Learning, 2nd ed. Springer (2009). Ch 5 "Basis Expansions and Regularization", §5.7 "Multidimensional Splines" (pp. 162-167). Concise statistical-learning summary.
Foundational:
- Duchon, J. (1977). "Splines Minimizing Rotation-Invariant Semi-Norms in Sobolev Spaces." In Constructive Theory of Functions of Several Variables, Lecture Notes in Mathematics 571, Springer, 85-100. The original construction and the proof of the representer theorem in the Beppo Levi setting.
- Meinguet, J. (1979). "Multivariate Interpolation at Arbitrary Points Made Simple." Journal of Applied Mathematics and Physics (ZAMP) 30(2), 292-304. The Green's-function derivation of the radial basis.
Low-rank and computation:
- Wood, S. N. (2003). "Thin Plate Regression Splines." Journal of the Royal Statistical Society B 65(1), 95-114. The reduced-rank approximation used in `mgcv` and most modern implementations.
- Wendland, H. (2004). Scattered Data Approximation. Cambridge. Numerical analysis of radial basis function methods.
Bayesian / geostatistical connection:
- Cressie, N. (1993). Statistics for Spatial Data. Wiley. Thin-plate splines as kriging with an intrinsic stationary prior.
Next Topics
- Smoothing splines: the univariate predecessor; thin-plate is the generalization.
- B-splines: the alternative basis for tensor-product multidimensional splines.
- Gaussian process regression: the Bayesian counterpart; thin-plate splines are the posterior mean under an improper "bending energy" prior.
- Generalized additive models: per-coordinate smoothers as an alternative to thin-plate when interactions are not the target.
Last reviewed: May 13, 2026