Skip to main content

Optimization Function Classes

B-Splines

A numerically stable basis for piecewise polynomials, defined by de Boor's recurrence. Local support, partition-of-unity, banded design matrices, and why every numerical spline implementation uses B-splines rather than the truncated-power basis.

AdvancedAdvancedTier 1StableSupporting~50 min
For:MLStatsResearch

Why This Matters

A piecewise polynomial of degree kk with KK interior knots has K+k+1K + k + 1 degrees of freedom. The natural way to parametrize it is the truncated-power basis {1,x,x2,,xk,(xτ1)+k,,(xτK)+k}\{1, x, x^2, \ldots, x^k, (x - \tau_1)_+^k, \ldots, (x - \tau_K)_+^k\}, where (t)+=max(t,0)(t)_+ = \max(t, 0). This basis is mathematically clean and numerically terrible: the columns of the design matrix are highly correlated (the powers 1,x,x2,x31, x, x^2, x^3 are all nearly collinear on any interval, and the truncated powers add little independent information), the condition number scales exponentially in kk, and at k=10k = 10 the ordinary least squares system is unsolvable in double precision.

B-splines solve all of these problems with a different basis for the same function space. Each B-spline basis function Bj,k(x)B_{j,k}(x) has support on only k+1k + 1 adjacent intervals between knots. The design matrix becomes banded with bandwidth k+1k + 1. The condition number is bounded by a constant depending only on the knot ratios, not on the knot count. Every numerical spline implementation (R's splines::bs, scipy's BSpline, the COBS package, every CAD/CAM system) uses B-splines.

ESL 2nd ed. Appendix to Ch 5 (pp. 186-189) develops the B-spline construction. The canonical book-length treatment is de Boor's A Practical Guide to Splines (2001 revised edition).

Quick Version

PropertyValue
Basis size at degree kk with KK interior knotsK+k+1K + k + 1
Support of Bj,kB_{j, k}[τj,τj+k+1][\tau_j, \tau_{j + k + 1}] (k+1k + 1 intervals)
Partition of unityjBj,k(x)=1\sum_j B_{j, k}(x) = 1 for xx in interior of knot range
NonnegativityBj,k(x)0B_{j, k}(x) \geq 0
ContinuityCk1C^{k-1} at simple knots; CkrC^{k - r} at knots of multiplicity rr
de Boor recurrenceBj,k(x)=xτjτj+kτjBj,k1(x)+τj+k+1xτj+k+1τj+1Bj+1,k1(x)B_{j, k}(x) = \frac{x - \tau_j}{\tau_{j+k} - \tau_j} B_{j, k-1}(x) + \frac{\tau_{j+k+1} - x}{\tau_{j+k+1} - \tau_{j+1}} B_{j+1, k-1}(x)
Design matrix bandwidthk+1k + 1
Condition numberO(1)O(1) in knot count

The recurrence is the source of every numerical algorithm in the family: evaluation, differentiation, knot insertion, knot removal. It is exact in finite-precision arithmetic and free of cancellation.

Formal Setup

Definition

B-Spline of Degree 0

Given an ordered knot sequence τ0<τ1<<τn\tau_0 < \tau_1 < \cdots < \tau_n, the degree-0 B-spline at index jj is the indicator of the interval [τj,τj+1)[\tau_j, \tau_{j+1}): Bj,0(x)=1[τj,τj+1)(x).B_{j, 0}(x) = \mathbf{1}_{[\tau_j, \tau_{j+1})}(x). The set {Bj,0}j=0n1\{B_{j, 0}\}_{j=0}^{n-1} partitions the knot range into disjoint indicators.

Definition

B-Spline of Degree k by de Boor Recurrence

For k1k \geq 1, define Bj,k(x)=xτjτj+kτjBj,k1(x)+τj+k+1xτj+k+1τj+1Bj+1,k1(x).B_{j, k}(x) = \frac{x - \tau_j}{\tau_{j+k} - \tau_j} B_{j, k-1}(x) + \frac{\tau_{j+k+1} - x}{\tau_{j+k+1} - \tau_{j+1}} B_{j+1, k-1}(x). When the knot sequence has KK interior knots plus k+1k + 1 repeated boundary knots at each end (typical), the basis {Bj,k}j=0K+k\{B_{j, k}\}_{j=0}^{K+k} spans the space of piecewise polynomials of degree kk that are Ck1C^{k-1} at every interior knot.

The recurrence is the operational definition. The closed-form B-spline involves divided differences of the truncated-power functions; the recurrence implements the divided-difference computation without catastrophic cancellation.

Properties

Proposition

Partition of Unity for B-Splines

Statement

For xx in the support of the B-spline basis (the interior of the knot range after the boundary repetitions), jBj,k(x)=1.\sum_{j} B_{j, k}(x) = 1. Each Bj,kB_{j, k} is nonnegative.

Intuition

At degree 0 the basis functions are indicator functions of disjoint intervals, so their sum equals 1 inside the knot range. The recurrence preserves the partition-of-unity property because the two convex coefficients in Bj,k(x)B_{j, k}(x) add up to 1 for xx inside the relevant intervals (a routine check).

Partition of unity has a practical consequence: a constant function 11 is exactly representable as jcjBj,k\sum_j c_j B_{j, k} with cj=1c_j = 1. A linear function xx is representable as j(some affine function of τj)Bj,k\sum_j (\text{some affine function of } \tau_j) B_{j, k} because B-splines reproduce polynomials up to degree kk exactly when the control points are placed at the Greville abscissae τj=(τj+1++τj+k)/k\tau^*_j = (\tau_{j+1} + \cdots + \tau_{j+k})/k.

Why It Matters

The partition-of-unity property is what makes B-spline fitting numerically benign. The basis functions are interpolants in disguise: the coefficient cjc_j in cjBj,k\sum c_j B_{j, k} is approximately equal to f(τj)f(\tau^*_j) for smooth ff, so the system matrix is close to the identity rather than to a near-singular Vandermonde matrix.

Failure Mode

Partition of unity fails outside the boundary repetition padding. If the boundary knots are not repeated to multiplicity k+1k + 1, the basis at the boundary is incomplete and jBj,k(x)<1\sum_j B_{j, k}(x) < 1 near the endpoints. The fix is the boundary-padding convention: replicate the boundary knot kk extra times so the basis is complete throughout the knot range.

Optional Proofde Boor's evaluation algorithmShow

ESL 2nd ed. pp. 186-187 and de Boor (2001) Ch IX give the algorithm. To evaluate s(x)=jcjBj,k(x)s(x) = \sum_j c_j B_{j, k}(x) at a query point xx in the interval [τ,τ+1)[\tau_\ell, \tau_{\ell + 1}):

  1. Initialize dj[0]=cjd_j^{[0]} = c_j for j=k,,j = \ell - k, \ldots, \ell.
  2. For r=1,2,,kr = 1, 2, \ldots, k, compute for j=k+r,,j = \ell - k + r, \ldots, \ell: dj[r]=(1αj[r])dj1[r1]+αj[r]dj[r1],αj[r]=xτjτj+kr+1τj.d_j^{[r]} = (1 - \alpha_j^{[r]}) \, d_{j-1}^{[r-1]} + \alpha_j^{[r]} \, d_j^{[r-1]}, \quad \alpha_j^{[r]} = \frac{x - \tau_j}{\tau_{j+k-r+1} - \tau_j}.
  3. Return d[k]=s(x)d_\ell^{[k]} = s(x).

The algorithm is O(k2)O(k^2) regardless of the knot count, evaluates without explicit basis functions, and is numerically stable: every αj[r][0,1]\alpha_j^{[r]} \in [0, 1] in finite-precision arithmetic, so each step is a convex combination and introduces no cancellation.

The same recursion gives derivatives: s(x)=jcj[1]Bj,k1(x)s'(x) = \sum_{j} c_j^{[1]} B_{j, k-1}(x) where cj[1]=k(cjcj1)/(τj+kτj)c_j^{[1]} = k (c_j - c_{j-1}) / (\tau_{j+k} - \tau_j). Higher derivatives chain naturally.

Regression Splines

Use the B-spline basis to fit a piecewise polynomial of degree kk to data (X1,Y1),,(Xn,Yn)(X_1, Y_1), \ldots, (X_n, Y_n): c^=argmincRK+k+1i=1n(YijcjBj,k(Xi))2.\hat{c} = \arg\min_{c \in \mathbb{R}^{K + k + 1}} \sum_{i=1}^n \left(Y_i - \sum_j c_j B_{j, k}(X_i)\right)^2. This is ordinary least squares with design matrix B\boldsymbol{B} having entries Bij=Bj,k(Xi)B_{ij} = B_{j, k}(X_i). The design matrix is banded (each row has at most k+1k + 1 nonzero entries). Normal equations BBc^=BY\boldsymbol{B}^\top \boldsymbol{B} \, \hat{c} = \boldsymbol{B}^\top \boldsymbol{Y} have a banded coefficient matrix and solve in O(nk2)O(n k^2).

Knot placement. The classic recipes:

  • Equal spacing on quantiles of XX: ESL's default. Places knots so that roughly equal numbers of observations fall between adjacent knots.
  • Equal spacing on XX: simpler, fine when the input distribution is roughly uniform.
  • Adaptive (MARS, BARS): choose knots by forward stagewise or Bayesian search. See MARS.

The knot count controls model complexity. AIC, BIC, or cross-validation select it. A common heuristic: start with K=4K = 4 to 77 and add knots until the residual sum of squares stops improving by a meaningful margin.

Why Truncated Powers Fail

The truncated-power basis at degree 3 with knots τ1<<τK\tau_1 < \cdots < \tau_K is {1,x,x2,x3,(xτ1)+3,,(xτK)+3}.\{1, x, x^2, x^3, (x - \tau_1)_+^3, \ldots, (x - \tau_K)_+^3\}. The four powers 1,x,x2,x31, x, x^2, x^3 are nearly linearly dependent on any bounded interval after standardization. The Vandermonde-like structure of BB\boldsymbol{B}^\top \boldsymbol{B} has condition number exponential in degree. At degree 5 with 10 knots, the condition number exceeds 101210^{12} in double precision and the linear solve returns nonsense.

B-splines fix this by replacing {1,x,x2,x3}\{1, x, x^2, x^3\} (a highly correlated local-redundancy basis) with {Bj,3}j=03+K\{B_{j, 3}\}_{j=0}^{3+K} (a local-support basis with constant condition number). Same function space, vastly better numerics. ESL 2nd ed. pp. 186-189 makes this explicit.

Implementation Notes

The standard implementations.

  • R: splines::bs(x, df = ...) constructs the B-spline design matrix with quantile knot placement. The companion splines::ns(x, df = ...) produces a natural cubic spline basis (B-splines with boundary linear-extension constraints). mgcv::s() provides smoothing splines via reduced-rank B-spline bases.
  • Python: scipy.interpolate.BSpline for general B-splines; patsy.dmatrix("bs(x, df=...)") mirrors R's formula syntax.

For numerically demanding work (high kk, many knots), scipy.interpolate implements the de Boor algorithm directly. The pybind-ed FORTRAN version in splines runs at memory-bandwidth-bound speed.

Boundary knot repetition is the most common implementation pitfall: a careful library handles it automatically, but rolling your own basis construction requires explicit padding to multiplicity k+1k + 1 at each boundary. Without that the basis is rank-deficient near the endpoints.

Canonical Example

Example

Cubic regression spline on a sinusoid

Generate n=200n = 200 from XiUniform([0,2π])X_i \sim \mathrm{Uniform}([0, 2\pi]), Yi=sin(Xi)+0.2εiY_i = \sin(X_i) + 0.2 \, \varepsilon_i. Fit a cubic regression spline with KK interior knots placed at equal quantiles.

KKdf\mathrm{df} (K+k+1K + k + 1)residual SDshape
260.32smooth; underfit at peaks
590.22close to optimal
15190.21tracks noise; slight overfit
50540.20obvious overfit

5-fold CV picks K=5K = 5 or 66. AIC and BIC pick similar values. The B-spline design matrix at K=50K = 50 has 50+4=5450 + 4 = 54 columns; the linear solve still takes a few milliseconds because of the bandedness.

The same fit with a truncated-power basis at K=50K = 50: condition number 1014\approx 10^{14}, OLS fails, ridge with a small penalty is required to make the system invertible. The fit is visually identical to the B-spline fit (same function space), but the numerics are an order of magnitude worse.

Common Confusions

Watch Out

B-splines are a basis, not a fitting method

"Fit a B-spline" is shorthand for "fit a piecewise polynomial regression using the B-spline basis". The basis choice is orthogonal to the estimator (OLS, ridge, lasso, smoothing penalty). Smoothing splines and regression splines both use B-splines; they differ in the penalty, not the basis.

Watch Out

The Greville abscissae are not knots

The knots τj\tau_j define the basis. The Greville abscissae τj=(τj+1++τj+k)/k\tau^*_j = (\tau_{j+1} + \cdots + \tau_{j+k})/k are weighted averages of kk consecutive knots and live "where the basis function lives" approximately. They are useful for control-point interpolation in graphics but they are not the breakpoints of the spline.

Watch Out

Knot multiplicity controls smoothness, not interpolation accuracy

Increasing the multiplicity of a knot τj\tau_j from 11 to rr reduces smoothness at τj\tau_j from Ck1C^{k-1} to CkrC^{k-r}. The function space becomes strictly larger; the design matrix gains r1r - 1 extra columns near τj\tau_j. Use multiplicity to introduce intentional kinks or jumps; do not use it to "improve" a smooth fit.

Exercises

ExerciseCore

Problem

Verify the de Boor recurrence for k=1k = 1: starting from Bj,0(x)=1[τj,τj+1)(x)B_{j, 0}(x) = \mathbf{1}_{[\tau_j, \tau_{j+1})}(x), compute Bj,1(x)B_{j, 1}(x) explicitly and confirm it is the standard linear-tent basis function peaking at τj+1\tau_{j+1}.

ExerciseAdvanced

Problem

Show that for a uniform knot sequence τj=j\tau_j = j and degree kk, the B-spline B0,k(x)B_{0, k}(x) equals the (k+1)(k+1)-fold convolution of the indicator 1[0,1)\mathbf{1}_{[0, 1)} with itself. Hence relate B-splines to the densities of sums of independent uniform random variables.

ExerciseResearch

Problem

Construct a B-spline basis on a sphere S2S^2 with locally adaptive knot density. Discuss the obstructions: there is no canonical "knot sequence" on a manifold, no canonical boundary, and the partition-of-unity property must be enforced by hand via a finite cover and a smooth partition.

References

Canonical:

  • de Boor, C. (2001). A Practical Guide to Splines (revised ed.). Springer Applied Mathematical Sciences, vol. 27. The definitive reference; covers evaluation, knot insertion, B-spline arithmetic, and rigorous proofs.
  • Hastie, Tibshirani, Friedman. The Elements of Statistical Learning, 2nd ed. Springer (2009). Ch 5 "Basis Expansions and Regularization", §5.2 "Piecewise Polynomials and Splines" (pp. 141-148), Appendix on B-splines (pp. 186-189). The statistical-learning view.
  • Green, P. J. and Silverman, B. W. (1994). Nonparametric Regression and Generalized Linear Models. Chapman and Hall. Connects B-splines to the smoothing-spline penalty framework.

Foundational:

  • Schoenberg, I. J. (1946). "Contributions to the Problem of Approximation of Equidistant Data by Analytic Functions." Quarterly of Applied Mathematics 4, 45-99, 112-141. Original B-spline construction (for uniform knots).
  • Curry, H. B. and Schoenberg, I. J. (1966). "On Pólya Frequency Functions IV: The Fundamental Spline Functions and Their Limits." Journal d'Analyse Mathématique 17, 71-107. Non-uniform-knot extension.

Numerical analysis:

  • Lyche, T. and Mørken, K. (2008). Spline Methods. University of Oslo lecture notes. Modern numerical treatment with linear-algebra emphasis.
  • Piegl, L. and Tiller, W. (1997). The NURBS Book (2nd ed.). Springer. B-splines in the CAD/CAM setting; the standard reference for non-uniform rational B-splines and knot insertion.

Next Topics

  • Smoothing splines: penalized regression with B-spline basis and a roughness penalty.
  • Thin-plate splines: the multivariate generalization with a radial basis representation.
  • Generalized additive models: use B-splines as per-coordinate basis functions in a backfitting framework.
  • MARS: adaptive knot selection on B-spline-like bases.

Last reviewed: May 13, 2026

Canonical graph

Required before and derived from this topic

These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.

Required prerequisites

3

Derived topics

0

No published topic currently declares this as a prerequisite.