
Numerical Optimization

Interior Point Methods

Barrier functions transform constrained optimization into a sequence of smooth unconstrained problems. Newton steps on the barrier objective trace the central path to the constrained optimum with polynomial convergence.


Why This Matters

Many optimization problems in ML and statistics involve constraints: positive-definiteness of covariance matrices (semidefinite programs), probability simplex constraints, box constraints on parameters. The simplex method for linear programming walks along the boundary of the feasible region and can take exponentially many steps in the worst case. Interior point methods take a different approach: they stay strictly inside the feasible region, using a barrier function to repel iterates from constraint boundaries. The result is polynomial-time convergence for LPs, quadratic programs, and semidefinite programs. Interior point methods are the standard solver backend for large-scale convex optimization.

Mental Model

Imagine trying to minimize a function inside a room without touching the walls. A barrier function adds a penalty that goes to infinity as you approach any wall. The penalty keeps you away from the boundaries. By gradually reducing the strength of the penalty, you allow the iterates to approach the boundary (where the constrained optimum typically lies) in a controlled manner. Newton's method on the penalized objective gives fast local convergence at each penalty level.

Formal Setup

Consider the constrained convex optimization problem:

\min_{x} f_0(x) \quad \text{subject to} \quad f_i(x) \leq 0, \quad i = 1, \ldots, m

where f_0, f_1, \ldots, f_m are convex and twice differentiable.

Definition

Logarithmic Barrier Function

The logarithmic barrier for the inequality constraints is:

\phi(x) = -\sum_{i=1}^{m} \log(-f_i(x))

This function is defined only for strictly feasible points (where f_i(x) < 0 for all i). It approaches +\infty as any constraint approaches equality. The gradient \nabla \phi(x) = \sum_{i=1}^{m} \frac{1}{-f_i(x)} \nabla f_i(x) pushes the iterate away from constraint boundaries.
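As a concrete sketch (assuming NumPy, and specializing to linear constraints A x \leq b purely for illustration), the barrier value and its gradient can be computed directly from the slacks:

```python
import numpy as np

def log_barrier(A, b, x):
    """phi(x) = -sum_i log(b_i - a_i^T x) for the constraints A x <= b."""
    s = b - A @ x                  # slacks; strictly positive in the interior
    if np.any(s <= 0):
        return np.inf, None        # barrier is undefined outside the interior
    return -np.sum(np.log(s)), A.T @ (1.0 / s)   # value, gradient

# box constraints 0 <= x <= 4 written as A x <= b
A = np.array([[-1.0], [1.0]])
b = np.array([0.0, 4.0])
val, grad = log_barrier(A, b, np.array([2.0]))
# at the midpoint both slacks are equal, so the gradient vanishes
```

The gradient formula A.T @ (1.0 / s) is exactly \sum_i \frac{1}{-f_i(x)} \nabla f_i(x) for linear f_i.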

Definition

Barrier Problem

For a parameter t > 0, the barrier problem is:

\min_{x} \; t \cdot f_0(x) + \phi(x)

As t \to \infty, the barrier term \phi(x) becomes negligible relative to t \cdot f_0(x), and the solution of the barrier problem approaches the solution of the original constrained problem.

Definition

Central Path

The central path is the curve of solutions \{x^*(t) : t > 0\}, where x^*(t) is the minimizer of t \cdot f_0(x) + \phi(x). As t increases, x^*(t) traces a smooth path from the analytic center of the feasible region (the limit as t \to 0) toward the constrained optimum (as t \to \infty).
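To make the central path tangible, consider a tiny 1D instance chosen for illustration: minimize x subject to 0 \leq x \leq 4. Setting the gradient of t \cdot x - \log x - \log(4 - x) to zero gives t - 1/x + 1/(4 - x) = 0, which reduces to a quadratic, so x^*(t) has a closed form:

```python
import numpy as np

# central path for: minimize x subject to 0 <= x <= 4
# centering condition t - 1/x + 1/(4 - x) = 0
# multiplies out to t*x**2 - (4*t + 2)*x + 4 = 0
def central_point(t):
    bq = 4.0 * t + 2.0
    return (bq - np.sqrt(bq * bq - 16.0 * t)) / (2.0 * t)   # root in (0, 2]

for t in [0.01, 1.0, 100.0]:
    print(t, central_point(t))
```

As t grows, x^*(t) slides from near the analytic center x = 2 toward the constrained optimum x = 0, exactly the behavior the definition describes.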

The Barrier Method Algorithm

The barrier method repeatedly solves the barrier problem for increasing values of tt:

  1. Initialize with a strictly feasible point x^{(0)}, a parameter t > 0, and a growth factor \mu > 1
  2. Centering step: starting from the current iterate, run Newton's method on t \cdot f_0(x) + \phi(x) until convergence to x^*(t)
  3. Update: set t \leftarrow \mu \cdot t
  4. Stop when m/t \leq \epsilon (duality gap bound)

Each centering step is an unconstrained minimization (Newton's method on a smooth, self-concordant function). The outer loop increases t, tightening the approximation to the original problem.
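The loop above can be sketched in a few lines of NumPy for linear constraints A x \leq b. This is a minimal illustration, not a production solver: the tolerances, the damping rule, and the toy instance are all arbitrary choices.

```python
import numpy as np

def centering_step(c, A, b, x, t, tol=1e-10, max_iter=50):
    """Newton's method on t * c^T x + phi(x), phi(x) = -sum(log(b - A x))."""
    for _ in range(max_iter):
        s = b - A @ x                      # slacks, strictly positive
        grad = t * c + A.T @ (1.0 / s)
        hess = (A.T * (1.0 / s**2)) @ A    # A^T diag(1/s^2) A
        dx = np.linalg.solve(hess, -grad)
        if -grad @ dx / 2 < tol:           # Newton decrement stopping test
            break
        step = 1.0
        while np.any(b - A @ (x + step * dx) <= 0):
            step *= 0.5                    # damp to stay strictly feasible
        x = x + step * dx
    return x

def barrier_method(c, A, b, x0, t0=1.0, mu=10.0, eps=1e-8):
    x, t, m = x0.astype(float), t0, A.shape[0]
    while m / t > eps:                     # m/t bounds the duality gap
        x = centering_step(c, A, b, x, t)  # warm-started from previous center
        t *= mu
    return x

# toy instance: minimize x subject to 0 <= x <= 4
c = np.array([1.0])
A = np.array([[-1.0], [1.0]])
b = np.array([0.0, 4.0])
x_star = barrier_method(c, A, b, np.array([2.0]))  # start at the center
# x_star ends very close to the constrained optimum x = 0
```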

Core Theory

Theorem

Convergence of the Barrier Method

Statement

The barrier method produces a point x with f_0(x) - f_0(x^*) \leq \epsilon after at most:

\left\lceil \frac{\log(m / (t^{(0)} \epsilon))}{\log \mu} \right\rceil

outer iterations, where m is the number of inequality constraints, t^{(0)} is the initial barrier parameter, and \mu > 1 is the growth factor. With the theoretically optimal growth rate \mu = 1 + 1/\sqrt{m}, each centering requires only O(1) Newton steps, and the total number of Newton steps is O(\sqrt{m} \log(m/\epsilon)).

Intuition

At any point on the central path with parameter t, the suboptimality is at most m/t. To reach suboptimality \epsilon, we need t \geq m/\epsilon. Starting from t^{(0)} and multiplying by \mu each iteration, we reach m/\epsilon in \log_\mu(m/(t^{(0)}\epsilon)) iterations. Each centering step takes few Newton steps because x^*(t) is a warm start for x^*(\mu t) (the central path is smooth).
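A quick numeric instance of the outer-iteration bound (all values are arbitrary illustrative choices):

```python
import math

m, t0, mu, eps = 50, 1.0, 10.0, 1e-6   # constraints, initial t, growth, target gap
outer = math.ceil(math.log(m / (t0 * eps)) / math.log(mu))
print(outer)   # outer iterations until m / (t0 * mu**outer) <= eps
```

Here 8 outer iterations suffice: after them t = 10^8, so the gap bound m/t = 5 \times 10^{-7} drops below \epsilon.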

Proof Sketch

The duality gap at x^*(t) is exactly m/t: the KKT conditions of the barrier problem yield dual-feasible variables \lambda_i^* = 1/(t(-f_i(x^*(t)))), and evaluating the dual function at \lambda^* gives a gap of m/t. So f_0(x^*(t)) - f_0(x^*) \leq m/t. For the Newton step count: the logarithmic barrier is self-concordant for linear, quadratic, and semidefinite constraints (Nesterov and Nemirovski, 1994), and Newton's method on self-concordant functions converges in O(1) iterations from a warm start when \mu is moderate. Choosing \mu = 1 + 1/\sqrt{m} balances outer iterations against Newton steps per iteration.

Why It Matters

This gives polynomial-time convergence for any convex program with a self-concordant barrier. For linear programs with m constraints and n variables, the total complexity is O(\sqrt{m}\, n^3 \log(m/\epsilon)) (the n^3 comes from solving the Newton system). This is polynomial, unlike the simplex method, which can take 2^n steps on worst-case examples (Klee-Minty).

Failure Mode

The method requires a strictly feasible starting point. Finding one can be as hard as solving the original problem (Phase I methods address this). The Newton system at each step requires solving an n \times n linear system, which costs O(n^3) in general. For very large n, this is the bottleneck, and iterative solvers or exploiting problem structure is necessary. Also, the barrier function is not defined on the boundary, so equality constraints cannot be absorbed into the barrier; they must be eliminated or kept explicitly in the Newton (KKT) system.

Why Interior Point Methods Dominate

Linear programming. The simplex method is fast in practice (polynomial on average) but exponential in the worst case. Interior point methods guarantee O(\sqrt{m} \log(1/\epsilon)) iterations. For large, sparse LPs, interior point methods are competitive with or faster than simplex.

Semidefinite programming (SDP). There is no analog of the simplex method for SDPs. Interior point methods, using the log-determinant barrier \phi(X) = -\log \det(X) for positive semidefinite matrix constraints, are the primary solution method. Many ML problems (kernel learning, robust PCA, relaxations of combinatorial problems) reduce to SDPs.
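A standard identity (not derived here) is that the gradient of the log-determinant barrier is \nabla \phi(X) = -X^{-1}. A quick finite-difference sanity check on a small positive definite matrix (values chosen arbitrarily):

```python
import numpy as np

X = np.array([[2.0, 0.5],
              [0.5, 1.0]])               # symmetric positive definite
phi = -np.log(np.linalg.det(X))          # log-det barrier value
grad = -np.linalg.inv(X)                 # standard identity: gradient of -log det

# finite-difference check of the directional derivative along a symmetric V
V = np.array([[0.3, -0.1],
              [-0.1, 0.2]])
h = 1e-6
fd = (-np.log(np.linalg.det(X + h * V)) - phi) / h
analytic = np.sum(grad * V)              # Frobenius inner product <grad, V>
# fd and analytic agree to finite-difference accuracy
```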

Second-order cone programming (SOCP). Interior point methods handle second-order cone constraints \|Ax + b\| \leq c^\top x + d with the same framework, using appropriate self-concordant barriers.

Connection to KKT Conditions

The centering step computes a point where:

t \nabla f_0(x) + \sum_{i=1}^{m} \frac{1}{-f_i(x)} \nabla f_i(x) = 0

Defining \lambda_i = 1/(t(-f_i(x))), this becomes \nabla f_0(x) + \sum_i \lambda_i \nabla f_i(x) = 0 with \lambda_i > 0 and \lambda_i f_i(x) = -1/t. As t \to \infty, \lambda_i f_i(x) \to 0, approaching complementary slackness. The barrier method traces a smooth path toward the KKT conditions.
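These identities can be checked numerically on a 1D instance chosen for illustration: minimize x subject to 0 \leq x \leq 4, so f_1(x) = -x and f_2(x) = x - 4, with m = 2 constraints:

```python
t = 4.0                                  # arbitrary barrier parameter
def g(x):                                # centering condition, = 0 at x*(t)
    return t - 1.0 / x + 1.0 / (4.0 - x)

lo, hi = 1e-12, 4.0 - 1e-12              # g is increasing on (0, 4)
for _ in range(100):                     # bisection for the root
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
x = 0.5 * (lo + hi)

lam1 = 1.0 / (t * x)                     # dual variable for -x <= 0
lam2 = 1.0 / (t * (4.0 - x))             # dual variable for x - 4 <= 0
stationarity = 1.0 - lam1 + lam2         # grad f0 + sum_i lam_i grad f_i
gap = lam1 * x + lam2 * (4.0 - x)        # sum_i lam_i * (-f_i(x)) = m/t
```

At the central point, stationarity holds to machine precision and the gap equals m/t = 2/4 = 0.5, as the derivation above predicts.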

Common Confusions

Watch Out

The barrier parameter t is not a step size

The parameter t controls the fidelity of the barrier approximation, not the Newton step size. Increasing t makes the barrier problem closer to the original constrained problem. Newton's method has its own step size (determined by line search) within each centering step.

Watch Out

Interior point methods do not find vertices

For linear programs, the simplex method finds a vertex (extreme point) of the feasible polyhedron. Interior point methods find a point in the interior, near but not exactly at a vertex. If an exact vertex solution is needed (e.g., for integer programming), a crossover procedure can be applied to the interior point solution.

Key Takeaways

  • The log barrier -\sum \log(-f_i(x)) keeps iterates strictly feasible while penalizing proximity to boundaries
  • The central path traces a smooth curve from the analytic center to the constrained optimum as t \to \infty
  • Each centering step uses Newton's method on the barrier objective
  • Total complexity is O(\sqrt{m} \log(m/\epsilon)) Newton steps for m constraints
  • Interior point methods provide polynomial convergence for LPs, SDPs, and SOCPs
  • A strictly feasible starting point is required; Phase I methods find one if it exists

Exercises

ExerciseCore

Problem

Consider the 1D problem: minimize x subject to x \geq 1 and x \leq 5. Write the barrier function \phi(x). For t = 10, find the minimizer x^*(t) of the barrier problem t \cdot x + \phi(x).

ExerciseAdvanced

Problem

Show that for a linear program \min\{c^\top x : Ax \leq b\} with m constraints, the duality gap at the point x^*(t) on the central path is exactly m/t. Derive the dual variables \lambda_i^* from the barrier optimality conditions.

References

Canonical:

  • Boyd and Vandenberghe, Convex Optimization (2004), Chapter 11
  • Nesterov and Nemirovski, Interior-Point Polynomial Algorithms in Convex Programming (1994)

Current:

  • Nocedal and Wright, Numerical Optimization (2006), Chapter 14
  • Wright, Primal-Dual Interior-Point Methods (1997)

Next Topics

  • Primal-dual interior point methods and Mehrotra predictor-corrector
  • Semidefinite programming and applications in ML

Last reviewed: April 2026
