Inverse and Implicit Function Theorem
The inverse function theorem guarantees local invertibility when the Jacobian is nonsingular. The implicit function theorem guarantees that constraint surfaces are locally graphs. Both are essential for constrained optimization and implicit layers.
Why This Matters
The inverse function theorem tells you when a nonlinear function can be "undone" locally. The implicit function theorem tells you when a system of equations $F(x, y) = 0$ defines $y$ as a function of $x$, even when you cannot solve for $y$ explicitly. These theorems appear throughout ML: Lagrange multipliers for constrained optimization, implicit differentiation through fixed-point layers (deep equilibrium models), and the reparameterization trick in variational inference.
The Inverse Function Theorem
Local Diffeomorphism
A continuously differentiable map $f: \mathbb{R}^n \to \mathbb{R}^n$ is a local diffeomorphism at $a$ if there exist open sets $U \ni a$ and $V \ni f(a)$ such that $f: U \to V$ is bijective and both $f$ and $f^{-1}$ are continuously differentiable on their respective domains.
Inverse Function Theorem
Statement
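Let $f: \mathbb{R}^n \to \mathbb{R}^n$ be continuously differentiable ($C^1$) in a neighborhood of $a$. If the Jacobian matrix $Df(a)$ is invertible (i.e., $\det Df(a) \neq 0$), then $f$ is a local diffeomorphism at $a$. That is, there exist open sets $U \ni a$ and $V \ni f(a)$ such that:
- $f: U \to V$ is a bijection
- $f^{-1}: V \to U$ is continuously differentiable
- $D(f^{-1})(f(x)) = [Df(x)]^{-1}$ for all $x \in U$
The derivative of the inverse is the inverse of the derivative.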
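The rule "the derivative of the inverse is the inverse of the derivative" can be checked numerically. The sketch below is only an illustration: the map $f(x, y) = (e^x \cos y, e^x \sin y)$, its explicit local inverse, the base point, and the finite-difference helper `jacobian_fd` are all assumptions chosen for this example, not part of the theorem.

```python
import numpy as np

def f(p):
    """Illustrative map f(x, y) = (e^x cos y, e^x sin y); det Df = e^(2x) > 0 everywhere."""
    x, y = p
    return np.array([np.exp(x) * np.cos(y), np.exp(x) * np.sin(y)])

def f_inv(q):
    """Local inverse of f, valid when the angle y lies in (-pi, pi)."""
    u, v = q
    return np.array([0.5 * np.log(u * u + v * v), np.arctan2(v, u)])

def jacobian_fd(func, p, h=1e-6):
    """Central finite-difference approximation of the Jacobian of func at p."""
    p = np.asarray(p, dtype=float)
    cols = []
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        cols.append((func(p + e) - func(p - e)) / (2 * h))
    return np.column_stack(cols)

a = np.array([0.3, 0.7])                 # base point (arbitrary choice)
J_f = jacobian_fd(f, a)                  # approximates Df(a)
J_inv = jacobian_fd(f_inv, f(a))         # approximates D(f^{-1})(f(a))

# The two matrices should agree up to finite-difference error.
inverse_rule_error = np.max(np.abs(J_inv - np.linalg.inv(J_f)))
print(inverse_rule_error)
```

This particular $f$ is also a reminder that the guarantee is local: its Jacobian is invertible everywhere, yet $f(x, y) = f(x, y + 2\pi)$, so no global inverse exists.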
Intuition
If the linear approximation $Df(a)$ is invertible, then the nonlinear function $f$ behaves like an invertible linear map near $a$. You can "undo" $f$ in a small neighborhood. The key word is local: the function may not be globally invertible (think of $f(x) = x^2$ near $x = 1$, which is locally invertible even though $x^2$ is not globally injective).
Proof Sketch
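The standard proof uses the contraction mapping theorem. Write $A = Df(a)$, define $\varphi_y(x) = x + A^{-1}(y - f(x))$, and show that $\varphi_y$ is a contraction on a small ball around $a$ when $y$ is close to $f(a)$. A point $x$ is fixed by $\varphi_y$ exactly when $f(x) = y$, so the Banach fixed-point theorem gives a unique fixed point $x = f^{-1}(y)$. Differentiability of $f^{-1}$ needs a separate estimate; once it is known, the formula for its derivative follows from the chain rule applied to $f(f^{-1}(y)) = y$.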
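The contraction in the proof doubles as a practical way to compute a local inverse. The sketch below iterates $x \mapsto x + Df(a)^{-1}(y - f(x))$ with the Jacobian frozen at $a$. The map `f`, its Jacobian `Df`, the base point, and the target are hypothetical choices made for illustration; convergence is only expected when the target is close to $f(a)$.

```python
import numpy as np

def f(p):
    """Hypothetical smooth map R^2 -> R^2 used only for illustration."""
    x, y = p
    return np.array([x + 0.5 * np.sin(y), y + 0.5 * x ** 2])

def Df(p):
    """Analytic Jacobian of the map above."""
    x, y = p
    return np.array([[1.0, 0.5 * np.cos(y)],
                     [x, 1.0]])

def local_inverse(y_target, a, iters=100, tol=1e-12):
    """Iterate x <- x + Df(a)^{-1} (y_target - f(x)), starting from a."""
    A_inv = np.linalg.inv(Df(a))      # Jacobian is frozen at a, as in the proof
    x = np.array(a, dtype=float)
    for _ in range(iters):
        x_new = x + A_inv @ (y_target - f(x))
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

a = np.array([0.2, 0.1])
y_target = f(a) + np.array([0.01, -0.02])   # a target close to f(a)
x_star = local_inverse(y_target, a)
residual = np.linalg.norm(f(x_star) - y_target)
print(x_star, residual)
```

Freezing the Jacobian at $a$ is what distinguishes this from Newton's method, which would re-evaluate `Df` at every step; the frozen version converges more slowly but matches the map used in the proof.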
Why It Matters
In ML, this theorem justifies: (1) the change-of-variables formula in normalizing flows, where you need the transformation to be locally invertible, (2) implicit differentiation through nonlinear layers, and (3) the fact that a smooth reparameterization with invertible Jacobian maps critical points to critical points, so it does not create or destroy stationary points locally.
Failure Mode
The theorem is silent when $Df(a)$ is singular. At such points, the function may fold (like $f(x) = x^2$ at $x = 0$), in which case there is no local inverse. The theorem also says nothing about the size of the neighborhood $U$; it could be extremely small.
The Implicit Function Theorem
Implicit Function Theorem
Statement
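Let $F: \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ be $C^1$ in a neighborhood of $(a, b)$ with $F(a, b) = 0$. If the $m \times m$ matrix $\frac{\partial F}{\partial y}(a, b)$ is invertible, then there exist open sets $U \ni a$ and $V \ni b$ and a unique $C^1$ function $g: U \to V$ such that:
- $g(a) = b$
- $F(x, g(x)) = 0$ for all $x \in U$
- $Dg(x) = -\left[\frac{\partial F}{\partial y}(x, g(x))\right]^{-1} \frac{\partial F}{\partial x}(x, g(x))$ for all $x \in U$
The equation $F(x, y) = 0$ implicitly defines $y = g(x)$ near $(a, b)$.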
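A small numerical check makes the statement concrete. The sketch below assumes the example $F(x, y) = x^2 + y^2 - 1$ at the point $(0.6, 0.8)$, evaluates the implicit derivative $-(\partial F / \partial y)^{-1} \, \partial F / \partial x$, and compares it with a finite-difference slope of the solution branch. Using Newton's method to trace the branch is a choice made here for illustration.

```python
def F(x, y):
    """Illustrative constraint: the unit circle x^2 + y^2 - 1 = 0."""
    return x ** 2 + y ** 2 - 1.0

def dF_dx(x, y):
    return 2.0 * x

def dF_dy(x, y):
    return 2.0 * y

def g(x, y0, iters=50):
    """Solve F(x, y) = 0 for y with Newton's method, starting from y0."""
    y = y0
    for _ in range(iters):
        y = y - F(x, y) / dF_dy(x, y)
    return y

a, b = 0.6, 0.8                      # a point on the circle where dF/dy = 1.6 != 0

# Implicit function theorem: g'(a) = -(dF/dy)^{-1} * dF/dx at (a, b).
implicit_slope = -dF_dx(a, b) / dF_dy(a, b)

# Finite-difference slope of the numerically traced branch y = g(x).
h = 1e-6
numeric_slope = (g(a + h, b) - g(a - h, b)) / (2 * h)

print(implicit_slope, numeric_slope)
```

At $(1, 0)$ the same code would divide by zero: $\partial F / \partial y$ vanishes there, which is exactly the case the theorem excludes.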
Intuition
If you have a system of $m$ equations in $n + m$ unknowns and the equations are "non-degenerate" with respect to the last $m$ variables (the Jacobian in $y$ is invertible), then you can solve for those $m$ variables as smooth functions of the remaining $n$ variables. You do not need an explicit formula; the theorem guarantees the function exists and gives its derivative.
Proof Sketch
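Define $\Phi(x, y) = (x, F(x, y))$ from $\mathbb{R}^{n+m}$ to $\mathbb{R}^{n+m}$. The Jacobian of $\Phi$ at $(a, b)$ is block lower triangular with diagonal blocks $I_n$ and $\frac{\partial F}{\partial y}(a, b)$, so $\det D\Phi(a, b) = \det \frac{\partial F}{\partial y}(a, b) \neq 0$. Apply the inverse function theorem to $\Phi$ to get a local inverse, then extract $g$ from it via $\Phi^{-1}(x, 0) = (x, g(x))$.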
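Written out, the reduction to the inverse function theorem looks as follows. This is a sketch in the notation assumed here: $F$ maps $\mathbb{R}^n \times \mathbb{R}^m$ to $\mathbb{R}^m$, $(a, b)$ is the base point with $F(a, b) = 0$, $\Phi$ is the auxiliary map, and $g$ is the implicit function being constructed.

```latex
\Phi(x, y) = \bigl(x,\; F(x, y)\bigr),
\qquad
D\Phi(a, b) =
\begin{pmatrix}
I_n & 0 \\[4pt]
\dfrac{\partial F}{\partial x}(a, b) & \dfrac{\partial F}{\partial y}(a, b)
\end{pmatrix},
\qquad
\det D\Phi(a, b) = \det \dfrac{\partial F}{\partial y}(a, b) \neq 0.

% The local inverse of \Phi fixes the first component, so on the slice
% where the second component is zero it has the form
\Phi^{-1}(x, 0) = \bigl(x,\; g(x)\bigr)
\quad\Longrightarrow\quad
F\bigl(x, g(x)\bigr) = 0 .
```

The zero block in the upper right comes from the first component of $\Phi$ not depending on $y$, which is why the determinant reduces to that of the lower-right block.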
Why It Matters
The implicit function theorem underlies: (1) Lagrange multipliers, where constraints implicitly define a manifold of feasible points, (2) implicit differentiation in deep equilibrium models (DEQs), where the output $z^*$ satisfies $z^* = f_\theta(z^*, x)$ and you differentiate through the fixed-point equation, and (3) the computation of gradients through any layer defined by a fixed-point or root-finding condition.
Failure Mode
The theorem does not apply when $\frac{\partial F}{\partial y}$ is singular. At such points, the solution set of $F(x, y) = 0$ may branch, have cusps, or be locally degenerate. The theorem also provides only local results: the implicit function $g$ may not extend to a global solution.
Applications in ML
Lagrange Multipliers
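To minimize $f(x)$ subject to $g(x) = 0$ where $g: \mathbb{R}^n \to \mathbb{R}^m$, the method of Lagrange multipliers introduces the Lagrangian:
$$\mathcal{L}(x, \lambda) = f(x) + \lambda^\top g(x)$$
At a constrained minimum, $\nabla_x \mathcal{L}(x, \lambda) = 0$ and $g(x) = 0$. The implicit function theorem guarantees that the constraint $g(x) = 0$ locally defines an $(n - m)$-dimensional manifold, provided $Dg(x)$ has full rank.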
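For a quadratic objective with one linear constraint, the stationarity conditions form a linear system that can be solved directly. The sketch below assumes the toy problem of minimizing $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 1 = 0$, with the sign convention $\mathcal{L} = f + \lambda g$; both the problem and the convention are illustrative assumptions.

```python
import numpy as np

# Illustrative problem (an assumption, not from the text):
#   minimize f(x) = x1^2 + x2^2   subject to   g(x) = x1 + x2 - 1 = 0.
# With L(x, lam) = f(x) + lam * g(x), stationarity reads 2*x + lam * grad_g = 0.
grad_g = np.array([1.0, 1.0])

# Dg is the 1 x 2 matrix [1, 1]. Full rank (= 1) means the feasible set is a
# smooth (n - m) = 1-dimensional curve; here it is a line.
constraint_rank = np.linalg.matrix_rank(grad_g.reshape(1, -1))

# Stack stationarity (two rows) and feasibility (one row) into one linear system
# in the unknowns (x1, x2, lam).
K = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, 1.0])
x1, x2, lam = np.linalg.solve(K, rhs)

x_opt = np.array([x1, x2])
stationarity = 2.0 * x_opt + lam * grad_g    # should be (numerically) zero
print(x_opt, lam, stationarity)
```

For a nonlinear objective or constraint the same equations are no longer linear, and one would typically apply Newton's method to the stacked system instead of a single linear solve.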
Deep Equilibrium Models
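A deep equilibrium model defines its output $z^*$ as the fixed point of $z^* = f_\theta(z^*, x)$. To compute $\frac{\partial z^*}{\partial \theta}$, define $F(z, \theta) = z - f_\theta(z, x)$. By the implicit function theorem (assuming $I - \frac{\partial f_\theta}{\partial z}$ is invertible):
$$\frac{\partial z^*}{\partial \theta} = \left(I - \frac{\partial f_\theta}{\partial z}\right)^{-1} \frac{\partial f_\theta}{\partial \theta}$$
This avoids backpropagating through the entire fixed-point iteration.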
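A minimal NumPy sketch of this idea follows. It assumes a hypothetical layer $f_\theta(z, x) = \tanh(\theta W z + x)$ with a single scalar parameter $\theta$ and a fixed matrix $W$ rescaled so that the layer is a contraction; real DEQ implementations use more capable solvers and vector-Jacobian products rather than dense linear solves. The implicit gradient is compared against a finite-difference estimate obtained by re-solving the fixed point.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
G = rng.standard_normal((n, n))
W = 0.5 * G / np.linalg.norm(G, 2)   # spectral norm 0.5, so the layer contracts
x = rng.standard_normal(n)

def layer(z, theta):
    """Hypothetical layer f_theta(z, x) = tanh(theta * W z + x), scalar theta."""
    return np.tanh(theta * (W @ z) + x)

def fixed_point(theta, iters=500):
    """Plain fixed-point iteration z <- f_theta(z, x), started at zero."""
    z = np.zeros(n)
    for _ in range(iters):
        z = layer(z, theta)
    return z

theta = 0.8
z_star = fixed_point(theta)

# Partial derivatives of the layer, evaluated at the fixed point.
pre = theta * (W @ z_star) + x
s = 1.0 - np.tanh(pre) ** 2            # tanh'(pre), elementwise
df_dz = theta * (s[:, None] * W)       # n x n Jacobian with respect to z
df_dtheta = s * (W @ z_star)           # length-n derivative with respect to theta

# Implicit function theorem: solve (I - df/dz) dz = df/dtheta.
dz_implicit = np.linalg.solve(np.eye(n) - df_dz, df_dtheta)

# Reference: finite differences through the whole solver.
h = 1e-6
dz_fd = (fixed_point(theta + h) - fixed_point(theta - h)) / (2 * h)
deq_error = np.max(np.abs(dz_implicit - dz_fd))
print(deq_error)
```

The implicit route needs only the converged $z^*$ and one linear solve, whereas the finite-difference reference (like unrolled backpropagation) has to pass through every iteration of the solver.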
Common Confusions
Local does not mean global
Both theorems give local results only. The inverse function theorem says $f$ is invertible near $a$, not everywhere. The map $t \mapsto (\cos t, \sin t)$ from $\mathbb{R}$ to the unit circle has invertible derivative everywhere (viewed as a map between one-dimensional manifolds) but is not globally injective. Always check whether the local guarantee suffices for your application.
Invertible Jacobian is required at the point, not everywhere
The inverse function theorem requires $\det Df(a) \neq 0$ at the specific point $a$. The Jacobian may be singular at other points. Similarly, the implicit function theorem needs $\frac{\partial F}{\partial y}$ invertible at the specific point $(a, b)$.
Summary
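- Inverse function theorem: if $Df(a)$ is invertible, then $f$ is locally invertible near $a$
- The derivative of the local inverse is $D(f^{-1})(f(a)) = [Df(a)]^{-1}$
- Implicit function theorem: if $F(a, b) = 0$ and $\frac{\partial F}{\partial y}(a, b)$ is invertible, then $y = g(x)$ near $(a, b)$
- The implicit derivative is $Dg = -\left[\frac{\partial F}{\partial y}\right]^{-1} \frac{\partial F}{\partial x}$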
- Both theorems are local; they say nothing about global invertibility
- Applications: Lagrange multipliers, normalizing flows, deep equilibrium models
Exercises
Problem
Let be defined by . At which points is locally invertible? Compute the Jacobian and check its determinant.
Problem
The equation $x^2 + y^2 + z^2 = 1$ defines the unit sphere. Near which points can you express $z$ as a function of $(x, y)$? Compute $\frac{\partial z}{\partial x}$ using the implicit function theorem where it applies.
References
Canonical:
- Rudin, Principles of Mathematical Analysis, Chapter 9 (Theorems 9.24 and 9.28)
- Spivak, Calculus on Manifolds, Chapter 2
Current:
- Bai et al., "Deep Equilibrium Models" (NeurIPS 2019): implicit differentiation through fixed points
- Krantz and Parks, The Implicit Function Theorem (2003): comprehensive treatment
Next Topics
- Convex optimization basics: constrained optimization using Lagrange multipliers
- Convex duality: the role of implicit functions in duality theory
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- The Jacobian Matrix (Layer 0A)