

Minimax and Saddle Points

Minimax theorems characterize when max-min equals min-max. Saddle points arise in zero-sum games, duality theory, GANs, and robust optimization.


Why This Matters

Minimax problems appear whenever two objectives conflict. In zero-sum games, one player maximizes what the other minimizes. In robust optimization, you minimize loss under the worst-case perturbation. In GANs, the generator minimizes what the discriminator maximizes. The minimax theorem tells you when you can swap the order of max and min without changing the value. When this swap is valid, the problem has a saddle point and is tractable. When it fails, the problem is harder.

Mental Model

Consider a function $f(x, y)$ where one player chooses $x$ to maximize and the other chooses $y$ to minimize. The max-player moves first in $\max_x \min_y f(x, y)$: they choose $x$ knowing the min-player will respond optimally. The min-player moves first in $\min_y \max_x f(x, y)$: they choose $y$ knowing the max-player will respond optimally. Moving second is always at least as good, so $\max_x \min_y f \leq \min_y \max_x f$. The minimax theorem identifies conditions under which equality holds.

Core Definitions

Definition

Saddle Point

A point $(x^*, y^*)$ is a saddle point of $f: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ if:

$$f(x, y^*) \leq f(x^*, y^*) \leq f(x^*, y) \quad \text{for all } x \in \mathcal{X}, \, y \in \mathcal{Y}$$

Equivalently: $x^*$ maximizes $f(\cdot, y^*)$ and $y^*$ minimizes $f(x^*, \cdot)$. Neither player can improve by unilateral deviation.
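A quick numerical sanity check of the definition (an illustrative example, not from the text: $f(x, y) = -x^2 + y^2$, concave in the max-variable $x$, convex in the min-variable $y$, candidate saddle point $(0, 0)$):

```python
# Verify on a grid that x* maximizes f(., y*) and y* minimizes f(x*, .).
def f(x, y):
    return -x**2 + y**2   # illustrative convex-concave example

x_star, y_star = 0.0, 0.0
grid = [i / 10 - 1 for i in range(21)]   # 21 points covering [-1, 1]

x_star_maximizes = all(f(x, y_star) <= f(x_star, y_star) for x in grid)
y_star_minimizes = all(f(x_star, y_star) <= f(x_star, y) for y in grid)
print(x_star_maximizes and y_star_minimizes)  # -> True
```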

Definition

Minimax Value and Maximin Value

The maximin value is $\underline{v} = \max_{x \in \mathcal{X}} \min_{y \in \mathcal{Y}} f(x, y)$.

The minimax value is $\overline{v} = \min_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} f(x, y)$.

The weak duality inequality always holds: $\underline{v} \leq \overline{v}$. When $\underline{v} = \overline{v}$, we say the minimax theorem holds and the common value is the value of the game.
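To make the two values concrete, here is a minimal sketch (the 2×2 payoff table is made up for illustration; row player maximizes, column player minimizes, pure strategies only):

```python
# A[i][j] = f(x = i, y = j); row player maximizes, column player minimizes.
A = [[3, 1],
     [0, 2]]

maximin = max(min(row) for row in A)                              # max_x min_y
minimax = min(max(A[i][j] for i in range(2)) for j in range(2))   # min_y max_x
print(maximin, minimax)  # -> 1 2: weak duality holds, with a strict gap
```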

Definition

Convex-Concave Function

A function $f: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ is convex-concave (in this page's convention: concave in the maximizing variable, convex in the minimizing variable) if $f(\cdot, y)$ is concave for each fixed $y$ and $f(x, \cdot)$ is convex for each fixed $x$. Equivalently, the max-player faces a concave objective (wants to maximize) and the min-player faces a convex objective (wants to minimize).

Main Theorems

Theorem

Von Neumann Minimax Theorem

Statement

Let $\mathcal{X} \subset \mathbb{R}^n$ and $\mathcal{Y} \subset \mathbb{R}^m$ be compact convex sets, and let $f: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ be continuous and convex-concave. Then:

$$\max_{x \in \mathcal{X}} \min_{y \in \mathcal{Y}} f(x, y) = \min_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} f(x, y)$$

Moreover, a saddle point $(x^*, y^*)$ exists.

Intuition

Convexity-concavity plus compactness ensures that neither player benefits from going first. The convex-concave structure prevents "misaligned" curvature that would let one player exploit knowledge of the other's choice. Compactness ensures the optima are attained (not just approached in the limit).

Proof Sketch

One proof uses the separating hyperplane theorem. Define $S = \{(u, v) : u \leq f(x, y),\, v \geq f(x, y) \text{ for some } (x, y)\}$. The convex-concave structure makes $S$ convex. If maximin $<$ minimax, there is a gap, and a separating hyperplane argument yields a contradiction. An alternative proof uses Brouwer's fixed-point theorem on the best-response correspondence.

Why It Matters

This theorem is the foundation of game theory and connects directly to linear programming duality. In a finite zero-sum game (matrix game), the strategy spaces are simplices (compact, convex) and the payoff is bilinear (convex-concave). Von Neumann's theorem guarantees a value and optimal mixed strategies. Strong duality in LP is a corollary: the primal and dual LP values are equal.

Failure Mode

Without convexity-concavity, the minimax gap can be strict. For example, let $\mathcal{X} = \mathcal{Y} = \{0, 1\}$ (not convex, just two points) and $f(x, y) = \mathbf{1}[x = y]$. Then $\max_x \min_y f = 0$ (whatever $x$ is, the min-player mismatches it) but $\min_y \max_x f = 1$ (whatever $y$ is, the max-player matches it). The gap is 1. Convexifying the strategy sets (allowing mixed strategies) restores equality.
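The mixed extension of this matching game can be checked numerically: with $p$ and $q$ the probabilities of playing 1, the expected payoff is $F(p, q) = pq + (1-p)(1-q)$, and a grid search (an illustrative sketch) shows both values equal $1/2$:

```python
# Expected payoff of the matching game under mixed strategies p, q.
def F(p, q):
    return p * q + (1 - p) * (1 - q)

grid = [i / 100 for i in range(101)]   # probabilities 0.00, 0.01, ..., 1.00
maximin = max(min(F(p, q) for q in grid) for p in grid)
minimax = min(max(F(p, q) for p in grid) for q in grid)
print(round(maximin, 6), round(minimax, 6))  # -> 0.5 0.5: mixing closes the gap
```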

Without compactness, the optima may not be attained even when the function is convex-concave. Consider $f(x, y) = xy$ on $\mathcal{X} = \mathcal{Y} = (0, \infty)$: the inner infimum $\inf_y xy = 0$ is never attained, and the inner supremum $\sup_x xy$ is infinite, so neither $\max_x \min_y$ nor $\min_y \max_x$ is well-defined.

Theorem

Sion Minimax Theorem

Statement

Let $\mathcal{X}$ be a compact convex set and $\mathcal{Y}$ a convex set (not necessarily compact) in linear topological spaces. Suppose $f(\cdot, y)$ is quasiconcave and upper semicontinuous for each fixed $y$, and $f(x, \cdot)$ is quasiconvex and lower semicontinuous for each fixed $x$. Then:

$$\max_{x \in \mathcal{X}} \inf_{y \in \mathcal{Y}} f(x, y) = \inf_{y \in \mathcal{Y}} \max_{x \in \mathcal{X}} f(x, y)$$

Intuition

Sion's theorem relaxes Von Neumann's conditions in two ways: it requires quasiconvexity-quasiconcavity instead of convexity-concavity, and $\mathcal{Y}$ need not be compact (only $\mathcal{X}$ must be compact). This covers cases where the min-player's domain is unbounded.

Proof Sketch

The proof uses a topological intersection argument. For each finite subset $\{y_1, \ldots, y_k\} \subset \mathcal{Y}$, the set $\{x : \min_j f(x, y_j) \geq v\}$ is nonempty and closed by upper semicontinuity and quasiconcavity. The finite intersection property (from compactness of $\mathcal{X}$) yields a point where $\inf_y f(x, y) \geq v$ for the critical value $v$.

Why It Matters

Sion's theorem applies to infinite-dimensional settings (e.g., function spaces in variational problems) and to cases where strict convexity-concavity fails but quasiconvexity-quasiconcavity holds. It is the most general classical minimax result.

Failure Mode

Without compactness of at least one of the two sets, the theorem can fail even with quasiconvexity-quasiconcavity. Both sets being merely convex and unbounded is not enough.

Connection to Duality

Strong duality in linear programming is a special case of the minimax theorem. The LP primal-dual pair:

$$\min_{x \geq 0} c^\top x \quad \text{s.t. } Ax = b \quad \longleftrightarrow \quad \max_{y} b^\top y \quad \text{s.t. } A^\top y \leq c$$

can be written as $\min_{x \geq 0} \max_y L(x, y)$ where $L(x, y) = c^\top x + y^\top(b - Ax)$ is the Lagrangian. This $L$ is linear in each of $x$ and $y$ separately (hence convex-concave), and the constraint sets are convex. Strong duality ($\min = \max$) follows from the minimax theorem.

More generally, for convex programs: strong duality holds when Slater's condition is satisfied, and the saddle point of the Lagrangian corresponds to the primal-dual optimal pair.

Connection to GANs

The GAN objective is:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$

For fixed $G$, the optimal discriminator is $D^*(x) = p_{\text{data}}(x)/(p_{\text{data}}(x) + p_G(x))$, and the resulting objective for $G$ is $2\,\mathrm{JSD}(p_{\text{data}} \| p_G) - \log 4$, where JSD is the Jensen-Shannon divergence. The minimax theorem does not directly apply because the generator and discriminator are parameterized by neural networks (non-convex). Training GANs is therefore a non-convex saddle-point problem, and convergence guarantees are limited.
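The optimal-discriminator identity can be verified directly for discrete distributions (the two distributions below are made up for illustration): plugging $D^*$ into the objective reproduces $2\,\mathrm{JSD}(p_{\text{data}} \| p_G) - \log 4$.

```python
import math

# p = p_data, q = p_G: discrete distributions on three points.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

# GAN value at the optimal discriminator D*(x) = p(x) / (p(x) + q(x)).
value = sum(pi * math.log(pi / (pi + qi)) + qi * math.log(qi / (pi + qi))
            for pi, qi in zip(p, q))

# Jensen-Shannon divergence via the mixture m = (p + q) / 2.
m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
kl = lambda a, b: sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))
jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)

print(abs(value - (2 * jsd - math.log(4))) < 1e-9)  # -> True
```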

Connection to Robust Optimization

In distributionally robust optimization (DRO):

$$\min_\theta \max_{P \in \mathcal{U}} \mathbb{E}_P[\ell(\theta, X)]$$

where $\mathcal{U}$ is an uncertainty set of distributions. When $\mathcal{U}$ is convex and compact (in an appropriate topology) and the loss is convex in $\theta$, the minimax theorem applies. This connects to adversarial training: the inner maximization finds the worst-case distribution, and the outer minimization hedges against it.
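A minimal sketch with made-up numbers: when $\mathcal{U}$ is the convex hull of finitely many distributions, the inner expectation is linear in $P$, so the worst case is attained at a vertex of $\mathcal{U}$, and the outer minimization hedges between the vertices.

```python
# Squared loss, three scenarios for X, two extreme distributions in U.
scenarios = [0.0, 1.0, 2.0]
U = [[0.8, 0.1, 0.1],
     [0.1, 0.1, 0.8]]

def worst_case(theta):
    # Inner max is linear in P, so checking the vertices of U suffices.
    return max(sum(p * (theta - x) ** 2 for p, x in zip(P, scenarios))
               for P in U)

thetas = [i / 100 for i in range(201)]          # grid on [0, 2]
theta_star = min(thetas, key=worst_case)
print(theta_star)  # -> 1.0: the robust choice hedges between the extremes
```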

Canonical Examples

Example

Matrix game (Rock-Paper-Scissors)

The payoff matrix for Rock-Paper-Scissors is:

$$A = \begin{bmatrix} 0 & -1 & 1 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \end{bmatrix}$$

The row player maximizes $x^\top A y$, the column player minimizes. Strategy spaces are the probability simplex in $\mathbb{R}^3$ (compact, convex). The payoff is bilinear (convex-concave in mixed strategies). By Von Neumann's theorem, the game has value 0 and the unique optimal strategy for both players is $(1/3, 1/3, 1/3)$.
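The optimal strategy is easy to verify (a minimal check): the uniform mixture earns expected payoff zero against every pure column, hence against every mixed strategy.

```python
# Rock-Paper-Scissors payoff matrix and the uniform row strategy.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]
x_star = [1 / 3] * 3

# Expected payoff of x* against each pure column strategy e_j.
payoffs = [sum(x_star[i] * A[i][j] for i in range(3)) for j in range(3)]
print(payoffs)  # -> [0.0, 0.0, 0.0]: x* guarantees the game's value 0
```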

Example

Strong duality in LP

Primal: $\min\, 3x_1 + 5x_2$ subject to $x_1 + x_2 \geq 4$, $x_1, x_2 \geq 0$. Dual: $\max\, 4y$ subject to $y \leq 3$, $y \leq 5$, $y \geq 0$. The dual optimum is $y^* = 3$, giving value 12. The primal optimum is $x^* = (4, 0)$, also giving value 12. Strong duality holds: $\min = \max = 12$. The Lagrangian $L(x, y) = 3x_1 + 5x_2 + y(4 - x_1 - x_2)$ has a saddle point at $(x^*, y^*)$.
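The saddle property of this Lagrangian can be checked numerically (a sketch; the grid bounds are chosen for illustration): $L(x^*, y) \leq L(x^*, y^*) \leq L(x, y^*)$ for $x \geq 0$, $y \geq 0$.

```python
# Lagrangian of the primal LP, with multiplier y for x1 + x2 >= 4.
def L(x1, x2, y):
    return 3 * x1 + 5 * x2 + y * (4 - x1 - x2)

x_star, y_star = (4.0, 0.0), 3.0
print(L(*x_star, y_star))  # -> 12.0, the common optimal value

xs = [(a / 2, b / 2) for a in range(11) for b in range(11)]   # grid on [0,5]^2
assert all(L(x1, x2, y_star) >= 12.0 for x1, x2 in xs)        # x* minimizes
assert all(L(*x_star, y) <= 12.0 for y in [0, 1, 2, 3, 4, 5]) # y* maximizes
```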

Common Confusions

Watch Out

The minimax problem (sequential) is not the same as simultaneous gradient play

In $\min_y \max_x f(x, y)$, the min-player commits to $y$ first, then the max-player responds optimally. This is different from simultaneous gradient descent-ascent, where both players update at the same time. Simultaneous updates can cycle or diverge even for bilinear convex-concave problems, let alone non-convex-concave ones. The minimax theorem is about the existence of a saddle point, not about an algorithm to find it.
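A minimal sketch of this failure: simultaneous gradient steps on the bilinear function $f(x, y) = xy$ (max over $x$, min over $y$, saddle point at the origin) spiral outward, since each step multiplies the distance to the origin by $\sqrt{1 + \eta^2}$.

```python
# Simultaneous gradient ascent on x, descent on y, for f(x, y) = x * y,
# both players using the gradients at the current (old) iterate.
eta = 0.1
x, y = 1.0, 0.0
for _ in range(100):
    x, y = x + eta * y, y - eta * x   # simultaneous update

norm = (x * x + y * y) ** 0.5
print(norm)  # grows past the starting distance 1.0 instead of converging
```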

Watch Out

GANs do not satisfy the minimax theorem assumptions

The GAN objective is non-convex in the generator parameters. Von Neumann's theorem does not apply. The "theoretical optimum" analysis (showing the global optimum is $p_G = p_{\text{data}}$) assumes the discriminator is optimized exactly for each $G$, which never happens in practice. GAN training is a non-convex saddle-point problem without convergence guarantees from classical minimax theory.

Watch Out

Weak duality always holds, strong duality requires conditions

$\max_x \min_y f \leq \min_y \max_x f$ is true for any $f$ (the "max-min inequality"). This is a one-line proof: $\min_y f(x, y) \leq f(x, y) \leq \max_x f(x, y)$, then take $\max_x$ on the left and $\min_y$ on the right. Equality (strong duality) requires structure: convexity-concavity plus compactness, or Slater's condition, or specific problem structure like LP.
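Since the inequality holds for arbitrary $f$, it can be spot-checked on a random payoff table (an illustrative sketch over finite strategy sets):

```python
import random

random.seed(0)
n = 5
F = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]  # arbitrary f

maxmin = max(min(F[i][j] for j in range(n)) for i in range(n))  # max_x min_y
minmax = min(max(F[i][j] for i in range(n)) for j in range(n))  # min_y max_x
print(maxmin <= minmax)  # -> True, whatever the table
```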

Summary

  • Weak duality: $\max_x \min_y f \leq \min_y \max_x f$ always holds
  • Von Neumann: equality holds for continuous convex-concave $f$ on compact convex sets
  • Saddle point $(x^*, y^*)$: $f(x, y^*) \leq f(x^*, y^*) \leq f(x^*, y)$
  • Strong LP duality is a corollary of the minimax theorem
  • GANs are minimax problems but violate the convexity assumption
  • Robust optimization: minimax over uncertainty sets

Exercises

ExerciseCore

Problem

Consider the function $f(x, y) = x^2 - y^2$ on $\mathcal{X} = [-1, 1]$, $\mathcal{Y} = [-1, 1]$. Find $\max_x \min_y f(x, y)$ and $\min_y \max_x f(x, y)$. Does a saddle point exist?

ExerciseAdvanced

Problem

Prove the weak duality inequality $\max_x \min_y f(x, y) \leq \min_y \max_x f(x, y)$ for any function $f: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, assuming the maxima and minima exist.

References

Canonical:

  • Von Neumann, "Zur Theorie der Gesellschaftsspiele" (1928)
  • Sion, "On General Minimax Theorems" (1958)

Current:

  • Boyd & Vandenberghe, Convex Optimization (2004), Chapter 5 (duality)
  • Goodfellow et al., "Generative Adversarial Nets" (2014)
  • Ben-Tal, El Ghaoui, Nemirovski, Robust Optimization (2009), Chapters 1-2


Last reviewed: April 2026
