Nash Equilibrium
No player can improve by unilateral deviation. Existence via Kakutani fixed-point theorem, computation complexity (PPAD-completeness), refinements, and why Nash equilibria can be inefficient (price of anarchy).
Why This Matters
Nash equilibrium is the central solution concept in non-cooperative game theory. Whenever you train a GAN, analyze multi-agent RL convergence, design an auction, or study adversarial robustness, you are reasoning about Nash equilibria. The concept pins down what "stable behavior" means when multiple optimizers interact.
Three questions matter in practice: Does an equilibrium exist? Can you compute it? Is it any good? The answers are: always (for finite games), probably not efficiently (PPAD-complete), and sometimes not (price of anarchy can be unbounded). This page covers all three.
Formal Definition
Nash Equilibrium
A strategy profile $\sigma^* = (\sigma_1^*, \ldots, \sigma_n^*)$ is a Nash equilibrium of a normal-form game if no player can strictly improve their expected payoff by unilateral deviation:

$$u_i(\sigma_i^*, \sigma_{-i}^*) \ge u_i(\sigma_i, \sigma_{-i}^*) \quad \text{for all } \sigma_i \in \Delta(S_i) \text{ and all players } i.$$
Notation: $\sigma_{-i}$ denotes the strategies of all players except $i$. A strict Nash equilibrium requires strict inequality for all $\sigma_i \ne \sigma_i^*$.
Epsilon-Nash Equilibrium
A strategy profile $\sigma^*$ is an $\epsilon$-Nash equilibrium if no player can improve their payoff by more than $\epsilon$:

$$u_i(\sigma_i, \sigma_{-i}^*) \le u_i(\sigma_i^*, \sigma_{-i}^*) + \epsilon \quad \text{for all } \sigma_i \in \Delta(S_i) \text{ and all players } i.$$
This relaxation is important computationally. While exact Nash equilibria are hard to compute, approximate equilibria may be tractable.
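The smallest $\epsilon$ for which a profile qualifies is its best-response gap: the largest unilateral improvement any player can find. A minimal sketch for a 2x2 game (payoff numbers here are illustrative, not from the text):

```python
import numpy as np

# Payoff matrices for a 2x2 coordination game (hypothetical numbers).
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])   # row player's payoffs
B = A.copy()                 # column player's payoffs (common interests)

def nash_gap(A, B, x, y):
    """Largest unilateral improvement available to either player,
    i.e. the smallest epsilon for which (x, y) is epsilon-Nash."""
    eps_row = (A @ y).max() - x @ A @ y      # best pure deviation for row
    eps_col = (B.T @ x).max() - x @ B @ y    # best pure deviation for column
    return max(eps_row, eps_col)

# The uniform profile is not an exact equilibrium of this game;
# its gap tells us it is a 0.25-Nash equilibrium.
x = y = np.array([0.5, 0.5])
print(nash_gap(A, B, x, y))   # -> 0.25
```

An exact Nash equilibrium is precisely a profile with gap zero, so this function doubles as an equilibrium checker.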
Support of a Mixed Strategy
The support of a mixed strategy $\sigma_i$ is the set of pure strategies played with positive probability: $\mathrm{supp}(\sigma_i) = \{ s_i \in S_i : \sigma_i(s_i) > 0 \}$.
At a Nash equilibrium, every pure strategy in the support of $\sigma_i^*$ yields the same expected payoff (player $i$ is indifferent among them). Any pure strategy outside the support yields an equal or lower expected payoff.
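The indifference condition is easy to check numerically. A short sketch for matching pennies, whose unique equilibrium is the uniform profile:

```python
import numpy as np

# Matching pennies: row player's payoffs; the column player gets the negative.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

sigma_col = np.array([0.5, 0.5])   # column player's equilibrium mixture

# Expected payoff of each pure row strategy against the column mixture.
row_payoffs = A @ sigma_col
print(row_payoffs)   # both entries equal -> row player is indifferent
```

Both pure strategies are in the support and earn the same expected payoff (here 0), exactly as the indifference condition requires.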
Existence Theorem
Nash Existence Theorem (1950)
Statement
Every finite game (finite players, finite strategies) has at least one Nash equilibrium in mixed strategies.
Intuition
The proof uses topology. The mixed strategy space is a product of simplices (compact and convex). The best-response correspondence maps this space to itself and satisfies the conditions of Kakutani's fixed-point theorem. A fixed point of the best-response correspondence is exactly a Nash equilibrium.
Proof Sketch
Let $\Sigma = \Delta(S_1) \times \cdots \times \Delta(S_n)$ be the product of all mixed strategy simplices.
Step 1. $\Sigma$ is a nonempty, compact, convex subset of Euclidean space.
Step 2. Define the best-response correspondence $B : \Sigma \rightrightarrows \Sigma$ by:

$$B(\sigma) = \prod_{i} B_i(\sigma_{-i}), \qquad B_i(\sigma_{-i}) = \operatorname*{arg\,max}_{\sigma_i \in \Delta(S_i)} u_i(\sigma_i, \sigma_{-i}).$$
Step 3. Verify Kakutani conditions:
- Nonempty values: The maximum of a continuous function on a compact set is attained.
- Convex values: If $\sigma_i$ and $\sigma_i'$ are both best responses to $\sigma_{-i}$, then $\lambda \sigma_i + (1 - \lambda)\sigma_i'$ is also a best response for any $\lambda \in [0, 1]$, because $u_i(\sigma_i, \sigma_{-i})$ is linear in $\sigma_i$ (multilinearity of expected payoffs).
- Closed graph: If $\sigma^{(k)} \to \sigma$ and $\tau^{(k)} \to \tau$ with $\tau^{(k)} \in B(\sigma^{(k)})$, continuity of the payoff functions $u_i$ implies $\tau \in B(\sigma)$.
Step 4. Kakutani's fixed-point theorem gives $\sigma^* \in B(\sigma^*)$, meaning every player simultaneously best-responds. This is a Nash equilibrium.
Why It Matters
The theorem guarantees that the equilibrium concept is not vacuous. For any finite game, rational agents have a consistent prediction of play. Nash's contribution was showing that a simple fixed-point argument suffices. The theorem earned Nash the 1994 Nobel Prize in Economics (shared with Harsanyi and Selten).
Failure Mode
The theorem says nothing about uniqueness, stability, or computability. Many games have multiple equilibria with different payoffs, making equilibrium selection a separate (and unsolved) problem. The proof is non-constructive: it guarantees existence without providing an algorithm. Computing Nash equilibria turns out to be PPAD-complete.
Uniqueness Conditions
Not all games have a unique Nash equilibrium. Sufficient conditions for uniqueness include:
- Strictly dominance-solvable games: iterated elimination of strictly dominated strategies leaves a single profile (e.g., Prisoner's Dilemma).
- Contraction games: If the best-response map is a contraction in some metric, the fixed point is unique by Banach's theorem.
- Potential games with a strictly concave potential: The potential has a unique maximizer.
- Two-player zero-sum games: The equilibrium value is unique, though the equilibrium strategies may not be.
Computational Complexity
PPAD (Polynomial Parity Arguments on Directed graphs)
A complexity class containing problems whose solutions are guaranteed to exist by parity arguments on directed graphs. The canonical PPAD-complete problem is END-OF-LINE: given a directed graph in which every vertex has in-degree and out-degree at most 1, specified implicitly by circuits, and given one endpoint of a path, find another endpoint.
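On a tiny, explicitly given graph the problem is trivial: just walk the path. The hardness in PPAD comes entirely from the graph being exponentially large and described implicitly by circuits, so this walk takes exponential time in the input size. A toy sketch (graph assumed for illustration):

```python
# A directed path 0 -> 1 -> 2 -> 3, represented as a successor map.
# Every vertex has in-degree and out-degree at most 1.
succ = {0: 1, 1: 2, 2: 3}

def end_of_line(succ, start):
    """Given one endpoint of a path, follow successors until the path
    ends. The other endpoint must exist by the parity argument."""
    v = start
    while v in succ:
        v = succ[v]
    return v

print(end_of_line(succ, 0))   # -> 3
```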
Nash Equilibrium is PPAD-Complete
Statement
Computing a Nash equilibrium of a two-player game is PPAD-complete (Daskalakis, Goldberg, and Papadimitriou, 2009; Chen and Deng, 2006). This holds even for finding an $\epsilon$-Nash equilibrium when $\epsilon$ is exponentially small in the size of the game.
Intuition
PPAD-completeness means that finding a Nash equilibrium is at least as hard as any problem in PPAD. Since PPAD is widely believed not to be contained in P, there is likely no polynomial-time algorithm for computing Nash equilibria of general games. The hardness arises from the topological nature of the existence proof: the equilibrium exists by a fixed-point argument, but finding the fixed point requires exploring an exponentially large space.
Proof Sketch
The proof reduces the problem of finding an approximate Brouwer fixed point (known to be PPAD-complete) to finding a Nash equilibrium. Given a continuous function $f : [0,1]^n \to [0,1]^n$, construct a game whose Nash equilibria correspond to approximate fixed points of $f$. The construction uses a graphical game encoding the function evaluation via player interactions.
Why It Matters
This result has practical consequences. No known algorithm computes Nash equilibria of general games in polynomial time. The Lemke-Howson algorithm for two-player games is exponential in the worst case. Support enumeration is exponential. For large games (e.g., poker with information sets), exact Nash computation is intractable, motivating approximate methods like counterfactual regret minimization (CFR).
Failure Mode
PPAD-completeness applies to the problem of finding any Nash equilibrium. For specific game structures (zero-sum, potential games, congestion games), polynomial-time algorithms exist. Zero-sum games reduce to linear programming. Potential games can be solved by maximizing the potential function.
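The zero-sum special case really is easy: the minimax value and an optimal strategy fall out of a single linear program. A sketch using SciPy (the formulation is the standard LP for the row player; variable names are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def zero_sum_value(A):
    """Solve max_x min_j sum_i x_i A[i, j] by LP.
    Decision variables: the row mixture x and the game value v."""
    m, n = A.shape
    c = np.concatenate([np.zeros(m), [-1.0]])        # minimize -v
    # For every column j:  v - sum_i x_i A[i, j] <= 0.
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Probabilities sum to one; v is unconstrained.
    A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
    bounds = [(0, None)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=[1.0], bounds=bounds)
    return res.x[:m], res.x[m]

# Matching pennies: optimal mixture is uniform, value is 0.
x, v = zero_sum_value(np.array([[1.0, -1.0], [-1.0, 1.0]]))
print(x, v)
```

By LP duality, the column player's program gives the same value, which is exactly the minimax theorem for this case.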
Equilibrium Refinements
Not all Nash equilibria are equally plausible. Refinements impose additional requirements:
Subgame Perfect Equilibrium
In extensive-form games (games with sequential moves), a strategy profile is subgame perfect if it induces a Nash equilibrium in every subgame. This eliminates Nash equilibria that rely on non-credible threats. Found by backward induction in finite games of perfect information.
Trembling-Hand Perfect Equilibrium
A Nash equilibrium is trembling-hand perfect if it is the limit of a sequence of totally mixed strategy profiles (every strategy played with positive probability) that are best responses to each other. This eliminates equilibria that rely on the assumption that opponents never make mistakes.
Correlated Equilibrium
A correlated equilibrium is a probability distribution $p$ over strategy profiles such that, if a mediator samples $s \sim p$ and privately tells each player their component $s_i$, no player wants to deviate. Formally, for each player $i$ and each pair $s_i, s_i'$ with $p(s_i) > 0$:

$$\sum_{s_{-i}} p(s_i, s_{-i}) \left[ u_i(s_i, s_{-i}) - u_i(s_i', s_{-i}) \right] \ge 0.$$
Every Nash equilibrium is a correlated equilibrium (with product distribution), but correlated equilibria can achieve higher total payoffs. A correlated equilibrium can be computed in polynomial time via a linear program, unlike Nash equilibria.
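The incentive constraints are linear in the distribution $p$, which is why correlated equilibria are LP-computable. A sketch that checks the constraints for Aumann's classic Chicken example (payoffs as in that standard example; row/column 0 = Dare, 1 = Chicken):

```python
import numpy as np

def is_correlated_eq(u1, u2, p, tol=1e-9):
    """Check the correlated-equilibrium incentive constraints for a
    two-player game with payoff matrices u1, u2 and joint distribution p."""
    m, n = p.shape
    for i in range(m):           # row player told to play i ...
        for k in range(m):       # ... considers deviating to k
            if np.dot(p[i], u1[i] - u1[k]) < -tol:
                return False
    for j in range(n):           # column player told to play j ...
        for k in range(n):
            if np.dot(p[:, j], u2[:, j] - u2[:, k]) < -tol:
                return False
    return True

u1 = np.array([[0.0, 7.0], [2.0, 6.0]])   # row payoffs in Chicken
u2 = np.array([[0.0, 2.0], [7.0, 6.0]])   # column payoffs
p = np.array([[0.0, 1/3], [1/3, 1/3]])    # 1/3 each on (D,C), (C,D), (C,C)
print(is_correlated_eq(u1, u2, p))        # -> True
```

This distribution gives each player expected payoff 5, more than either player gets in the game's mixed Nash equilibrium, illustrating how correlation can improve on Nash payoffs.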
Price of Anarchy
Price of Anarchy (PoA)
For a game with a social welfare function $W$, the price of anarchy is:

$$\mathrm{PoA} = \frac{\max_{\sigma} W(\sigma)}{\min_{\sigma \in \mathrm{NE}} W(\sigma)}.$$
It measures how much social welfare is lost in the worst Nash equilibrium compared to the social optimum. $\mathrm{PoA} = 1$ means all equilibria are socially optimal. A large PoA means selfish behavior can be very costly.
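A tiny worked instance: for the Prisoner's Dilemma with one standard payoff choice (assumed here) and utilitarian welfare, the price of anarchy is 3.

```python
# Prisoner's Dilemma, standard payoffs (assumed): C = cooperate, D = defect.
payoffs = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

# Utilitarian social welfare: sum of the two players' payoffs.
welfare = {profile: sum(u) for profile, u in payoffs.items()}

opt = max(welfare.values())        # social optimum: (C, C) with welfare 6
worst_eq = welfare[("D", "D")]     # the unique Nash equilibrium, welfare 2
print(opt / worst_eq)              # PoA = 3.0
```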
Braess's Paradox
Consider the standard example: 4000 drivers travel from A to D over two routes, A-B-D and A-C-D. Edges A-B and C-D take $T/100$ minutes when $T$ drivers use them; edges B-D and A-C take a flat 45 minutes. In equilibrium the drivers split 2000/2000 and each takes $20 + 45 = 65$ minutes. Adding a shortcut from B to C with zero travel time creates a new equilibrium in which all drivers use A-B-C-D and each takes $40 + 0 + 40 = 80$ minutes, so total travel time increases and the price of anarchy exceeds 1.
This paradox shows that adding capacity to a network can make the equilibrium worse. It demonstrates why Nash equilibria can be socially inefficient and why network design must account for selfish routing.
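The equilibrium travel times can be checked directly, using the standard cost functions for this network (assumed here: edges A-B and C-D take $T/100$ minutes with $T$ drivers, edges B-D and A-C take a flat 45 minutes):

```python
N = 4000   # total drivers from A to D

def route_times_without_shortcut(n_top):
    """Travel times when n_top drivers take A-B-D and the rest A-C-D."""
    n_bot = N - n_top
    return n_top / 100 + 45, 45 + n_bot / 100

# Equal split is the equilibrium: both routes take 65 minutes,
# and no driver can do better by switching.
print(route_times_without_shortcut(2000))    # (65.0, 65.0)

# With the zero-cost B->C shortcut, A->B costs at most 40 < 45 minutes
# (and likewise C->D), so the flat 45-minute edges are never preferred:
# all drivers take A-B-C-D.
time_with_shortcut = N / 100 + 0 + N / 100
print(time_with_shortcut)                    # 80.0 > 65.0
```

Every driver is individually rational at each step, yet adding the free edge makes everyone 15 minutes slower.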
Connections to ML
GAN training dynamics. The GAN objective defines a two-player zero-sum game. The Nash equilibrium satisfies $p_G = p_{\mathrm{data}}$, at which point the optimal discriminator outputs $D(x) = 1/2$ everywhere. In practice, simultaneous gradient descent often fails to converge to this equilibrium. The dynamics can cycle, diverge, or reach limit cycles. Techniques like spectral normalization, two-time-scale updates, and gradient penalties improve convergence toward the equilibrium.
Multi-agent RL. In Markov games, agents simultaneously learn policies via self-play and multi-agent RL. Independent Q-learning does not converge to Nash equilibria in general-sum games. Algorithms like Nash-Q (Hu and Wellman, 2003) compute Nash equilibria at each stage but require solving a game at every step, which is expensive. For zero-sum Markov games, minimax-Q learning converges under standard conditions.
Mechanism design. The revelation principle shows that for any Nash equilibrium of any mechanism, there exists a direct mechanism where truth-telling is a Nash equilibrium with the same outcome. This connects Nash equilibria to the design of incentive-compatible mechanisms.
Common Confusions
Nash equilibrium is not a prediction of what will happen
Nash equilibrium is a consistency requirement: if all players believe others will play the equilibrium, no one wants to deviate. It does not explain how players arrive at the equilibrium. In games with multiple equilibria, the theory provides no selection mechanism. Experimental evidence shows that human players often do not play Nash equilibria, especially in complex games.
Computing Nash equilibrium is hard, but special cases are easy
PPAD-completeness applies to general games. Zero-sum games reduce to LP (solvable in polynomial time). Potential games can be solved by convex optimization. Congestion games always have pure Nash equilibria found by best-response dynamics. Before declaring a problem hard, check if it has special structure.
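As an illustration of the congestion-game claim, best-response dynamics on a toy two-resource congestion game (number of players and cost function assumed here) terminates at a pure Nash equilibrium:

```python
def best_response_dynamics(n_players=6):
    """Each player picks resource 0 or 1 and pays the number of players
    sharing their resource. This is a potential game, so asynchronous
    best-response updates must terminate at a pure Nash equilibrium."""
    choices = [0] * n_players          # start with everyone on resource 0
    changed = True
    while changed:
        changed = False
        for i in range(n_players):
            loads = [choices.count(0), choices.count(1)]
            current, other = choices[i], 1 - choices[i]
            # Cost now vs. cost after joining the other resource.
            if loads[other] + 1 < loads[current]:
                choices[i] = other
                changed = True
    return choices

print(sorted(best_response_dynamics()))   # balanced split across resources
```

Termination is guaranteed because each improving move strictly decreases the Rosenthal potential, which can only take finitely many values.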
Mixed Nash equilibria are fragile
In a mixed Nash equilibrium, each player is exactly indifferent between all strategies in their support. Any small change in payoffs can change the equilibrium drastically. This makes mixed equilibria sensitive to modeling assumptions. Pure Nash equilibria (when they exist) are more robust to perturbations.
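The fragility is concrete in 2x2 games, where the row player's equilibrium mixture is pinned down entirely by the column player's payoffs. A sketch (payoff numbers assumed) showing how a 10% bump to one payoff entry shifts the equilibrium mixture:

```python
def row_mix_making_column_indifferent(B):
    """Probability x on row 0 such that the column player is indifferent:
    x*B[0][0] + (1-x)*B[1][0] == x*B[0][1] + (1-x)*B[1][1]."""
    num = B[1][1] - B[1][0]
    den = (B[0][0] - B[1][0]) - (B[0][1] - B[1][1])
    return num / den

B = [[1.0, 0.0], [0.0, 1.0]]             # column player's payoffs
print(row_mix_making_column_indifferent(B))            # 0.5

B_perturbed = [[1.1, 0.0], [0.0, 1.0]]   # 10% bump to one entry
print(row_mix_making_column_indifferent(B_perturbed))  # ~0.476
```

Note the counterintuitive direction of dependence: perturbing player 2's payoffs moves player 1's equilibrium strategy, a direct consequence of the indifference condition.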
Exercises
Problem
Consider a three-player game where each player chooses an action in $\{0, 1\}$. Player $i$ receives payoff 1 if the majority choice matches their own choice, and 0 otherwise. Find all pure strategy Nash equilibria.
Problem
Show that every strict Nash equilibrium is a trembling-hand perfect equilibrium.
Problem
Prove that every finite potential game has at least one pure strategy Nash equilibrium.
Problem
The Lemke-Howson algorithm finds a Nash equilibrium of a two-player game by following a path of complementary pivots. Show that the path length can be exponential by describing the Savani-von Stengel construction, which produces $d \times d$ games on which Lemke-Howson takes a number of steps exponential in $d$.
References
Canonical:
- Nash, "Non-Cooperative Games" (1951), Annals of Mathematics
- Osborne & Rubinstein, A Course in Game Theory (1994), Chapters 2-4
- Fudenberg & Tirole, Game Theory (1991), Chapters 1-2, 11-12
Computational:
- Daskalakis, Goldberg & Papadimitriou, "The Complexity of Computing a Nash Equilibrium" (2009), SIAM Journal on Computing
- Chen & Deng, "Settling the Complexity of Two-Player Nash Equilibrium" (2006), FOCS
- Savani & von Stengel, "Hard-to-Solve Bimatrix Games" (2006), Econometrica
ML Connections:
- Hu & Wellman, "Nash Q-Learning for General-Sum Stochastic Games" (2003), JMLR
- Goodfellow et al., "Generative Adversarial Nets" (2014), NeurIPS
Next Topics
Building on Nash equilibrium theory:
- Mechanism design: designing games with desirable equilibria
- Minimax theorem: the special structure of equilibria in zero-sum games
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Game Theory Foundations (Layer 2)
- Common Probability Distributions (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)
- Convex Optimization Basics (Layer 1)
- Differentiation in R^n (Layer 0A)
- Matrix Operations and Properties (Layer 0A)
Builds on This
- Auction Theory (Layer 3)
- Mechanism Design (Layer 3)