Decision Theory
Game Theory Foundations
Strategic interaction between rational agents. Normal-form games, dominant strategies, Nash equilibrium existence, mixed strategies, and connections to minimax, mechanism design, and multi-agent RL.
Why This Matters
Any setting where multiple agents optimize simultaneously is a game. GAN training is a two-player game between the generator and discriminator. Multi-agent reinforcement learning agents play games against each other. Auction design, federated learning incentive structures, and adversarial robustness all require game-theoretic reasoning. Without this vocabulary, you cannot state what "equilibrium" means for these systems, let alone analyze convergence or efficiency.
Game theory provides the formal language for strategic interaction: what rational agents should do when their payoffs depend on the actions of others.
Mental Model
A game specifies (1) who the players are, (2) what actions each player can take, and (3) what payoff each player receives for each combination of actions. The central question: given that every player is rational and knows the game structure, what will they do? The answer, in most cases, is a Nash equilibrium.
Normal-Form Games
Normal-Form Game
A normal-form (strategic-form) game consists of:
- A finite set of players $N = \{1, \dots, n\}$
- For each player $i \in N$, a finite set of pure strategies $S_i$
- For each player $i \in N$, a payoff function $u_i : S_1 \times \cdots \times S_n \to \mathbb{R}$

A strategy profile is a tuple $s = (s_1, \dots, s_n)$. Player $i$ receives payoff $u_i(s)$ when profile $s$ is played. In two-player games, payoffs are often written as a bimatrix $(A, B)$, where $A_{ij}$ is player 1's payoff and $B_{ij}$ is player 2's payoff when player 1 plays row $i$ and player 2 plays column $j$.
Dominant Strategy
A strategy $s_i^* \in S_i$ is a strictly dominant strategy for player $i$ if for every alternative strategy $s_i \neq s_i^*$ and every opponent profile $s_{-i}$:

$$u_i(s_i^*, s_{-i}) > u_i(s_i, s_{-i})$$

If the inequality is weak ($\geq$, with strict inequality for at least one $s_{-i}$), the strategy is weakly dominant. A strictly dominant strategy is uniquely optimal regardless of what others do.
Best Response
Strategy $s_i^*$ is a best response to the opponent profile $s_{-i}$ if:

$$u_i(s_i^*, s_{-i}) \geq u_i(s_i, s_{-i}) \quad \text{for all } s_i \in S_i$$

The best-response correspondence $BR_i(s_{-i})$ maps each opponent profile to the set of best responses. This correspondence is central to the existence proof for Nash equilibria.
Iterated Elimination of Dominated Strategies (IEDS)
Repeatedly remove strictly dominated strategies from every player's strategy set. The order of elimination does not affect the final result for strict dominance. If IEDS leaves a unique strategy profile, the game is dominance solvable.
Canonical Examples
Prisoner's Dilemma
Two suspects choose independently to cooperate (stay silent) or defect (betray). Payoff matrix (row player's payoff listed first):

| | Cooperate | Defect |
|---|---|---|
| Cooperate | $(-1, -1)$ | $(-3, 0)$ |
| Defect | $(0, -3)$ | $(-2, -2)$ |

Defect is strictly dominant for both players: regardless of the other's choice, defecting yields a higher payoff. The unique Nash equilibrium is (Defect, Defect) with payoffs $(-2, -2)$. This is Pareto-dominated by (Cooperate, Cooperate) at $(-1, -1)$. Rational self-interest leads to a collectively worse outcome.
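The elimination logic can be checked mechanically. Below is a short Python sketch (the `ieds` helper is illustrative, not from any particular library) that runs iterated elimination of strictly dominated pure strategies on a bimatrix game and confirms that only (Defect, Defect) survives:

```python
import numpy as np

def ieds(A, B):
    """Iterated elimination of strictly dominated pure strategies in a
    bimatrix game. A[i, j] is player 1's payoff and B[i, j] is player 2's
    when player 1 plays row i and player 2 plays column j.
    Returns the surviving row and column indices."""
    rows, cols = list(range(A.shape[0])), list(range(A.shape[1]))
    changed = True
    while changed:
        changed = False
        # A row is strictly dominated if some other surviving row beats it
        # against every surviving column.
        for r in rows[:]:
            if any(all(A[r2, c] > A[r, c] for c in cols) for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        # Symmetrically for columns, using player 2's payoffs.
        for c in cols[:]:
            if any(all(B[r, c2] > B[r, c] for r in rows) for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Prisoner's Dilemma with index 0 = Cooperate, 1 = Defect.
A = np.array([[-1, -3], [0, -2]])   # player 1's payoffs
B = A.T                             # symmetric game
print(ieds(A, B))                   # → ([1], [1]): only (Defect, Defect) survives
```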
Battle of the Sexes
Two players prefer to coordinate but disagree on which activity. Payoff matrix:

| | Opera | Football |
|---|---|---|
| Opera | $(2, 1)$ | $(0, 0)$ |
| Football | $(0, 0)$ | $(1, 2)$ |

Two pure Nash equilibria: (Opera, Opera) and (Football, Football). There is also a mixed Nash equilibrium where player 1 plays Opera with probability $2/3$ and player 2 plays Opera with probability $1/3$. The mixed equilibrium gives expected payoff $2/3$ to each player, worse than either pure equilibrium. This illustrates that mixed equilibria can be inefficient.
Matching Pennies
A strictly competitive (zero-sum) game with no pure Nash equilibrium:

| | Heads | Tails |
|---|---|---|
| Heads | $(1, -1)$ | $(-1, 1)$ |
| Tails | $(-1, 1)$ | $(1, -1)$ |

At every pure strategy profile, one of the players has a profitable deviation, so no pure equilibrium exists. The unique Nash equilibrium is mixed: both players play Heads with probability $1/2$. The expected payoff is $0$ for both players. This is the simplest example showing why mixed strategies are necessary.
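For $2 \times 2$ games with a fully mixed equilibrium, the equilibrium probabilities follow directly from the indifference conditions: each player mixes so that the *opponent* is indifferent. A minimal sketch (the helper name `mixed_2x2` is ours, not a library function):

```python
def mixed_2x2(A, B):
    """Fully mixed equilibrium of a 2x2 bimatrix game via indifference.
    A and B are the payoff matrices of players 1 and 2 (rows indexed by
    player 1's strategies). Returns (p, q): the probability that player 1
    plays row 0 and player 2 plays column 0."""
    # q makes player 1 indifferent between rows 0 and 1.
    q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # p makes player 2 indifferent between columns 0 and 1.
    p = (B[1][1] - B[1][0]) / (B[0][0] - B[1][0] - B[0][1] + B[1][1])
    return p, q

# Matching Pennies: row/column 0 = Heads.
print(mixed_2x2([[1, -1], [-1, 1]], [[-1, 1], [1, -1]]))   # → (0.5, 0.5)

# Battle of the Sexes: row/column 0 = Opera.
p, q = mixed_2x2([[2, 0], [0, 1]], [[1, 0], [0, 2]])
print(p, q)   # player 1 plays Opera with prob 2/3, player 2 with prob 1/3
```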
Mixed Strategies
Mixed Strategy
A mixed strategy for player $i$ is a probability distribution $\sigma_i$ over the pure strategy set $S_i$. The set of all mixed strategies is the simplex $\Delta(S_i) = \{\sigma_i \in \mathbb{R}^{|S_i|} : \sigma_i(s_i) \geq 0, \ \sum_{s_i \in S_i} \sigma_i(s_i) = 1\}$.

The expected payoff under mixed strategy profile $\sigma = (\sigma_1, \dots, \sigma_n)$ is:

$$u_i(\sigma) = \sum_{s \in S} \left( \prod_{j=1}^{n} \sigma_j(s_j) \right) u_i(s)$$

A key property: $u_i(\sigma)$ is multilinear in the mixed strategies. It is linear in $\sigma_i$ for fixed $\sigma_{-i}$.
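The formula can be evaluated directly by summing over pure profiles weighted by the product of mixing probabilities. A small illustrative implementation (not from any library), checked on Matching Pennies under the 50-50 equilibrium:

```python
import itertools
import numpy as np

def expected_payoff(payoffs, sigmas):
    """Expected payoff vector under a mixed profile.
    payoffs: array of shape (n_players, |S_1|, ..., |S_n|);
    sigmas: one probability vector per player."""
    total = np.zeros(len(sigmas))
    for profile in itertools.product(*(range(len(s)) for s in sigmas)):
        # Probability of this pure profile = product of each player's
        # probability of their own action.
        prob = np.prod([sigmas[i][profile[i]] for i in range(len(sigmas))])
        total += prob * payoffs[(slice(None),) + profile]
    return total

# Matching Pennies: under the 50-50 mixed equilibrium both players get 0.
payoffs = np.array([[[1, -1], [-1, 1]],     # player 1
                    [[-1, 1], [1, -1]]])    # player 2
print(expected_payoff(payoffs, [[0.5, 0.5], [0.5, 0.5]]))   # → [0. 0.]
```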
Nash Equilibrium (Mixed)
A mixed strategy profile $\sigma^* = (\sigma_1^*, \dots, \sigma_n^*)$ is a Nash equilibrium if no player can improve their expected payoff by unilateral deviation:

$$u_i(\sigma_i^*, \sigma_{-i}^*) \geq u_i(\sigma_i, \sigma_{-i}^*) \quad \text{for every player } i \text{ and every } \sigma_i \in \Delta(S_i)$$

Equivalently, every player's mixed strategy is a best response to the others'. A pure strategy Nash equilibrium is the special case where all $\sigma_i^*$ are degenerate (point mass on a single strategy).
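Because expected payoff is linear in a player's own mixture, verifying a candidate equilibrium only requires checking pure-strategy deviations. A sketch for two-player games (the helper name `is_nash` is ours), tested on Battle of the Sexes with the standard $(2,1)/(1,2)$ payoffs:

```python
import numpy as np

def is_nash(A, B, x, y, tol=1e-9):
    """Check whether the mixed profile (x, y) is a Nash equilibrium of the
    bimatrix game (A, B). Linearity in each player's own mixture means it
    suffices to compare against pure-strategy deviations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    u1 = x @ A @ y            # player 1's expected payoff
    u2 = x @ B @ y            # player 2's expected payoff
    # Best pure deviation for each player must not beat the current payoff.
    return (A @ y).max() <= u1 + tol and (x @ B).max() <= u2 + tol

# Battle of the Sexes: row/column 0 = Opera.
A = np.array([[2, 0], [0, 1]])
B = np.array([[1, 0], [0, 2]])
print(is_nash(A, B, [1, 0], [1, 0]))            # (Opera, Opera) → True
print(is_nash(A, B, [2/3, 1/3], [1/3, 2/3]))    # mixed equilibrium → True
print(is_nash(A, B, [1, 0], [0, 1]))            # miscoordination → False
```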
Nash's Existence Theorem
Nash Existence Theorem
Statement
Every finite normal-form game (finite players, finite strategy sets) has at least one Nash equilibrium in mixed strategies.
Intuition
Each player's best-response correspondence maps the opponents' mixed strategies (a compact convex set) to a convex subset of the player's own mixed strategy simplex. The combined best-response correspondence maps the product of simplices to itself. Kakutani's fixed-point theorem guarantees a fixed point, which is precisely a Nash equilibrium: a profile where every player is simultaneously best-responding.
Proof Sketch
Define the combined strategy space $\Sigma = \Delta(S_1) \times \cdots \times \Delta(S_n)$. This is a nonempty, compact, convex subset of $\mathbb{R}^{\sum_i |S_i|}$.
Define the best-response correspondence $B : \Sigma \rightrightarrows \Sigma$ by $B(\sigma) = BR_1(\sigma_{-1}) \times \cdots \times BR_n(\sigma_{-n})$.
Verify the conditions of Kakutani's fixed-point theorem:
- $\Sigma$ is nonempty, compact, and convex.
- $B(\sigma)$ is nonempty (finite games always have best responses).
- $B(\sigma)$ is convex (if $\sigma_i$ and $\sigma_i'$ are both best responses, any mixture of them is also a best response because $u_i$ is linear in $\sigma_i$).
- $B$ has a closed graph (by continuity of $u_i$ in all strategies).
By Kakutani's theorem, there exists $\sigma^* \in B(\sigma^*)$, which is a Nash equilibrium.
Why It Matters
This theorem guarantees that the equilibrium concept is never vacuous for finite games. Every finite game has a well-defined prediction of rational play. In ML, this means that the Nash equilibrium of the GAN game exists (in mixed strategies over generator and discriminator parameters, though the strategy spaces are infinite in practice). It also justifies the study of multi-agent RL equilibria.
Failure Mode
The theorem guarantees existence but says nothing about uniqueness, computational tractability, or stability. A game may have exponentially many Nash equilibria. Computing a single Nash equilibrium of a general game is PPAD-complete, so no known polynomial-time algorithm exists. The equilibrium may also be unstable: small perturbations can cause the dynamics to diverge. See Nash equilibrium for these complications.
Zero-Sum Games and Minimax
Zero-Sum Game
A two-player game is zero-sum if $u_1(s) + u_2(s) = 0$ for every strategy profile $s$. One player's gain is the other's loss. The payoff is fully determined by a single matrix $A$: when player 1 plays mixed strategy $x$ and player 2 plays $y$, player 1 receives $x^\top A y$ and player 2 receives $-x^\top A y$.
In zero-sum games, the Nash equilibrium has a special structure. Player 1 maximizes the minimum guaranteed payoff (maximin), and player 2 minimizes the maximum payoff player 1 can achieve (minimax). Von Neumann's minimax theorem shows that these values are equal:

$$\max_x \min_y x^\top A y = \min_y \max_x x^\top A y = v$$

This common value $v$ is the value of the game. The minimax theorem is the foundational result of game theory and connects to LP duality: computing the minimax strategies is a linear program.
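The LP connection can be made concrete: the row player's maximin strategy solves a small linear program. A sketch using `scipy.optimize.linprog` (the variable layout and helper name are our choices), checked on Matching Pennies:

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(A):
    """Value and maximin strategy for the row player of zero-sum game A,
    via the LP: maximize v subject to (x^T A)_j >= v for every column j,
    with x a probability vector. Decision variables: (x_1, ..., x_m, v)."""
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                                   # minimize -v = maximize v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])      # v - (x^T A)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                              # sum_i x_i = 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]      # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Matching Pennies: value 0, maximin strategy (1/2, 1/2).
x, v = solve_zero_sum(np.array([[1.0, -1.0], [-1.0, 1.0]]))
print(np.round(x, 3), round(v, 3))
```

Solving the same LP from the column player's side yields the dual program; strong LP duality is exactly the statement that the two values coincide.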
Connections to Machine Learning
GANs. The GAN objective is a two-player zero-sum game with value function $V(G, D) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$. The generator minimizes $V$ and the discriminator maximizes $V$. The optimal discriminator theorem characterizes the Nash equilibrium: $D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}$.
Multi-agent RL. When multiple RL agents interact in a shared environment, the joint optimization is a stochastic game (Markov game). Nash equilibria of the stage game determine equilibrium policies. Convergence of independent learners to Nash equilibrium is not guaranteed in general.
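A classic illustration of learning dynamics is fictitious play, where each agent best-responds to the empirical frequency of the opponent's past actions. In two-player zero-sum games the empirical frequencies converge to a minimax equilibrium (Robinson's theorem); in general games they need not. A minimal sketch on Matching Pennies:

```python
import numpy as np

def fictitious_play(A, steps=20000):
    """Fictitious play on a zero-sum game with row-player payoff matrix A.
    Each player best-responds to the empirical frequency of the opponent's
    past actions. Returns the empirical strategy frequencies."""
    m, n = A.shape
    counts1, counts2 = np.ones(m), np.ones(n)   # pseudo-count of 1 per action
    for _ in range(steps):
        a1 = np.argmax(A @ (counts2 / counts2.sum()))     # row best response
        a2 = np.argmin((counts1 / counts1.sum()) @ A)     # column best response
        counts1[a1] += 1
        counts2[a2] += 1
    return counts1 / counts1.sum(), counts2 / counts2.sum()

# Matching Pennies: empirical frequencies approach (1/2, 1/2) for both players.
x, y = fictitious_play(np.array([[1.0, -1.0], [-1.0, 1.0]]))
print(np.round(x, 2), np.round(y, 2))
```

Note that the *actions* cycle deterministically; it is only the long-run frequencies that approach the mixed equilibrium, which is one reason independent learners can be hard to interpret as "converged".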
Adversarial robustness. Adversarial attacks frame robustness as a game between a classifier and an adversary who perturbs inputs. The robust optimization formulation is a minimax problem.
Shapley value. The Shapley value from cooperative game theory assigns a fair attribution of value to each player (feature). SHAP values for model interpretability are exactly Shapley values of a cooperative game defined by the model's predictions.
Common Confusions
Nash equilibrium does not mean optimal outcome
The Prisoner's Dilemma demonstrates that Nash equilibria can be Pareto-dominated. Rational play by all agents does not guarantee a collectively good outcome. The Nash equilibrium is a stability concept (no one wants to deviate), not an efficiency concept. The gap between the Nash equilibrium and the social optimum is measured by the price of anarchy.
Mixed strategies are not randomization for its own sake
A mixed strategy equilibrium means that each player is exactly indifferent between all strategies in the support of their mixture. The mixing probabilities are determined by the requirement that the opponent be indifferent. In Matching Pennies, player 1 mixes 50-50 not because randomness helps them directly, but because this is the only way to prevent the opponent from exploiting a predictable pattern.
Dominant strategy equilibrium is much stronger than Nash
A dominant strategy equilibrium means each player has a strategy that is optimal regardless of what others do. Nash equilibrium only requires each strategy to be optimal given what others are doing. Dominant strategy equilibria are rare and do not require players to know or predict opponent strategies. The Prisoner's Dilemma has a dominant strategy equilibrium. Most games do not.
Exercises
Problem
In the Battle of the Sexes game above, compute the mixed strategy Nash equilibrium. Let player 1 play Opera with probability $p$ and player 2 play Opera with probability $q$. Find $p$ and $q$ and the expected payoff to each player.
Problem
Consider a two-player zero-sum game with payoff matrix $A$. Does either player have a dominant strategy? Find the value of the game and the mixed strategy Nash equilibrium.
Problem
Prove that in any two-player zero-sum game with payoff matrix $A$, the set of Nash equilibrium strategies for player 1 is a convex polytope. Show that if $x$ and $x'$ are both equilibrium strategies for player 1, then any convex combination $\lambda x + (1 - \lambda) x'$ for $\lambda \in [0, 1]$ is also an equilibrium strategy for player 1.
Problem
The Shapley value for player $i$ in a cooperative game with characteristic function $v : 2^N \to \mathbb{R}$ is:

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (|N| - |S| - 1)!}{|N|!} \left[ v(S \cup \{i\}) - v(S) \right]$$

Prove that the Shapley value satisfies the efficiency axiom: $\sum_{i \in N} \phi_i(v) = v(N)$.
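The permutation form of the Shapley value (averaging each player's marginal contribution over all orderings of the players, which is equivalent to the coalition-weighted sum above) makes the efficiency axiom easy to verify numerically. A small sketch on a hypothetical three-player glove game:

```python
from itertools import permutations

def shapley(n, v):
    """Shapley values by averaging marginal contributions over all n!
    orderings. v maps a frozenset of players to the coalition's value."""
    phi = [0.0] * n
    perms = list(permutations(range(n)))
    for order in perms:
        coalition = set()
        for i in order:
            before = v(frozenset(coalition))
            coalition.add(i)
            phi[i] += v(frozenset(coalition)) - before   # marginal contribution
    return [p / len(perms) for p in phi]

# Glove game: player 0 holds a left glove, players 1 and 2 hold right
# gloves; a matched pair is worth 1.
def v(S):
    return 1.0 if 0 in S and len(S) >= 2 else 0.0

phi = shapley(3, v)
print(phi)                   # ≈ [2/3, 1/6, 1/6]
print(sum(phi))              # efficiency: the values sum to v(N) = 1
```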
References
Canonical:
- Osborne & Rubinstein, A Course in Game Theory (1994), Chapters 1-3
- Fudenberg & Tirole, Game Theory (1991), Chapters 1-2, 11
- Myerson, Game Theory: Analysis of Conflict (1991), Chapters 1-3
Current:
- Shoham & Leyton-Brown, Multiagent Systems (2009), Chapters 3-4, 13
- Nisan, Roughgarden, Tardos & Vazirani, Algorithmic Game Theory (2007), Chapters 1-2
- Goodfellow et al., "Generative Adversarial Nets" (2014), Section 4 (game-theoretic analysis)
Next Topics
Natural extensions from game theory foundations:
- Nash equilibrium: existence proofs, computational complexity, refinements, price of anarchy
- Mechanism design: inverse game theory, designing rules for self-interested agents
- Minimax theorem: the fundamental theorem of zero-sum games and its connections to LP duality
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Common Probability Distributions (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)
- Convex Optimization Basics (Layer 1)
- Differentiation in $\mathbb{R}^n$ (Layer 0A)
- Matrix Operations and Properties (Layer 0A)
Builds on This
- Auction Theory (Layer 3)
- Commons Governance and Institutional Analysis (Layer 3)
- Mechanism Design (Layer 3)
- Von Neumann Minimax Theorem (Layer 2)
- Nash Equilibrium (Layer 2)