
Algorithms Foundations

P vs NP

A central open problem in computer science: is every problem whose solution can be verified in polynomial time also solvable in polynomial time? Covers P, NP, NP-completeness, reductions, Cook-Levin theorem, and relevance for ML.


Why This Matters

The P vs NP question is not just the biggest open problem in computer science: it is one of the seven Millennium Prize Problems, with a one-million-dollar bounty from the Clay Mathematics Institute. If someone proved P = NP, public-key cryptography would break, most optimization problems would become tractable, and the entire landscape of AI would shift overnight.

For ML practitioners specifically: many core ML problems are NP-hard, including feature selection, optimal clustering, neural architecture search, and finding the global minimum of a non-convex loss. Understanding P vs NP tells you why these problems are hard and why we rely on heuristics and approximations.

Mental Model

Think of it this way. Solving a Sudoku puzzle from scratch is hard. But if someone hands you a completed Sudoku, checking that it is valid takes a quick scan. P vs NP asks: is there a fundamental gap between finding solutions and checking solutions? Or is checking always just as hard as finding, once you are clever enough?

Formal Setup and Notation

We work with decision problems: problems whose answer is YES or NO. The input is encoded as a binary string of length n.

Definition

The Class P

P is the class of decision problems solvable by a deterministic Turing machine in time O(n^k) for some constant k.

Informally: problems where you can find the answer efficiently. Examples: sorting, shortest paths, linear programming, primality testing.

Definition

The Class NP

NP is the class of decision problems where a YES answer can be verified in polynomial time given a certificate (witness) of polynomial length.

Formally, L ∈ NP if there exists a polynomial-time verifier V and a polynomial p such that:

x ∈ L ⟺ ∃ w with |w| ≤ p(|x|) such that V(x, w) = 1

Informally: problems where you can check a proposed solution efficiently. Examples: SAT, graph coloring, Hamiltonian cycle, subset sum.
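To make "check a proposed solution efficiently" concrete, here is a minimal sketch of a polynomial-time verifier for Subset Sum. The function name and the certificate encoding (a list of indices into the input list) are illustrative choices, not a standard API.

```python
def verify_subset_sum(nums, target, certificate):
    """Check a proposed Subset Sum certificate in polynomial time.

    certificate: a list of indices into nums claimed to sum to target.
    """
    # The certificate must point at distinct, valid positions.
    if len(set(certificate)) != len(certificate):
        return False
    if any(i < 0 or i >= len(nums) for i in certificate):
        return False
    # Checking the claimed sum is a single linear pass.
    return sum(nums[i] for i in certificate) == target

# Verifying is easy even though *finding* a valid subset may,
# as far as anyone knows, require exponential time.
print(verify_subset_sum([3, 34, 4, 12, 5, 2], 9, [2, 4]))  # 4 + 5 = 9 -> True
```

The verifier runs in O(n) time regardless of how long it took to find the certificate; that asymmetry is exactly what membership in NP captures.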

Definition

NP-Hard and NP-Complete

A problem Q is NP-hard if every problem in NP can be reduced to Q in polynomial time. NP-hard problems are at least as hard as the hardest problems in NP.

A problem Q is NP-complete if it is both in NP and NP-hard. These are the hardest problems within NP. If any NP-complete problem has a polynomial-time algorithm, then P = NP.

Definition

Polynomial-Time Reduction

A polynomial-time reduction from problem A to problem B is a polynomial-time computable function f such that:

x ∈ A ⟺ f(x) ∈ B

If A ≤_P B and B ∈ P, then A ∈ P. Reductions transfer hardness upward: if A is hard and A ≤_P B, then B is at least as hard.

The P vs NP Question

The question is simple to state:

P ≟ NP

We know P ⊆ NP because any problem you can solve in polynomial time you can certainly verify in polynomial time (the verifier can ignore the certificate and simply solve the problem itself). The question is whether NP ⊆ P: can every problem with efficiently checkable solutions also be efficiently solved?

Core Definitions

The satisfiability problem (SAT) asks: given a Boolean formula φ(x₁, …, xₙ) in conjunctive normal form (CNF), is there an assignment of truth values that makes φ true?

3-SAT restricts each clause to exactly 3 literals. Despite this restriction, 3-SAT is NP-complete.

The certificate (or witness) for a SAT instance is a satisfying assignment. Given the assignment, you can check each clause in linear time.
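The linear-time clause check can be sketched in a few lines of Python. The DIMACS-style literal encoding (integer k for variable k, -k for its negation) is a common convention, assumed here for compactness.

```python
def verify_sat(clauses, assignment):
    """Check a satisfying assignment for a CNF formula in linear time.

    clauses: list of clauses; each clause is a list of nonzero ints,
             where literal k means variable k and -k means its negation.
    assignment: dict mapping variable number -> bool.
    """
    # Every clause must contain at least one literal made true.
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# (x1 or not x2) and (x2 or x3)
clauses = [[1, -2], [2, 3]]
print(verify_sat(clauses, {1: True, 2: False, 3: True}))  # True
```

Each literal is inspected once, so the check is linear in the formula size, while the best known algorithms for *finding* such an assignment take exponential time in the worst case.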

Main Theorems

Theorem

Cook-Levin Theorem

Statement

SAT is NP-complete. That is, SAT is in NP and every problem in NP is polynomial-time reducible to SAT.

Intuition

The proof encodes the entire computation of a nondeterministic Turing machine as a Boolean formula. If the machine accepts (has some accepting path), the formula is satisfiable. The formula has polynomial size because it only needs to describe polynomially many time steps. This is the "grandfather" reduction: it reduces every NP problem to SAT in one shot.

Proof Sketch

Let L ∈ NP with verifier V running in time p(n). For input x of length n, create Boolean variables encoding: (1) the contents of each tape cell at each time step, (2) the head position at each time step, (3) the state at each time step. Write clauses ensuring: the initial configuration matches (x, w), each step follows the transition function, and the machine reaches an accept state. The resulting formula φ_x(w) has size O(p(n)²) and is satisfiable if and only if x ∈ L.

Why It Matters

Cook-Levin is the foundation of NP-completeness theory. Once you know SAT is NP-complete, you can prove other problems NP-complete by reducing SAT to them. This launched the entire field: Karp immediately showed 21 problems were NP-complete. Today thousands of NP-complete problems are known.

Failure Mode

The Cook-Levin theorem does not tell us whether P = NP or P ≠ NP. It only tells us that SAT is a "representative" hard problem: solving SAT efficiently would solve everything in NP efficiently.

Proof Ideas and Templates Used

The core technique is polynomial-time reduction. To show problem B is NP-complete:

  1. Show B ∈ NP (exhibit a polynomial-time verifier)
  2. Pick a known NP-complete problem A
  3. Construct a polynomial-time function f such that x ∈ A ⟺ f(x) ∈ B

This template has been used thousands of times. Classic reductions:

  • SAT ≤_P 3-SAT (clause splitting)
  • 3-SAT ≤_P Independent Set (gadget construction)
  • 3-SAT ≤_P Subset Sum (number encoding)
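The first of these, clause splitting, is mechanical enough to sketch directly. A clause (l₁ ∨ l₂ ∨ … ∨ l_k) with k > 3 becomes a chain of 3-literal clauses linked by fresh variables. This sketch uses DIMACS-style integer literals and, for brevity, does not pad clauses shorter than 3 literals.

```python
def split_clause(clause, next_fresh):
    """Split one long CNF clause into an equisatisfiable chain of
    3-literal clauses, minting fresh linking variables.

    clause: list of literals (ints, -k = negation of variable k)
    next_fresh: first unused variable number
    Returns (list_of_clauses, updated_next_fresh).
    """
    if len(clause) <= 3:
        return [clause], next_fresh
    out = []
    # First clause keeps l1, l2 and hands off to a fresh variable.
    out.append([clause[0], clause[1], next_fresh])
    prev = next_fresh
    next_fresh += 1
    # Middle literals: (~y_i v l v y_{i+1}) chains the obligation along.
    for lit in clause[2:-2]:
        out.append([-prev, lit, next_fresh])
        prev = next_fresh
        next_fresh += 1
    # Last clause closes the chain with the final two literals.
    out.append([-prev, clause[-2], clause[-1]])
    return out, next_fresh

clauses, _ = split_clause([1, 2, 3, 4, 5], next_fresh=6)
print(clauses)  # [[1, 2, 6], [-6, 3, 7], [-7, 4, 5]]
```

The chain is satisfiable exactly when some original literal can be made true: the fresh variables can always propagate "deferral" up to, but not past, a true literal.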

Canonical Examples

Example

Reduction from 3-SAT to Clique

Given a 3-SAT formula with k clauses, build a graph: one node per literal per clause, edges between all nodes in different clauses whose literals are compatible (not a variable and its negation). The formula is satisfiable if and only if the graph has a clique of size k. Each node in the clique gives a true literal from each clause.
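The construction above is small enough to code directly. This is a sketch, not a practical solver: the reduction itself runs in polynomial time, but the clique check below is deliberately brute-force (for tiny demo instances only). Literals again use the DIMACS-style integer convention.

```python
from itertools import combinations

def sat_to_clique(clauses):
    """Build the Clique instance for a 3-SAT formula: one node per
    literal occurrence, edges between compatible literals that sit
    in different clauses."""
    nodes = [(ci, lit) for ci, clause in enumerate(clauses)
             for lit in clause]
    edges = {
        (u, v)
        for u, v in combinations(nodes, 2)
        if u[0] != v[0] and u[1] != -v[1]  # different clause, not x vs ~x
    }
    # The formula is satisfiable iff this graph has a clique
    # of size len(clauses).
    return nodes, edges

def has_clique(nodes, edges, k):
    """Brute-force clique check (exponential; demo only)."""
    def connected(a, b):
        return (a, b) in edges or (b, a) in edges
    return any(
        all(connected(a, b) for a, b in combinations(subset, 2))
        for subset in combinations(nodes, k)
    )

clauses = [[1, 2, 3], [-1, -2, 3]]  # satisfiable, e.g. x3 = True
nodes, edges = sat_to_clique(clauses)
print(has_clique(nodes, edges, len(clauses)))  # True
```

Picking one clique node per clause yields a consistent set of literals to make true, which is exactly the satisfying assignment.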

Example

Why feature selection is NP-hard

Given a dataset with d features, find the subset of k features that minimizes prediction error. This can be shown NP-hard by reduction from Set Cover. In practice, we use greedy forward/backward selection, L1 regularization, or random search; these are all heuristics that avoid the exponential search over the (d choose k) possible subsets.
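Greedy forward selection, the simplest of these heuristics, can be sketched as below. The `score` callback (lower is better) stands in for whatever evaluation you would use in practice, such as cross-validated loss; the function name and toy score here are illustrative, not a library API.

```python
def greedy_forward_selection(d, k, score):
    """Greedy heuristic for feature selection: instead of trying all
    (d choose k) subsets, add one feature at a time, keeping whichever
    most improves the score.

    score: maps a list of feature indices to an error (lower is better).
    """
    selected = []
    remaining = set(range(d))
    for _ in range(k):
        best = min(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy score: pretend features 0 and 3 are the informative ones.
informative = {0, 3}
score = lambda feats: -len(informative & set(feats))
print(greedy_forward_selection(d=5, k=2, score=score))  # picks 0 and 3
```

The greedy loop evaluates only O(d·k) subsets instead of (d choose k), which is why it scales, and also why it carries no optimality guarantee on worst-case instances.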

Common Confusions

Watch Out

NP does not mean exponential time

NP stands for "nondeterministic polynomial," not "non-polynomial." NP is a class of problems defined by verifiability. Many NP problems are in P (every problem in P is also in NP). The hard ones are the NP-complete problems, which might require exponential time. But nobody has proved that.

Watch Out

NP-hard is not the same as NP-complete

NP-hard means "at least as hard as the hardest problems in NP." An NP-hard problem does not need to be in NP itself. The halting problem is NP-hard but not in NP (it is undecidable). NP-complete means NP-hard and in NP.

Watch Out

Worst case vs average case

NP-hardness is a worst-case statement. SAT is NP-complete, but random SAT instances are often easy. Many NP-hard optimization problems have good average case algorithms or approximation algorithms. This is exactly why ML heuristics work in practice despite the problems being NP-hard in theory.

Connection to Machine Learning

Many ML problems are NP-hard in the worst case:

  • Feature selection: finding the optimal subset of k features
  • Optimal clustering: k-means is NP-hard for general k and d
  • Neural architecture search: finding the optimal architecture
  • Training neural networks: even training a 2-node network is NP-hard
  • MAP inference in graphical models

This does not mean we give up. It means we use:

  • Approximation algorithms with provable guarantees
  • Heuristics that work well in practice (SGD, greedy methods)
  • Relaxations (convex relaxation of combinatorial problems)
  • Structural assumptions that make instances tractable

Summary

  • P = efficiently solvable; NP = efficiently verifiable
  • P ⊆ NP is trivial; whether P = NP is open
  • Cook-Levin: SAT is NP-complete (the first and foundational result)
  • NP-complete problems are equivalent to each other under polynomial reductions
  • Most researchers believe P ≠ NP, but no proof exists
  • Many ML problems are NP-hard, which is why we use heuristics

Exercises

ExerciseCore

Problem

Show that the Clique problem is in NP. That is, given a graph G and integer k, describe a certificate and a polynomial-time verification procedure for deciding whether G contains a clique of size k.

ExerciseCore

Problem

If someone proves that P = NP, why would RSA encryption break?

ExerciseAdvanced

Problem

Explain why proving P ≠ NP is so difficult. What proof techniques have been shown to be insufficient, and why?

References

Canonical:

  • Sipser, Introduction to the Theory of Computation, Chapters 7-8
  • Arora & Barak, Computational Complexity: A Modern Approach (2009), Chapters 2-3

Current:

  • Aaronson, "P vs NP" survey (2017)
  • Wigderson, Mathematics and Computation (2019), Chapters 1-6


Last reviewed: April 2026
