Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.Disclaimer

Foundations

SAT, SMT, and Automated Reasoning

SAT solvers decide Boolean satisfiability (NP-complete). SMT solvers extend SAT with theories like arithmetic and arrays. These tools verify constraints, discharge proof obligations, and complement LLMs in AI agent pipelines.

AdvancedTier 2Current~55 min

Prerequisites

0

Why This Matters

SAT (Boolean satisfiability) is the canonical NP-complete problem: given a Boolean formula, is there an assignment of variables that makes it true? Despite worst-case exponential complexity, modern SAT solvers handle formulas with millions of variables in practice. SMT (Satisfiability Modulo Theories) extends SAT with domain-specific reasoning about integers, reals, arrays, and bitvectors.

These solvers matter for AI and ML in two ways. First, they are the backbone of formal verification: checking that a neural network satisfies safety properties, verifying that a program meets its specification, or proving that a hardware design is correct. Second, as AI agents become more capable, SAT/SMT solvers serve as reliable "reasoning backends" that can verify intermediate claims, check consistency of plans, and discharge arithmetic obligations. Solvers and LLMs are complementary: LLMs generate plausible reasoning, solvers verify it.

Prerequisites

This page assumes familiarity with P vs NP and the concept of NP-completeness.

Core Definitions

Definition

SAT (Boolean Satisfiability Problem)

Given a Boolean formula ϕ\phi over variables x1,,xnx_1, \ldots, x_n (using \land, \lor, ¬\neg), the SAT problem asks: does there exist an assignment x1=b1,,xn=bnx_1 = b_1, \ldots, x_n = b_n (with bi{true,false}b_i \in \{\text{true}, \text{false}\}) such that ϕ\phi evaluates to true?

A formula is satisfiable if such an assignment exists, and unsatisfiable otherwise.

Definition

Conjunctive Normal Form (CNF)

A formula is in CNF if it is a conjunction of clauses, where each clause is a disjunction of literals (variables or their negations):

ϕ=(l11l1k1)(lm1lmkm)\phi = (l_{11} \lor \cdots \lor l_{1k_1}) \land \cdots \land (l_{m1} \lor \cdots \lor l_{mk_m})

Every Boolean formula can be converted to an equisatisfiable CNF formula (with possible introduction of auxiliary variables). SAT solvers typically operate on CNF.

Definition

SMT (Satisfiability Modulo Theories)

An SMT problem asks whether a first-order formula is satisfiable with respect to a background theory TT. Common theories include:

  • Linear integer arithmetic (LIA): 2x+3y72x + 3y \leq 7
  • Linear real arithmetic (LRA): x/2+y>0x/2 + y > 0
  • Arrays: read/write operations with select(a,i)\text{select}(a, i) and store(a,i,v)\text{store}(a, i, v)
  • Bitvectors: fixed-width binary arithmetic (32-bit integers, overflow)
  • Uninterpreted functions: f(a)=f(b)    a=bf(a) = f(b) \implies a = b (congruence)

SMT solvers combine SAT solving with theory-specific decision procedures.

Main Theorems

Theorem

Cook-Levin Theorem

Statement

SAT is NP-complete. That is:

  1. SAT is in NP: given a satisfying assignment, it can be verified in polynomial time.
  2. Every problem in NP can be reduced to SAT in polynomial time.

Intuition

The computation of any polynomial-time verifier can be encoded as a Boolean formula. Each step of the computation, each cell of the tape, each state of the machine becomes a Boolean variable. The formula asserts "the machine accepts," and its satisfiability is equivalent to the original NP problem having a solution.

Proof Sketch

Let LNPL \in \text{NP} with verifier VV running in time p(n)p(n). Encode VV's computation as a Boolean formula:

  1. Create variables st,qs_{t,q} for "machine is in state qq at time tt."
  2. Create variables ct,i,ac_{t,i,a} for "tape cell ii contains symbol aa at time tt."
  3. Create variables ht,ih_{t,i} for "head is at position ii at time tt."
  4. Add clauses encoding: (a) the initial configuration matches the input, (b) each step follows the transition function, (c) the machine is in an accepting state at the final step.

The formula has O(p(n)2)O(p(n)^2) variables and clauses (polynomial in nn), and it is satisfiable if and only if VV accepts some certificate. The satisfying assignment encodes the certificate.

Why It Matters

Cook-Levin established the theory of NP-completeness. Once SAT is NP-complete, every new problem can be shown NP-complete by reduction from SAT (or from any other known NP-complete problem). This created a practical methodology: to show a problem is hard, reduce SAT to it.

It also explains why SAT solvers are so broadly useful: any problem in NP can be encoded as SAT. Scheduling, planning, verification, and combinatorial optimization all reduce to SAT.

Failure Mode

NP-completeness is a worst-case notion. SAT is NP-complete in the worst case, but many practical instances have exploitable structure (locality, symmetry, small clause width) that modern solvers exploit. The gap between worst-case theory and practical performance is enormous for SAT.

SAT Solving Algorithms

DPLL (Davis-Putnam-Logemann-Loveland, 1962):

  1. Unit propagation: if a clause has only one unassigned literal, set it to true.
  2. Pure literal elimination: if a variable appears only positively (or only negatively), set it accordingly.
  3. Branching: pick an unassigned variable, try true, then try false. Backtrack on contradiction.

CDCL (Conflict-Driven Clause Learning, 1996): CDCL extends DPLL with two key innovations:

  1. Conflict analysis: when a contradiction is reached, analyze why it occurred. Identify a set of variable assignments that caused the conflict.
  2. Learned clauses: add a new clause to the formula that prevents the same conflict from recurring. This prunes the search space.
  3. Non-chronological backtracking: jump back to the decision level that caused the conflict, not just one level up.

CDCL is the basis of all competitive modern SAT solvers (MiniSat, CaDiCaL, Kissat). On industrial benchmarks, CDCL solvers routinely handle formulas with millions of variables and clauses.

SMT Architecture

Modern SMT solvers (Z3, CVC5, Yices) use the DPLL(T) architecture:

  1. Abstract the formula: replace theory atoms (x+y>3x + y > 3) with Boolean variables (b1b_1).
  2. Use a SAT solver (CDCL) on the abstracted Boolean formula.
  3. When the SAT solver finds a satisfying assignment, pass the corresponding theory atoms to a theory solver.
  4. If the theory solver says the atoms are inconsistent (e.g., x>3x > 3 and x<2x < 2 simultaneously), generate a conflict clause and feed it back to the SAT solver.
  5. Repeat until SAT with a theory-consistent assignment, or UNSAT.

This separation of concerns keeps the architecture clean: the SAT solver handles Boolean structure, the theory solvers handle domain reasoning, and they communicate through conflict clauses.

Why Solvers Complement LLMs

LLMs and SAT/SMT solvers have complementary strengths:

CapabilityLLMSAT/SMT solver
FlexibilityHigh: handles natural language, ambiguous specificationsLow: requires formal specification
Correctness guaranteeNone: can hallucinateComplete: if it says UNSAT, the formula is unsatisfiable
ArithmeticUnreliable for multi-step computationExact
SearchBeam search, samplingExhaustive with pruning

For AI agents, the natural architecture is: use the LLM to translate informal requirements into formal constraints, then use a solver to check satisfiability or find solutions. The LLM provides flexibility; the solver provides correctness.

Common Confusions

Watch Out

NP-complete does not mean unsolvable

SAT is NP-complete, but modern solvers routinely solve practical instances with millions of variables. NP-completeness is a worst-case classification. Practical instances often have structure (small treewidth, clustering, phase transitions) that solvers exploit. The average-case and structured-case complexity of SAT is much better than the worst case.

Watch Out

SMT is not just SAT with extra steps

SMT solvers do not simply convert everything to SAT. They use specialized decision procedures for each theory (simplex for linear arithmetic, congruence closure for uninterpreted functions, etc.). The DPLL(T) architecture interleaves SAT solving with theory reasoning, which is more efficient than a pure SAT encoding for most theory-heavy problems.

Watch Out

Solvers are not obsolete because of LLMs

LLMs cannot replace SAT/SMT solvers. LLMs have no mechanism for guaranteed correct reasoning over formal constraints. A solver that returns UNSAT has proved unsatisfiability by exhaustive search. No LLM can provide this guarantee. The two tools serve different roles and are most powerful when combined.

Exercises

ExerciseCore

Problem

Convert the following Boolean formula into CNF and determine if it is satisfiable: (x1x2)(¬x1x3)(¬x2¬x3)(x1¬x2)(x_1 \lor x_2) \land (\neg x_1 \lor x_3) \land (\neg x_2 \lor \neg x_3) \land (x_1 \lor \neg x_2).

ExerciseAdvanced

Problem

Explain why conflict-driven clause learning (CDCL) is more efficient than naive DPLL on structured problems. What information does a learned clause encode?

ExerciseAdvanced

Problem

Sketch how you would use an SMT solver to verify that a ReLU neural network with one hidden layer always outputs a non-negative value on inputs in [0,1]d[0, 1]^d.

References

Canonical:

  • Cook, "The Complexity of Theorem-Proving Procedures" (1971)
  • Biere, Heule, van Maaren, Walsh, Handbook of Satisfiability (2nd ed., 2021)

Solvers:

  • de Moura & Bjorner, "Z3: An Efficient SMT Solver" (2008)
  • Marques-Silva, Lynce, Malik, "Conflict-Driven Clause Learning SAT Solvers" (2009)

Neural network verification:

  • Katz et al., "Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks" (2017)

Next Topics

Last reviewed: April 2026

Prerequisites

Foundations this topic depends on.