Computability Theory

What can be computed? Turing machines, decidability, the Church-Turing thesis, recursive and recursively enumerable sets, reductions, Rice's theorem, and connections to learning theory.

Core · Tier 1 · Stable · ~50 min

Prerequisites

Basic Logic and Proof Techniques; Sets, Functions, and Relations

Why This Matters

Before asking whether a problem can be solved efficiently, you must ask whether it can be solved at all. Computability theory draws the boundary between problems that some algorithm can solve and problems that no algorithm can solve, regardless of time or space.

This is not abstract philosophy. In ML, some learning problems fail for computability reasons rather than sample complexity reasons: if evaluating a hypothesis requires solving an undecidable problem, no algorithm can carry out the learning procedure, however much data is available. The halting problem is the canonical example of an undecidable problem, and its proof technique (diagonalization) recurs throughout set theory, complexity theory, and learning theory.

Core Definitions

Definition

Turing Machine

A Turing machine is a 7-tuple $M = (Q, \Sigma, \Gamma, \delta, q_0, q_{\text{accept}}, q_{\text{reject}})$ where:

$Q$ is a finite set of states
$\Sigma$ is a finite input alphabet (not containing the blank symbol $\sqcup$)
$\Gamma \supseteq \Sigma \cup \{\sqcup\}$ is the tape alphabet
$\delta: Q \times \Gamma \to Q \times \Gamma \times \{L, R\}$ is the transition function
$q_0 \in Q$ is the start state
$q_{\text{accept}}, q_{\text{reject}} \in Q$ are the accepting and rejecting states

At each step the machine reads the symbol under its head on an infinite tape, writes a symbol in that cell, moves the head one cell left or right, and transitions to a new state. It halts when it enters $q_{\text{accept}}$ or $q_{\text{reject}}$.
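The definition can be made concrete with a small simulator. The following is a minimal sketch, not a canonical encoding: the state names (`q0` to `q3`, `qA`, `qR`), the marker symbols `X` and `Y`, the `_` character standing in for $\sqcup$, and the convention that a missing table entry sends the machine to the reject state are all assumptions made for illustration. The table describes one possible machine for $\{0^n 1^n : n \geq 0\}$.

```python
BLANK = "_"  # stand-in for the blank symbol (assumed encoding)

# delta maps (state, read symbol) -> (next state, written symbol, move).
# Idea: mark one 0 as X, find the matching 1 and mark it Y, repeat.
DELTA = {
    ("q0", "0"): ("q1", "X", "R"),    # mark the leftmost unmarked 0
    ("q0", "Y"): ("q3", "Y", "R"),    # no 0s left: check only Ys remain
    ("q0", BLANK): ("qA", BLANK, "R"),  # empty input is in the language
    ("q1", "0"): ("q1", "0", "R"),    # scan right for the first 1
    ("q1", "Y"): ("q1", "Y", "R"),
    ("q1", "1"): ("q2", "Y", "L"),    # mark the matching 1
    ("q2", "0"): ("q2", "0", "L"),    # scan left back to the last X
    ("q2", "Y"): ("q2", "Y", "L"),
    ("q2", "X"): ("q0", "X", "R"),
    ("q3", "Y"): ("q3", "Y", "R"),
    ("q3", BLANK): ("qA", BLANK, "R"),
}


def run(delta, word, start="q0", accept="qA", reject="qR", max_steps=10_000):
    """Simulate the machine on `word`; return True on accept, False on reject."""
    tape = {i: c for i, c in enumerate(word)}  # sparse tape, blank elsewhere
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, BLANK)
        # Assumed convention: undefined transitions go to the reject state.
        state, written, move = delta.get((state, symbol), (reject, symbol, "R"))
        tape[head] = written
        head += 1 if move == "R" else -1
    raise RuntimeError("step budget exhausted")
```

For instance, `run(DELTA, "0011")` accepts while `run(DELTA, "001")` rejects. The `max_steps` guard exists only because a general transition table need not halt; this particular machine always does.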

Definition

Decidable Language (Recursive Set)

A language $L \subseteq \Sigma^*$ is decidable (recursive) if there exists a Turing machine $M$ that halts on every input and accepts exactly the strings in $L$. That is, for every $w \in \Sigma^*$, $M$ halts and outputs "accept" if $w \in L$ and "reject" if $w \notin L$.

Definition

Semi-Decidable Language (Recursively Enumerable Set)

A language $L \subseteq \Sigma^*$ is semi-decidable (recursively enumerable, or r.e.) if there exists a Turing machine $M$ that accepts exactly the strings in $L$. If $w \in L$, then $M$ halts and accepts. If $w \notin L$, then $M$ may reject or may run forever.
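A typical semi-decider is an unbounded search for a witness. As an illustration, the set of integer polynomial equations that have an integer solution is semi-decidable by enumerating candidate points, and it is known to be undecidable (Hilbert's tenth problem). The sketch below is an assumption-laden toy: the polynomial is passed as a Python callable, and the name `find_integer_root` and the optional `max_radius` cutoff are hypothetical conveniences so the example can be run safely.

```python
from itertools import count, product
from typing import Callable, Optional, Tuple


def find_integer_root(
    p: Callable[..., int], arity: int, max_radius: Optional[int] = None
) -> Optional[Tuple[int, ...]]:
    """Search Z^arity for a zero of p, box by box.

    With max_radius=None this is a true semi-decider: it returns a root if
    one exists and runs forever otherwise. With a cutoff it returns None,
    which means "no root found yet", not "no root exists".
    """
    radii = count(0) if max_radius is None else range(max_radius + 1)
    for r in radii:
        for point in product(range(-r, r + 1), repeat=arity):
            # Only test points on the boundary of the box; interior points
            # were already tested at a smaller radius.
            if max(map(abs, point), default=0) == r and p(*point) == 0:
                return point
    return None
```

Calling `find_integer_root(lambda x, y: x * x + y * y - 25, 2)` halts with a root, while `find_integer_root(lambda x: x * x + 1, 1)` without a cutoff would never return.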

Definition

Many-One Reduction

Language $A$ is many-one reducible to language $B$, written $A \leq_m B$, if there exists a computable function $f: \Sigma^* \to \Sigma^*$ such that for all $w$:

$w \in A \iff f(w) \in B$

If $A \leq_m B$ and $B$ is decidable, then $A$ is decidable: on input $w$, compute $f(w)$ and run the decider for $B$ on the result. Contrapositively, if $A$ is undecidable and $A \leq_m B$, then $B$ is undecidable.
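The way a reduction transfers decidability can be sketched in a few lines. The two languages here are deliberately trivial and chosen only for illustration (both are obviously decidable on their own): $A$ is the set of even-length strings and $B$ is the set of strings with an even number of `a` characters. The function names are hypothetical.

```python
def decide_B(s: str) -> bool:
    """Decider for B = {s : s contains an even number of 'a'}."""
    return s.count("a") % 2 == 0


def f(w: str) -> str:
    """Computable reduction: w is in A  iff  f(w) is in B."""
    return "a" * len(w)


def decide_A(w: str) -> bool:
    """Decider for A = {w : len(w) is even}, obtained by composition."""
    return decide_B(f(w))
```

The point is the shape of `decide_A`: it never inspects `w` directly beyond computing `f(w)`, so any decider for $B$ yields a decider for $A$.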

The Church-Turing Thesis

The Church-Turing thesis is not a theorem. It is a claim that the informal notion of "effectively computable" coincides with a formal one: every function that is effectively computable by any mechanical procedure is computable by a Turing machine. Evidence for this thesis includes the fact that every general model of computation proposed so far (lambda calculus, recursive functions, register machines, cellular automata) computes exactly the same class of functions.

The thesis has practical force. In its physical form it asserts that if you cannot solve a problem with a Turing machine, you cannot solve it with any physically realizable device. Quantum computers do not change this. They may be faster on some problems, but they compute the same set of functions.

The Halting Problem

Theorem

Undecidability of the Halting Problem

Statement

The halting problem $\text{HALT} = \{\langle M, w \rangle : M \text{ is a TM that halts on input } w\}$ is undecidable. There is no Turing machine that, given an arbitrary Turing machine $M$ and input $w$, always correctly determines whether $M$ halts on $w$.

Intuition

If a halting decider existed, you could use it to construct a machine that does the opposite of what it is predicted to do. This self-referential contradiction is the same diagonal argument that Cantor used to prove the reals are uncountable.

Proof Sketch

Assume for contradiction that $H$ decides HALT: $H(\langle M, w \rangle)$ accepts if $M$ halts on $w$ and rejects otherwise. Construct a new machine $D$: on input $\langle M \rangle$, run $H(\langle M, \langle M \rangle \rangle)$. If $H$ accepts (meaning $M$ halts on its own description), then $D$ loops forever. If $H$ rejects (meaning $M$ does not halt on its own description), then $D$ halts.

Now run $D$ on $\langle D \rangle$. If $D$ halts on $\langle D \rangle$, then $H$ accepts $\langle D, \langle D \rangle \rangle$, so by construction $D$ loops. If $D$ does not halt on $\langle D \rangle$, then $H$ rejects, so by construction $D$ halts. Both cases yield a contradiction. Therefore $H$ cannot exist.
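The diagonal construction can be mimicked in Python. This is a sketch under simplifying assumptions: programs are passed as function objects rather than encoded strings, and a candidate decider `h(prog)` is any total, deterministic function claiming to predict whether `prog(prog)` halts. The names `make_diagonal`, `refute`, and the three sample candidates are hypothetical.

```python
def make_diagonal(h):
    """Build D from a claimed halting predictor h."""
    def d(prog):
        if h(prog):          # h says prog(prog) halts ...
            while True:      # ... so do the opposite and loop forever
                pass
        return "halted"      # h says prog(prog) loops, so halt
    return d


def refute(h):
    """Return (h's prediction for d(d), what d(d) actually does)."""
    d = make_diagonal(h)
    prediction = h(d)
    # By construction d(d) halts exactly when h(d) is False. We read this
    # off d's code instead of always running d(d), which may never return.
    actual = not prediction
    if not prediction:
        assert d(d) == "halted"  # safe to run: this branch terminates
    return prediction, actual


# Three arbitrary candidate "deciders"; each is wrong about its own D.
always_yes = lambda prog: True
always_no = lambda prog: False
by_name = lambda prog: getattr(prog, "__name__", "") != "d"
```

Whatever candidate is supplied, `refute` returns a pair of unequal values, which is the contradiction in the proof sketch restated for one concrete `h` at a time.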

Why It Matters

The halting problem is the canonical undecidable problem. Most undecidability results in computer science are proved by reducing from HALT. It also shows that general program verification is impossible: no tool can check all programs for all properties. This is why static analysis, type checking, and formal verification must all restrict the class of programs or properties they consider.

Failure Mode

The proof requires the ability of Turing machines to simulate other Turing machines (universality). In restricted computational models (finite automata, pushdown automata), halting is decidable: a finite automaton always halts once its input is consumed, and a pushdown automaton can be checked for loops in bounded time.

Decidability Hierarchy

The three-level classification:

Class	Definition	Closure Properties	Example
Decidable	TM halts on all inputs, accepts $L$	Union, intersection, complement	$\{0^n 1^n : n \geq 0\}$
Semi-decidable (r.e.)	TM accepts $L$ , may loop on $w \notin L$	Union, intersection	HALT
Co-semi-decidable (co-r.e.)	Complement is semi-decidable	Union, intersection	$\overline{\text{HALT}}$

A language is decidable if and only if it is both semi-decidable and co-semi-decidable. This gives a useful proof technique: to show a language is decidable, show that both it and its complement are r.e.
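The "both directions are r.e." technique amounts to running two semi-deciders in alternation until one of them accepts. The sketch below models a semi-decider as a Python generator that yields `False` for each step of work and `True` once it accepts; this modelling, the use of integers as inputs, and all function names are assumptions for illustration.

```python
from itertools import count
from typing import Callable, Iterator


def decide(
    accepts_L: Callable[[int], Iterator[bool]],
    accepts_co_L: Callable[[int], Iterator[bool]],
    w: int,
) -> bool:
    """Alternate one step of each semi-decider until one accepts."""
    yes, no = accepts_L(w), accepts_co_L(w)
    while True:
        if next(yes, False):   # semi-decider for L accepted
            return True
        if next(no, False):    # semi-decider for the complement accepted
            return False


def square_search(n: int) -> Iterator[bool]:
    """Accepts n iff n is a perfect square; never accepts otherwise."""
    for k in count():
        yield k * k == n


def nonsquare_search(n: int) -> Iterator[bool]:
    """Accepts n iff n lies strictly between two consecutive squares."""
    for k in count():
        yield k * k < n < (k + 1) * (k + 1)
```

Each search on its own runs forever on half of the inputs, yet `decide(square_search, nonsquare_search, n)` halts for every non-negative integer `n`, because exactly one of the two searches eventually accepts.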

Rice's Theorem

Theorem

Rice's Theorem

Statement

Let $P$ be any nontrivial property of r.e. languages (i.e., some r.e. languages have the property and some do not). Then the set

$\{\langle M \rangle : L(M) \text{ has property } P\}$

is undecidable.

Intuition

You cannot decide anything nontrivial about what a program computes by inspecting it. "Does this program ever output 1?" is undecidable. "Does this program compute a total function?" is undecidable. "Does this program compute the same function as that program?" is undecidable. The only decidable properties of what a program computes are trivial ones (satisfied by all programs or by no programs). Nontrivial decidable properties have to be about something else, such as the code text rather than its behavior.

Proof Sketch

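Reduce from HALT. Assume without loss of generality that the empty language $\emptyset$ does not have property $P$ (otherwise argue with the complement of $P$, which is decidable exactly when $P$ is). Let $M_P$ be a machine whose language has property $P$ (it exists by nontriviality). Given an instance $\langle M, w \rangle$ of the halting problem, construct a machine $M'$ that on input $x$ first simulates $M$ on $w$. If $M$ halts, then $M'$ simulates $M_P$ on $x$ and accepts if $M_P$ accepts. If $M$ does not halt, $M'$ loops. So $L(M') = L(M_P)$ if $M$ halts on $w$, and $L(M') = \emptyset$ otherwise. Hence $L(M')$ has property $P$ if and only if $M$ halts on $w$. A decider for $P$ would therefore decide HALT, contradicting the undecidability of the halting problem.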

Why It Matters

Rice's theorem is a sweeping impossibility result. It explains why fully automatic program analysis is impossible in general: any question about program behavior (not syntax) that has both yes-instances and no-instances is undecidable. Practical tools (linters, type checkers, model checkers) work by restricting the class of programs or by approximating: a sound but incomplete checker may reject valid programs, while an unsound one may accept invalid programs.

Failure Mode

Rice's theorem applies only to semantic properties (properties of the language $L(M)$ ). Syntactic properties of the machine description, such as "does the machine have exactly 7 states?" or "does the source code contain the string 'hello'?", are decidable because they do not require running the machine.
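A syntactic check of this kind is easy to write because it only parses the program text and never executes it. As a minimal sketch, the function below uses Python's standard `ast` module to decide whether a piece of Python source contains a `while` statement; the function name is hypothetical, and Python source is used here as a stand-in for a machine description.

```python
import ast


def contains_while_loop(source: str) -> bool:
    """Decide a syntactic property: does the code contain a `while` statement?"""
    tree = ast.parse(source)  # parses only; the program is never run
    return any(isinstance(node, ast.While) for node in ast.walk(tree))
```

This always terminates, but its answer says nothing about behavior: a program with a `while` loop may halt on every input, and a program without one may still fail to halt (for example through unbounded recursion in an idealized interpreter).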

Reductions and the Undecidability Zoo

Many-one reductions establish a hierarchy of undecidable problems. If $A \leq_m B$, then $B$ is at least as hard as $A$. The standard technique for proving a new problem $B$ undecidable is:

Choose a known undecidable problem $A$ (often HALT).
Construct a computable function $f$ such that $w \in A \iff f(w) \in B$.
Conclude that $B$ is undecidable.

Example

Undecidability of the emptiness problem

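The language $E_{\text{TM}} = \{\langle M \rangle : L(M) = \emptyset\}$ is undecidable. This follows immediately from Rice's theorem (emptiness is a nontrivial semantic property). Alternatively, reduce from HALT directly: given $\langle M, w \rangle$, construct $M'$ that ignores its input, simulates $M$ on $w$, and accepts if $M$ halts. Then $L(M') = \emptyset$ iff $M$ does not halt on $w$, so the map $\langle M, w \rangle \mapsto \langle M' \rangle$ is a many-one reduction from HALT to the complement of $E_{\text{TM}}$. Because decidable languages are closed under complement, a decider for $E_{\text{TM}}$ would yield a decider for HALT.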
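The construction of $M'$ from $\langle M, w \rangle$ is itself a simple, computable transformation, which the following sketch tries to make visible. It models a machine as a Python generator function that yields once per step and returns `True` or `False` on halting; that model, the step-budgeted runner, and every name below are assumptions for illustration. The budget is only there so the example can be executed; it does not decide emptiness.

```python
def build_m_prime(machine, w):
    """The reduction f: from (M, w) build M' that ignores its own input."""
    def m_prime(_x):
        yield from machine(w)  # simulate M on w, step by step
        return True            # reached only if M halted, so accept
    return m_prime


def accepts_within(machine, x, budget):
    """True/False if the machine halts within `budget` resumptions, else None."""
    run = machine(x)
    try:
        for _ in range(budget):
            next(run)
    except StopIteration as stop:
        return bool(stop.value)
    return None  # still running: no conclusion can be drawn


def halts_by_rejecting(w):
    """Sample machine: takes len(w) steps, then halts and rejects."""
    for _ in range(len(w)):
        yield
    return False


def loops_forever(w):
    """Sample machine: never halts."""
    while True:
        yield
```

With these definitions, `build_m_prime(halts_by_rejecting, "abc")` accepts every input after a few steps, so its language is everything, while `build_m_prime(loops_forever, "abc")` never accepts anything, so its language is empty.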

Connection to Learning Theory

Computability imposes hard limits on learnability. For a concept class $\mathcal{C}$ over $\{0,1\}^*$ to be PAC-learnable by an algorithm, there must be a procedure that, given samples, outputs with high probability a hypothesis with small error relative to the target concept. If evaluating membership in concepts from $\mathcal{C}$ requires solving an undecidable problem, a learner that works by evaluating candidate concepts on the sample cannot exist.

More precisely: if the concept class consists of all r.e. sets, then even with infinite data, a learner cannot distinguish between a concept that accepts a given string after a very long computation and one that loops forever on that string. Sample complexity bounds become irrelevant when the hypothesis class itself is not computable.

Common Confusions

Watch Out

Computable does not mean efficient

A function is computable if some Turing machine computes it, with no bound on running time. A function is efficiently computable if it can be computed in polynomial time. The class of decidable problems is vastly larger than P. Integer factoring is computable (try all candidate divisors), but whether it can be done in polynomial time is unknown. The P vs NP question lives entirely within the decidable world. Computability theory asks what is solvable at all; complexity theory asks what is solvable quickly.
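Trial division makes the gap concrete. The sketch below (function name hypothetical) always terminates, so it witnesses computability, but it may perform on the order of $\sqrt{n}$ divisions, which is roughly $2^{b/2}$ for a $b$-bit input and therefore exponential in the input length.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = 2
    while d * d <= n:      # at most about sqrt(n) iterations
        if n % d == 0:
            return d
        d += 1
    return n               # no divisor up to sqrt(n): n is prime
```

For example, `smallest_factor(91)` returns `7` and `smallest_factor(97)` returns `97`.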

Watch Out

Semi-decidable is not the same as half-decidable

A semi-decidable language has a TM that says "yes" when the answer is yes but may never respond when the answer is no. It does not mean the TM gets the right answer half the time. Semi-decidability guarantees that every yes-instance is eventually confirmed, but it gives no guarantee that the machine terminates on no-instances.

Watch Out

The Church-Turing thesis is not a theorem

The Church-Turing thesis cannot be proved because "effectively computable" is an informal notion. It is supported by the equivalence of every general model of computation proposed so far. Read as a claim about the physical world, it is falsifiable: if someone built a physical device that computed a non-Turing-computable function, the thesis would be refuted. No such device is known or believed to exist.

Exercises

Exercise (Core)

Problem

Show that the language $A = \{w : w \text{ is a palindrome over } \{0,1\}\}$ is decidable by describing a Turing machine that decides it.

Exercise (Core)

Problem

Prove that if $L$ is decidable, then $\overline{L}$ (the complement of $L$) is also decidable.

Exercise (Advanced)

Problem

Prove that the language $\text{EQ}_{\text{TM}} = \{\langle M_1, M_2 \rangle : L(M_1) = L(M_2)\}$ is undecidable using Rice's theorem or by reduction from HALT.

Exercise (Advanced)

Problem

Give an example of a language that is semi-decidable but not decidable, and prove both claims.

Exercise (Research)

Problem

The concept class $\mathcal{C}$ over $\{0,1\}^*$ consists of all decidable languages. Is $\mathcal{C}$ PAC-learnable? Explain why computability, not sample complexity, is the bottleneck.

References

Canonical:

Sipser, Introduction to the Theory of Computation (2013), Chapters 3-5
Rogers, Theory of Recursive Functions and Effective Computability (1987), Chapters 1-7

Additional:

Hopcroft, Motwani, Ullman, Introduction to Automata Theory, Languages, and Computation (2006), Chapter 9
Cutland, Computability: An Introduction to Recursive Function Theory (1980), Chapters 1-4

Connection to ML:

Shalev-Shwartz and Ben-David, Understanding Machine Learning (2014), Chapter 8 (computational complexity of learning)
Ben-David et al., "Learnability can be undecidable" (Nature Machine Intelligence, 2019)

Next Topics

P vs NP: once you know a problem is decidable, the next question is whether it is efficiently decidable
Kolmogorov complexity: an alternative lens on computability that measures the information content of individual strings

Last reviewed: April 2026
