Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.Disclaimer

Foundations

Cantor's Theorem and Uncountability

Cantor's diagonal argument proves the reals are uncountable. The power set of any set has strictly greater cardinality. These results are the origin of the distinction between countable and uncountable infinity.

CoreTier 2Stable~45 min
0

Why This Matters

Not all infinite sets are the same size. The natural numbers, the integers, and the rationals are all countable: you can list them. The real numbers are not. This distinction is not a curiosity. It is the reason that complexity measures like VC dimension exist.

The set of all computer programs is countable (they are finite strings over a finite alphabet). The set of all functions from R\mathbb{R} to {0,1}\{0, 1\} is uncountable. So most functions cannot be computed by any program. In ML terms: the hypothesis class of all possible classifiers is uncountably infinite, which is why you must restrict to a learnable subclass and measure its complexity.

Core Definitions

Definition

Countable Set

A set AA is countable if there exists an injection f:ANf: A \to \mathbb{N}. Equivalently, AA is countable if it is finite or if there exists a bijection f:ANf: A \to \mathbb{N} (countably infinite).

Definition

Uncountable Set

A set AA is uncountable if it is infinite and there is no bijection between AA and N\mathbb{N}. No listing a1,a2,a3,a_1, a_2, a_3, \ldots can exhaust AA.

Definition

Cardinality

Two sets AA and BB have the same cardinality, written A=B|A| = |B|, if there exists a bijection f:ABf: A \to B. We write AB|A| \leq |B| if there exists an injection from AA to BB. The Cantor-Bernstein theorem says: if AB|A| \leq |B| and BA|B| \leq |A|, then A=B|A| = |B|.

Definition

Power Set

The power set of AA, written P(A)\mathcal{P}(A), is the set of all subsets of AA:

P(A)={B:BA}\mathcal{P}(A) = \{B : B \subseteq A\}

For finite AA with A=n|A| = n, we have P(A)=2n|\mathcal{P}(A)| = 2^n.

Main Theorems

Theorem

Uncountability of the Reals

Statement

The set R\mathbb{R} of real numbers is uncountable. More precisely, the interval [0,1][0, 1] is uncountable.

Intuition

No matter how you try to list the reals in [0,1][0, 1] as a sequence r1,r2,r3,r_1, r_2, r_3, \ldots, there is always a real number your list misses. The diagonal argument constructs this missing number explicitly.

Proof Sketch

Assume for contradiction that [0,1][0, 1] is countable. List all elements as r1,r2,r3,r_1, r_2, r_3, \ldots, writing each in decimal as ri=0.di1di2di3r_i = 0.d_{i1}d_{i2}d_{i3}\ldots. Construct a new number x=0.x1x2x3x = 0.x_1 x_2 x_3 \ldots where xidiix_i \neq d_{ii} (and xi{0,9}x_i \notin \{0, 9\} to avoid issues with non-unique representations). Then x[0,1]x \in [0, 1] but xrix \neq r_i for every ii, since xx and rir_i differ in the ii-th decimal digit. This contradicts the assumption that the list contains all of [0,1][0, 1].

Why It Matters

This is the first proof that infinity is not a single concept. There are strictly more real numbers than natural numbers. The argument pattern (diagonalization) reappears throughout mathematics and computer science: the halting problem, Godel's incompleteness theorems, and the existence of non-computable functions all use variants of it.

Failure Mode

The proof does not work for the rationals. Every rational has a terminating or repeating decimal, and the number constructed by the diagonal argument is generically irrational. The rationals are countable despite being dense in R\mathbb{R}.

Theorem

Cantor's Theorem

Statement

For any set AA, there is no surjection from AA onto P(A)\mathcal{P}(A). Therefore A<P(A)|A| < |\mathcal{P}(A)|: the power set is strictly larger.

Intuition

You cannot pair up elements of AA with subsets of AA so that every subset gets paired. There are always leftover subsets. The proof constructs one such leftover subset explicitly using diagonalization.

Proof Sketch

Let f:AP(A)f: A \to \mathcal{P}(A) be any function. Define D={aA:af(a)}D = \{a \in A : a \notin f(a)\}. Claim: DD is not in the range of ff. If D=f(b)D = f(b) for some bb, then bD    bf(b)=Db \in D \iff b \notin f(b) = D, a contradiction. So ff is not surjective. Since the injection a{a}a \mapsto \{a\} shows AP(A)|A| \leq |\mathcal{P}(A)|, we conclude A<P(A)|A| < |\mathcal{P}(A)|.

Why It Matters

Cantor's theorem guarantees an infinite hierarchy of infinite cardinalities: N<P(N)<P(P(N))<|\mathbb{N}| < |\mathcal{P}(\mathbb{N})| < |\mathcal{P}(\mathcal{P}(\mathbb{N}))| < \cdots. There is no largest infinity. This is also the template for Russell's paradox (consider the set of all sets that do not contain themselves), which motivated axiomatic set theory.

Failure Mode

The theorem requires the axioms of separation (to form the set DD) and power set (for P(A)\mathcal{P}(A) to exist). In weaker set theories or constructive settings, the precise formulation matters.

Connection to Computability

The set of all Turing machines (equivalently, all programs) is countable: each program is a finite string, and the set of finite strings over any finite alphabet is countable. But the set of all functions f:N{0,1}f: \mathbb{N} \to \{0, 1\} has cardinality 2N=P(N)2^{|\mathbb{N}|} = |\mathcal{P}(\mathbb{N})|, which by Cantor's theorem is uncountable.

Therefore: most functions from N\mathbb{N} to {0,1}\{0, 1\} are not computable. No program computes them. This is not a conjecture; it follows directly from a counting argument.

Connection to ML

The hypothesis class of all binary classifiers on Rd\mathbb{R}^d (all functions h:Rd{0,1}h: \mathbb{R}^d \to \{0, 1\}) has cardinality 2Rd2^{|\mathbb{R}^d|}, which is uncountably infinite. The ERM generalization bound for finite hypothesis classes gives a gap of logH/n\sqrt{\log|\mathcal{H}| / n}, which is infinite for uncountable H\mathcal{H}.

This is why complexity measures exist. VC dimension, Rademacher complexity, and covering numbers are all ways to assign a finite "effective size" to an infinite hypothesis class, making generalization bounds non-vacuous.

Common Confusions

Watch Out

The rationals are dense but countable

Density and countability are independent properties. The rationals are dense in R\mathbb{R} (between any two reals there is a rational), yet Q\mathbb{Q} is countable. Density is a topological property; countability is a set-theoretic one.

Watch Out

Cantor's theorem does not say which cardinality the reals have

Cantor proved R>N|\mathbb{R}| > |\mathbb{N}|, but did not determine whether R=1|\mathbb{R}| = \aleph_1 (the next cardinal after 0=N\aleph_0 = |\mathbb{N}|). The claim R=1|\mathbb{R}| = \aleph_1 is the Continuum Hypothesis, which is independent of ZFC. It can be neither proved nor disproved from the standard axioms.

Exercises

ExerciseCore

Problem

Prove that the set of all finite binary strings {0,1}\{0, 1\}^* is countable.

ExerciseCore

Problem

Using Cantor's theorem, prove that there exists a function f:N{0,1}f: \mathbb{N} \to \{0, 1\} that is not computable by any Turing machine.

ExerciseAdvanced

Problem

Let H\mathcal{H} be the set of all binary classifiers on [0,1][0,1]. Show that H|\mathcal{H}| is uncountable and explain why the finite-class ERM bound does not apply.

References

Canonical:

  • Halmos, Naive Set Theory (1960), Sections 1-4 and 25
  • Enderton, Elements of Set Theory (1977), Chapter 6

Connection to CS:

  • Sipser, Introduction to the Theory of Computation (2013), Section 4.2

Connection to ML:

  • Shalev-Shwartz & Ben-David, Understanding Machine Learning (2014), Chapter 2

  • Munkres, Topology (2000), Chapter 1 (set theory review)

Next Topics

Last reviewed: April 2026

Builds on This

Next Topics