Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.Disclaimer

Foundations

Category Theory

Categories, functors, natural transformations, universal properties, adjunctions, and the Yoneda lemma. The language of abstract structure that unifies algebra, topology, logic, and increasingly appears in ML theory.

AdvancedTier 2Stable~55 min
0

Why This Matters

Category theory provides a language for describing structure-preserving maps between mathematical objects. Algebra, topology, logic, and programming language theory all use this language. In ML theory, category theory appears in equivariant neural networks (functors preserving group actions), type-theoretic foundations of programming languages (Cartesian closed categories), and the algebraic structure of probability (the Giry monad).

For most ML practitioners, category theory is not a prerequisite for daily work. But for anyone studying equivariant learning, compositional semantics, or the algebraic foundations of probability, it provides precisely the right abstractions. The Yoneda lemma alone justifies learning the basics: it says that an object is completely determined by how other objects map into it.

Core Definitions

Definition

Category

A category C\mathcal{C} consists of:

  1. A collection Ob(C)\text{Ob}(\mathcal{C}) of objects
  2. For each pair of objects A,BA, B, a collection Hom(A,B)\text{Hom}(A, B) of morphisms (arrows) from AA to BB
  3. For each object AA, an identity morphism idAHom(A,A)\text{id}_A \in \text{Hom}(A, A)
  4. A composition operation: for fHom(A,B)f \in \text{Hom}(A, B) and gHom(B,C)g \in \text{Hom}(B, C), a morphism gfHom(A,C)g \circ f \in \text{Hom}(A, C)

Subject to two axioms:

  • Associativity: h(gf)=(hg)fh \circ (g \circ f) = (h \circ g) \circ f for all composable triples
  • Identity: fidA=ff \circ \text{id}_A = f and idBf=f\text{id}_B \circ f = f for all f:ABf: A \to B
Definition

Functor

A functor F:CDF: \mathcal{C} \to \mathcal{D} between categories assigns:

  1. To each object ACA \in \mathcal{C}, an object F(A)DF(A) \in \mathcal{D}
  2. To each morphism f:ABf: A \to B in C\mathcal{C}, a morphism F(f):F(A)F(B)F(f): F(A) \to F(B) in D\mathcal{D}

Such that:

  • F(idA)=idF(A)F(\text{id}_A) = \text{id}_{F(A)} (preserves identities)
  • F(gf)=F(g)F(f)F(g \circ f) = F(g) \circ F(f) (preserves composition)

A functor is a structure-preserving map between categories.

Definition

Natural Transformation

Given functors F,G:CDF, G: \mathcal{C} \to \mathcal{D}, a natural transformation η:FG\eta: F \Rightarrow G assigns to each object ACA \in \mathcal{C} a morphism ηA:F(A)G(A)\eta_A: F(A) \to G(A) in D\mathcal{D}, such that for every morphism f:ABf: A \to B in C\mathcal{C}, the following square commutes:

G(f)ηA=ηBF(f)G(f) \circ \eta_A = \eta_B \circ F(f)

The morphisms ηA\eta_A are called the components of η\eta. Naturality means the transformation is compatible with all morphisms in the source category.

Standard Examples

The following categories appear throughout mathematics and computer science:

CategoryObjectsMorphisms
SetSetsFunctions
GrpGroupsGroup homomorphisms
Vectk_kVector spaces over kkLinear maps
TopTopological spacesContinuous maps
MeasMeasurable spacesMeasurable functions
PosetElements of a posetaba \leq b gives a unique arrow aba \to b
Example

The forgetful functor

The forgetful functor U:GrpSetU: \textbf{Grp} \to \textbf{Set} sends each group (G,)(G, \cdot) to its underlying set GG, and each group homomorphism to itself (viewed as a function between sets). It "forgets" the group structure. Similarly, U:VectkSetU: \textbf{Vect}_k \to \textbf{Set} forgets the vector space structure.

Example

The free functor

The free functor F:SetGrpF: \textbf{Set} \to \textbf{Grp} sends a set SS to the free group on SS (all finite words in SS1S \cup S^{-1} modulo group axioms). A function f:STf: S \to T extends uniquely to a group homomorphism F(f):F(S)F(T)F(f): F(S) \to F(T). The free functor and the forgetful functor form an adjunction.

Example

Natural transformation: determinant

Consider two functors from the category of commutative rings to the category of groups: the general linear group functor GLn()\text{GL}_n(-) and the units functor ()×(-)^\times. The determinant det:GLn(R)R×\det: \text{GL}_n(R) \to R^\times defines a natural transformation. Naturality means: for any ring homomorphism ϕ:RS\phi: R \to S, applying ϕ\phi entry-wise to a matrix and then taking the determinant equals taking the determinant and then applying ϕ\phi.

Universal Properties

Definition

Universal Property

An object UU in a category has a universal property with respect to a construction if every instance of that construction factors uniquely through UU. Formally, UU is an initial object in a suitable category of solutions. Universal properties determine objects up to unique isomorphism.

Products, coproducts, limits, colimits, free objects, and tensor products are all defined by universal properties. The power of this approach: instead of constructing an object explicitly and then verifying properties, you specify the universal property and deduce that at most one object (up to isomorphism) can satisfy it.

Example

Product as a universal property

The product of objects AA and BB in a category C\mathcal{C} is an object A×BA \times B equipped with projection morphisms π1:A×BA\pi_1: A \times B \to A and π2:A×BB\pi_2: A \times B \to B, such that for any object XX with morphisms f:XAf: X \to A and g:XBg: X \to B, there exists a unique morphism f,g:XA×B\langle f, g \rangle: X \to A \times B with π1f,g=f\pi_1 \circ \langle f, g \rangle = f and π2f,g=g\pi_2 \circ \langle f, g \rangle = g. In Set, this is the Cartesian product. In Grp, it is the direct product. In Top, it is the product topology.

The Yoneda Lemma

The Yoneda lemma is the most important result in basic category theory. It says that an object is completely characterized by its relationships with all other objects.

Definition

Representable Functor

For an object AA in a locally small category C\mathcal{C}, the covariant representable functor Hom(A,):CSet\text{Hom}(A, -): \mathcal{C} \to \textbf{Set} sends each object BB to the set Hom(A,B)\text{Hom}(A, B) and each morphism f:BCf: B \to C to the function f=f:Hom(A,B)Hom(A,C)f_* = f \circ -: \text{Hom}(A, B) \to \text{Hom}(A, C).

Lemma

Yoneda Lemma

Statement

For any functor F:CSetF: \mathcal{C} \to \textbf{Set} and any object ACA \in \mathcal{C}, there is a bijection:

Nat(Hom(A,),F)F(A)\text{Nat}(\text{Hom}(A, -), F) \cong F(A)

that is natural in both AA and FF. Here Nat(Hom(A,),F)\text{Nat}(\text{Hom}(A, -), F) denotes the set of natural transformations from the representable functor Hom(A,)\text{Hom}(A, -) to FF.

Intuition

A natural transformation η:Hom(A,)F\eta: \text{Hom}(A, -) \Rightarrow F is completely determined by ηA(idA)F(A)\eta_A(\text{id}_A) \in F(A). Given any element xF(A)x \in F(A), you can reconstruct the entire natural transformation by defining ηB(f)=F(f)(x)\eta_B(f) = F(f)(x) for each f:ABf: A \to B. Naturality forces this to be the only possibility. So the "space of natural transformations out of a representable" is just the set F(A)F(A).

Proof Sketch

Define the bijection explicitly. Given a natural transformation η:Hom(A,)F\eta: \text{Hom}(A, -) \Rightarrow F, map it to ηA(idA)F(A)\eta_A(\text{id}_A) \in F(A). Conversely, given xF(A)x \in F(A), define ηB(f)=F(f)(x)\eta_B(f) = F(f)(x) for each fHom(A,B)f \in \text{Hom}(A, B).

To show η\eta is natural: for any g:BCg: B \to C, we need F(g)ηB=ηCgF(g) \circ \eta_B = \eta_C \circ g_*. Evaluating at f:ABf: A \to B: F(g)(ηB(f))=F(g)(F(f)(x))=F(gf)(x)=ηC(gf)=ηC(g(f))F(g)(\eta_B(f)) = F(g)(F(f)(x)) = F(g \circ f)(x) = \eta_C(g \circ f) = \eta_C(g_*(f)). Functoriality of FF gives the third equality.

To show this is a bijection: if η\eta is any natural transformation, then naturality forces ηB(f)=F(f)(ηA(idA))\eta_B(f) = F(f)(\eta_A(\text{id}_A)), so η\eta is determined by ηA(idA)\eta_A(\text{id}_A).

Why It Matters

The Yoneda lemma has a corollary called the Yoneda embedding: the functor AHom(A,)A \mapsto \text{Hom}(A, -) is a full and faithful embedding of C\mathcal{C} into the functor category [Cop,Set][\mathcal{C}^{op}, \textbf{Set}]. This means every category embeds into a category of "generalized sets." Two objects AA and BB are isomorphic if and only if Hom(A,)\text{Hom}(A, -) and Hom(B,)\text{Hom}(B, -) are naturally isomorphic: an object is determined by its morphisms.

Failure Mode

The Yoneda lemma requires a locally small category (hom-sets are actual sets, not proper classes). For large categories where hom-collections are proper classes, the statement must be reformulated using universe enlargement or other size-management techniques. In practice, all categories arising in applications are locally small.

Adjunctions

Definition

Adjunction

An adjunction between categories C\mathcal{C} and D\mathcal{D} consists of functors F:CDF: \mathcal{C} \to \mathcal{D} (the left adjoint) and G:DCG: \mathcal{D} \to \mathcal{C} (the right adjoint) together with a natural bijection:

HomD(F(A),B)HomC(A,G(B))\text{Hom}_{\mathcal{D}}(F(A), B) \cong \text{Hom}_{\mathcal{C}}(A, G(B))

for all objects ACA \in \mathcal{C} and BDB \in \mathcal{D}. We write FGF \dashv G.

Adjunctions are everywhere:

Left adjoint FFRight adjoint GGSetting
Free groupForgetful functorGrp \leftrightarrow Set
×A- \times AHom(A,)\text{Hom}(A, -)Set (currying)
Free vector spaceForgetful functorVectk_k \leftrightarrow Set
Σ\Sigma (existential)PullbackDependent type theory

The currying adjunction Hom(X×A,B)Hom(X,BA)\text{Hom}(X \times A, B) \cong \text{Hom}(X, B^A) is the categorical statement that a function of two arguments is the same as a function returning a function. This is the foundation of the Curry-Howard-Lambek correspondence connecting logic, computation, and category theory.

When Category Theory Matters (and When It Does Not)

For ML practitioners, category theory is genuinely useful in these specific areas:

  • Equivariant neural networks: a GG-equivariant map is precisely a natural transformation between certain functors. The classification of equivariant linear maps uses Schur's lemma, which is cleanly stated categorically.
  • Type theory and programming languages: Cartesian closed categories model the simply typed lambda calculus. Monads (a categorical concept) structure effects in functional programming.
  • Compositional semantics: DisCoCat models in NLP use monoidal functors to map grammatical structure to vector space computations.
  • Algebraic probability: the Giry monad gives a categorical framework for probability measures.

For most applied ML work (training models, tuning hyperparameters, engineering features), category theory adds no practical value. The honest assessment: learn it if you work on the topics above or if you want the conceptual unification it provides. Skip it if your work is empirical.

Common Confusions

Watch Out

Categories are not just sets with extra structure

A common first impression is that a category is "a set of objects with arrows." But the objects of a category need not form a set (they can form a proper class, as in Set itself). The morphisms carry the real information: two categories with the same objects but different morphisms are completely different. The arrows, not the objects, are what matter.

Watch Out

Functors preserve structure; they do not create it

A functor F:CDF: \mathcal{C} \to \mathcal{D} must preserve composition and identities. It cannot "add" relationships that do not exist in the source. If there is no morphism from AA to BB in C\mathcal{C}, the functor says nothing about the relationship between F(A)F(A) and F(B)F(B) beyond what is implied by the existing morphisms.

Watch Out

Isomorphism in a category is not equality

Two objects AA and BB are isomorphic (ABA \cong B) if there exist morphisms f:ABf: A \to B and g:BAg: B \to A with gf=idAg \circ f = \text{id}_A and fg=idBf \circ g = \text{id}_B. In Set, this means a bijection. In Top, a homeomorphism. Category theory works up to isomorphism, not equality. The principle of equivalence: no categorical construction should distinguish between isomorphic objects.

Exercises

ExerciseCore

Problem

Verify that vector spaces over a field kk and linear maps form a category Vectk_k. Specifically, check that composition of linear maps is linear and that the identity map is linear.

ExerciseAdvanced

Problem

Prove the Yoneda lemma for the special case F=Hom(B,)F = \text{Hom}(B, -). That is, show:

Nat(Hom(A,),Hom(B,))Hom(B,A)\text{Nat}(\text{Hom}(A, -), \text{Hom}(B, -)) \cong \text{Hom}(B, A)

and explain why this means the Yoneda embedding is full and faithful.

ExerciseAdvanced

Problem

Show that the free-forgetful adjunction FUF \dashv U between Grp and Set satisfies HomGrp(F(S),G)HomSet(S,U(G))\text{Hom}_{\textbf{Grp}}(F(S), G) \cong \text{Hom}_{\textbf{Set}}(S, U(G)) by describing both sides explicitly when S={a,b}S = \{a, b\} and G=ZG = \mathbb{Z}.

ExerciseResearch

Problem

A GG-equivariant neural network layer is a map ϕ:VW\phi: V \to W between GG-representations satisfying ϕ(ρV(g)x)=ρW(g)ϕ(x)\phi(\rho_V(g) \cdot x) = \rho_W(g) \cdot \phi(x) for all gGg \in G. Express this condition as a natural transformation between appropriate functors. What does the Yoneda perspective tell you about classifying such maps?

References

Canonical:

  • Mac Lane, Categories for the Working Mathematician (1998), Chapters 1-4 for categories, functors, natural transformations, and adjunctions
  • Awodey, Category Theory (2010), Chapters 1-8 for a modern introduction at the graduate level
  • Riehl, Category Theory in Context (2016), Chapters 1-6, particularly Chapter 2 for the Yoneda lemma

Accessible:

  • Leinster, Basic Category Theory (2014), Chapters 1-4 for a concise treatment aimed at beginners
  • Fong & Spivak, An Invitation to Applied Category Theory (2019), for applications-oriented readers

ML connections:

  • Shiebler, Gavranovic, Wilson, "Category Theory in Machine Learning" (2021 survey), Sections 2-5 on equivariant networks and compositional models

Next Topics

  • Sets, functions, and relations: the prerequisite set theory that categories generalize
  • Dependent type theory: where Cartesian closed categories meet proof theory

Last reviewed: April 2026

Prerequisites

Foundations this topic depends on.