ML Theory Roadmap

The whole curriculum on one page, from measure-theoretic foundations through modern deep learning and the research frontier. Tier-1 landmarks are the 198 core pages worth reading first.

Layer 0A · Axioms

60 topics

Sets, functions, logic, linear algebra, real analysis, measure-theoretic basics.

Foundations

Mathematical Infrastructure

Numerical Optimization

●Floating-Point Arithmetic

Algorithms Foundations

Layer 0B · Infrastructure

26 topics

Measure theory, functional analysis, convex duality, numerical foundations.

Foundations

Mathematical Infrastructure

Statistical Estimation

Infrastructure

WebGPU for Machine Learning

Layer 1 · Core Tools

75 topics

Concentration, estimation, information theory, optimization primitives, CLT.

Foundations

Mathematical Infrastructure

Concentration Probability

Statistical Foundations

Order Statistics

Statistical Estimation

Numerical Optimization

Optimization Function Classes

Algorithms Foundations

Learning Theory Core

ML Methods

Sampling MCMC

Rejection Sampling

Methodology

Scientific ML

●Classical ODEs: Existence, Stability, and Numerical Methods

Statistics

Layer 2 · Learning Theory

163 topics

ERM, VC, Rademacher, PAC, stability, kernels, uniform convergence.

Foundations

Mathematical Infrastructure

Concentration Probability

Statistical Foundations

Statistical Estimation

Decision Theory

Numerical Optimization

Optimization Function Classes

Algorithms Foundations

Learning Theory Core

ML Methods

Sampling MCMC

Training Techniques

Methodology

LLM Construction

●Linear Layer: Shapes, Bias, and Memory

RL Theory

Applied Math

Applied Statistics

●Non-Probability Sampling

Learning Theory

Predictive Uncertainty

●Split Conformal Prediction

Sequential Inference

Layer 3 · ML Methods

152 topics

Regression, SVMs, neural nets, optimization, regularization, NTK.

Mathematical Infrastructure

Concentration Probability

Statistical Foundations

Decision Theory

Numerical Optimization

Optimization Function Classes

Algorithms Foundations

Learning Theory Core

Modern Generalization

ML Methods

Sampling MCMC

Training Techniques

Methodology

LLM Construction

RL Theory

AI Safety

Applied Math

Scientific ML

ML Applications

Optimization

SGD as a Stochastic Differential Equation

Predictive Uncertainty

●Weighted Conformal Prediction Under Covariate Shift

Sequential Inference

Layer 4 · Deep Learning

113 topics

Transformers, attention, training dynamics, double descent, scaling.

Statistical Foundations

Random Matrix Theory Overview

Modern Generalization

Methodology

LLM Construction

RL Theory

Beyond LLMs

AI Safety

Model Timeline

Applied Math

Scientific ML

Applied ML

Formal Verification

●AlphaProof and AI-Assisted Theorem Proving

Infrastructure

Layer 5 · Frontier

66 topics

RLHF, alignment, interpretability, reasoning, agents, scaling laws.

Modern Generalization

Open Problems in ML Theory

Methodology

Energy Efficiency and Green AI

LLM Construction

RL Theory

Agentic RL and Tool Use

Beyond LLMs

AI Safety

Model Timeline

AI History

History of Artificial Intelligence

How to use this map

● Amber dots are tier-1 landmarks. Read these first.
Each page links down to its prerequisites and up to what builds on it. No concept floats without grounding.
Use the gap finder to pick a destination and get a BFS-ordered reading list.
The interactive graph gives you the same graph with click-to-explore and path tracing.
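
A minimal sketch of what a BFS-ordered reading list could look like, assuming the map is stored as a dictionary from each page to the pages it links down to. The function name `bfs_reading_list`, the `known` parameter, the page slugs, and the edges in the toy graph are all hypothetical illustrations, not the site's actual gap-finder implementation.

```python
from collections import deque


def bfs_reading_list(prereqs, destination, known=frozenset()):
    """Return the pages to read, most distant prerequisites first.

    prereqs: dict mapping a page to the list of pages it links down to.
    destination: the page you want to reach.
    known: pages you have already covered; they are skipped and not expanded.
    """
    seen = {destination}
    order = [destination]
    queue = deque([destination])
    while queue:
        page = queue.popleft()
        for dep in prereqs.get(page, []):
            if dep in seen or dep in known:
                continue
            seen.add(dep)
            order.append(dep)
            queue.append(dep)
    # BFS visits the destination first, so reverse to put foundations first.
    order.reverse()
    return order


# Hypothetical toy graph: slugs and edges are made up for illustration.
prereqs = {
    "split-conformal-prediction": ["order-statistics", "exchangeability"],
    "order-statistics": ["probability-basics"],
    "exchangeability": ["probability-basics"],
    "probability-basics": [],
}

reading_list = bfs_reading_list(prereqs, "split-conformal-prediction")
print(reading_list)
```

Reversed BFS orders pages by their distance from the destination. When a prerequisite is reachable by paths of different lengths, that order is not guaranteed to place every prerequisite before the pages that depend on it; a topological sort over the same subgraph would give that guarantee.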

Planned additions

Topics in progress, primarily AI safety and alignment.

Scalable oversight. Bowman et al. 2022, debate and market-based precedents, sandwiching experiments. Scope conditions matter: what the setup can and cannot tell us.
Deceptive alignment. Hubinger et al. 2019/2021 mesa-optimizer framing. Separate the empirical evidence from the philosophical argument.
Alignment faking. Greenblatt et al. 2024 (Anthropic). Include the limitations section explicitly.
DPO. Currently folded into dpo-vs-grpo. Deserves its own page: Rafailov et al. 2023, the implicit-reward view, and the overoptimization story. Follow-up on IPO, KTO, SimPO and the broader DPO family.
Verifiable-reward RL (RLVR). Reasoning training with programmatically checkable rewards: math graders, code executors, proof verifiers. Scope what verifiers can and cannot certify, and the reward-hacking surface when the verifier is imperfect. Needs careful separation from general RLHF.
Inference-time scaling beyond CoT. Budgeted search, verifier-guided decoding, reward-model reranking, parallel sampling with aggregation. The current inference-time-scaling-laws page covers the scaling story; it deserves a systems-level companion on how the compute is actually spent.
Agent systems as systems. Long-horizon tool use, failure recovery, memory design, evaluation under distribution shift, benchmark contamination. The current agent pages cover the components; a systems-view page on how they compose and fail in production is missing.
Weak-to-strong generalization. Burns et al. 2023 (OpenAI). What the setup can and cannot tell us about alignment at scale.
Instrumental convergence. Omohundro, Bostrom framings. Flag explicitly where the philosophical argument outruns the empirical support.
Jailbreaks. Attack taxonomy, measurement difficulties, why robust alignment is not a solved problem. Needs honest threat-model scoping, not incident anecdotes.
Superposition. Elhage et al. 2022 toy-models paper, the interference vs capacity trade-off, and the connection to sparse autoencoders.

Layer 0A · Axioms

Foundations

Mathematical Infrastructure

Numerical Optimization

Algorithms Foundations

Layer 0B · Infrastructure

Foundations

Mathematical Infrastructure

Statistical Estimation

Infrastructure

Layer 1 · Core Tools

Foundations

Mathematical Infrastructure

Concentration Probability

Statistical Foundations

Statistical Estimation

Numerical Optimization

Optimization Function Classes

Algorithms Foundations

Learning Theory Core

ML Methods

Sampling MCMC

Methodology

Scientific ML

Statistics

Layer 2 · Learning Theory

Foundations

Mathematical Infrastructure

Concentration Probability

Statistical Foundations

Statistical Estimation

Decision Theory

Numerical Optimization

Optimization Function Classes

Algorithms Foundations

Learning Theory Core

ML Methods

Sampling MCMC

Training Techniques

Methodology

LLM Construction

RL Theory

Applied Math

Applied Statistics

Learning Theory

Predictive Uncertainty

Sequential Inference

Layer 3 · ML Methods

Mathematical Infrastructure

Concentration Probability

Statistical Foundations

Decision Theory

Numerical Optimization

Optimization Function Classes

Algorithms Foundations

Learning Theory Core

Modern Generalization

ML Methods

Sampling MCMC

Training Techniques

Methodology

LLM Construction

RL Theory

AI Safety

Applied Math

Scientific ML

Bayesian ML Frontier

Causal Semiparametric

Learning Theory

ML Applications

Optimization

Predictive Uncertainty

Sequential Inference

Layer 4 · Deep Learning

Statistical Foundations

Modern Generalization

Methodology

LLM Construction

RL Theory

Beyond LLMs