Reading Paths
Structured sequences through the material. Each path has a clear starting point, an ordered list of topics, and an estimated time. Pick the one that matches what you need to learn.
ML Theory Core
The classical spine: empirical risk minimization (ERM), uniform convergence, VC dimension, Rademacher complexity. Start here if you want to understand why learning from data works.
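A taste of where this path ends up, in one standard form (constants and side conditions vary by textbook):

```latex
% Empirical risk minimization: pick the hypothesis with the
% lowest average loss on the sample.
\hat{h} = \operatorname*{arg\,min}_{h \in \mathcal{H}} \hat{R}_n(h),
\qquad
\hat{R}_n(h) = \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(h(x_i), y_i\bigr)

% A Rademacher generalization bound: with probability at least
% 1 - \delta, every h in \mathcal{H} satisfies the following, where
% \mathfrak{R}_n denotes the Rademacher complexity of the associated
% loss class (assumed to take values in [0, 1]).
R(h) \;\le\; \hat{R}_n(h) \;+\; 2\,\mathfrak{R}_n(\mathcal{H})
        \;+\; \sqrt{\frac{\log(1/\delta)}{2n}}
```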
Concentration Inequalities
From Markov to Matrix Bernstein. The inequality toolkit that every generalization bound, random matrix argument, and stability proof depends on.
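The path's starting point and one representative midpoint, in their common textbook forms:

```latex
% Markov's inequality: for X >= 0 and t > 0,
\Pr(X \ge t) \le \frac{\mathbb{E}[X]}{t}

% Hoeffding's inequality: for independent X_i with X_i \in [a_i, b_i],
\Pr\!\left( \left| \frac{1}{n} \sum_{i=1}^{n}
    \bigl(X_i - \mathbb{E}[X_i]\bigr) \right| \ge t \right)
\le 2 \exp\!\left( - \frac{2 n^2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right)
```

Matrix Bernstein is the endpoint: the same game played with sums of independent random matrices, operator norms, and matrix moment generating functions.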
Build an LLM from Scratch
Transformer architecture, attention mechanism, positional encoding, KV cache, scaling laws, optimizers, RLHF. The math behind modern language models.
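To make the centerpiece concrete, a minimal NumPy sketch of scaled dot-product attention; the function name and shapes are illustrative, not taken from any particular codebase:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V, the core transformer operation.

    Q, K: (seq_len, d_k); V: (seq_len, d_v).
    mask: optional boolean (seq_len, seq_len) array; True hides a
    position (e.g. a causal mask for autoregressive decoding).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # pairwise similarities
    if mask is not None:
        scores = np.where(mask, -1e9, scores)  # block masked positions
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                         # weighted sum of values
```

Positional encoding and the KV cache are machinery around this operation (the KV cache is just memoization of K and V across decoding steps); scaling laws, optimizers, and RLHF govern how the resulting model is trained.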
Mathematical Maturity
Measure theory, the Radon-Nikodym theorem, convex duality, martingales, information theory. The serious math infrastructure that separates surface-level familiarity from real understanding.
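One example of how these pieces interlock: the KL divergence from the information-theory leg is defined through a Radon-Nikodym derivative from the measure-theory leg.

```latex
% For probability measures P \ll Q (P absolutely continuous w.r.t. Q),
% the Radon-Nikodym derivative dP/dQ exists, and
D_{\mathrm{KL}}(P \,\|\, Q) = \int \log \frac{dP}{dQ} \, dP
```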
Modern Generalization
Where classical theory fails and what replaces it. Implicit bias, double descent, the neural tangent kernel (NTK), benign overfitting, scaling laws. The frontier of understanding why deep learning works.
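One object from this list, stated concretely (the empirical form; the infinite-width theory studies its limit):

```latex
% The neural tangent kernel of a network f(x; \theta): the inner
% product of parameter gradients at two inputs.
\Theta(x, x') = \bigl\langle \nabla_{\theta} f(x; \theta),\;
                             \nabla_{\theta} f(x'; \theta) \bigr\rangle
```

In the infinite-width limit this kernel stays essentially fixed during training, so gradient descent on the network behaves like kernel regression: one precise, if partial, answer to why deep learning works.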
Frontier ML (2025-2026)
Post-training, test-time compute, agents, mixture-of-experts (MoE), Mamba, diffusion, context engineering. The topics that dominate current research and systems work.
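As one concrete sample from this list, a minimal sketch of top-k mixture-of-experts routing; real systems differ in where the softmax and normalization happen, and the names here are illustrative:

```python
import numpy as np

def top_k_gating(x, W_gate, k=2):
    """Route a token to its k highest-scoring experts.

    x: (d,) token representation; W_gate: (d, n_experts) router weights.
    Returns (expert_indices, mixing_weights); the token's output is the
    mixing-weighted sum of the selected experts' outputs.
    """
    logits = x @ W_gate
    top = np.argsort(logits)[-k:]                # k largest router scores
    z = np.exp(logits[top] - logits[top].max())  # stable softmax over the k
    return top, z / z.sum()
```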