Prerequisites for Reinforcement Learning from Human Feedback

These are the topics to cover before working through Reinforcement Learning from Human Feedback. Direct prerequisites are listed first; transitive prerequisites (those reachable by following the chain through them) follow.

Direct prerequisites (4)

Policy Gradient Theorem - layer 3, tier 1
RLHF and Alignment - layer 4, tier 2
Reinforcement Learning for Synthesis Planning - layer 4, tier 3
Reward Design and Reward Misspecification - layer 3, tier 1

Reachable through the chain (258)

These topics are not directly cited as prerequisites but are reached transitively by following the chain upward. Working through the direct prerequisites pulls these in.

Markov Decision Processes - layer 2, tier 1
Convex Optimization Basics - layer 1, tier 1
Differentiation in Rⁿ - layer 0A, tier 1
Sets, Functions, and Relations - layer 0A, tier 1
Basic Logic and Proof Techniques - layer 0A, tier 2
Vectors, Matrices, and Linear Maps - layer 0A, tier 1
Continuity in Rⁿ - layer 0A, tier 1
Metric Spaces, Convergence, and Completeness - layer 0A, tier 1
Matrix Operations and Properties - layer 0A, tier 1
Linear Independence - layer 0A, tier 1
Common Inequalities - layer 0A, tier 1
Common Probability Distributions - layer 0A, tier 1
Exponential Function Properties - layer 0A, tier 1
Integration and Change of Variables - layer 0A, tier 2
Measure-Theoretic Probability - layer 0B, tier 1
Cardinality and Countability - layer 0A, tier 2
Kolmogorov Probability Axioms - layer 0A, tier 1
Random Variables - layer 0A, tier 1
Zermelo-Fraenkel Set Theory - layer 0A, tier 2
Dynamic Programming - layer 0A, tier 1
Graph Algorithms Essentials - layer 0A, tier 2
Greedy Algorithms - layer 0A, tier 2
Inverse and Implicit Function Theorem - layer 0A, tier 2
The Jacobian Matrix - layer 0A, tier 1
Positive Semidefinite Matrices - layer 0A, tier 1
Eigenvalues and Eigenvectors - layer 0A, tier 1
Inner Product Spaces and Orthogonality - layer 0A, tier 1
Matrix Norms - layer 0A, tier 1
Taylor Expansion - layer 0A, tier 1
The Hessian Matrix - layer 0A, tier 1
Vector Calculus Chain Rule - layer 0A, tier 1
Concentration Inequalities - layer 1, tier 1
Expectation, Variance, Covariance, and Moments - layer 0A, tier 1
Joint, Marginal, and Conditional Distributions - layer 0A, tier 1
Triangular Distribution - layer 0A, tier 2
Central Limit Theorem - layer 0B, tier 1
Law of Large Numbers - layer 0B, tier 1
Borel-Cantelli Lemmas - layer 0B, tier 1
Modes of Convergence of Random Variables - layer 0B, tier 1
Characteristic Functions - layer 1, tier 1
Moment Generating Functions - layer 0A, tier 2
Martingale Theory - layer 0B, tier 2
Radon-Nikodym and Conditional Expectation - layer 0B, tier 1
Skewness, Kurtosis, and Higher Moments - layer 1, tier 1
Bayesian State Estimation - layer 2, tier 2
Bayesian Estimation - layer 0B, tier 2
Maximum Likelihood Estimation: Theory, Information Identity, and Asymptotic Efficiency - layer 0B, tier 1
KL Divergence - layer 1, tier 1
Information Theory Foundations - layer 0B, tier 2
Distance Metrics Compared - layer 1, tier 2
Non-Euclidean and Hyperbolic Geometry - layer 1, tier 2
Total Variation Distance - layer 1, tier 1
Method of Moments - layer 0B, tier 2
Shrinkage Estimation and the James-Stein Estimator: Inadmissibility, SURE, and Brown's Characterization - layer 0B, tier 1
Cramér-Rao Bound: Information Inequality, Achievability, and Sharper Variants - layer 0B, tier 1
Fisher Information: Curvature, KL Geometry, and the Natural Gradient - layer 0B, tier 1
Basu's Theorem - layer 0B, tier 3
Sufficient Statistics and Exponential Families - layer 0B, tier 2
Minimax Lower Bounds: Le Cam, Fano, Assouad, and the Reduction to Testing - layer 3, tier 1
Empirical Processes and Chaining - layer 3, tier 2
Rademacher Complexity - layer 3, tier 1
Empirical Risk Minimization - layer 2, tier 1
High-Dimensional Probability (Vershynin) - layer 2, tier 1
Cramér-Wold Theorem - layer 1, tier 2
Loss Functions Catalog - layer 1, tier 1
Logistic Regression - layer 1, tier 1
Data Preprocessing and Feature Engineering - layer 1, tier 1
Linear Regression - layer 1, tier 1
The Elements of Statistical Learning (Hastie, Tibshirani, Friedman) - layer 0B, tier 1
Naive Bayes - layer 1, tier 2
Robust Statistics and M-Estimators - layer 3, tier 2
Minimax and Saddle Points - layer 2, tier 2
Convex Duality - layer 2, tier 1
Subgradients and Subdifferentials - layer 1, tier 1
Winsorization - layer 1, tier 3
Order Statistics - layer 1, tier 2
Sequences and Series of Functions - layer 0A, tier 2
Understanding Machine Learning (Shalev-Shwartz, Ben-David) - layer 1, tier 1
VC Dimension - layer 2, tier 1
Counting and Combinatorics - layer 0A, tier 2
Hypothesis Classes and Function Spaces - layer 2, tier 1
PAC Learning Framework - layer 1, tier 1
Uniform Convergence - layer 2, tier 1
Adaptive Learning Is Not IID - layer 3, tier 2
Bernstein Inequality - layer 2, tier 1
Bennett's Inequality - layer 2, tier 1
Chernoff Bounds - layer 1, tier 1
Hoeffding's Lemma - layer 1, tier 1
Realizability Assumption - layer 2, tier 1
Loss Functions - layer 1, tier 2
Slud's Inequality - layer 2, tier 2
Bias-Complexity Tradeoff - layer 2, tier 2
No-Free-Lunch Theorem - layer 2, tier 2
Glivenko-Cantelli Theorem - layer 2, tier 2
McDiarmid's Inequality - layer 3, tier 1
Sub-Gaussian Random Variables - layer 2, tier 1
Epsilon-Nets and Covering Numbers - layer 3, tier 1
Contraction Inequality - layer 3, tier 2
Sub-Exponential Random Variables - layer 2, tier 1
Chi-Squared Concentration - layer 2, tier 1
Symmetrization Inequality - layer 3, tier 1
Asymptotic Statistics: M-Estimators, Delta Method, LAN - layer 0B, tier 1
Measure Concentration and Geometric Functional Analysis - layer 3, tier 1
Stochastic Processes for ML - layer 2, tier 2
Gaussian Processes in Astronomy - layer 4, tier 3
Gaussian Processes for Machine Learning - layer 4, tier 3
Kernels and Reproducing Kernel Hilbert Spaces - layer 3, tier 2
Dimensionality Reduction Theory - layer 2, tier 2
Principal Component Analysis - layer 1, tier 1
Singular Value Decomposition - layer 0A, tier 1
Gram Matrices and Kernel Matrices - layer 1, tier 1
Matrix Multiplication Algorithms - layer 1, tier 2
The Kernel Trick - layer 2, tier 1
Support Vector Machines - layer 2, tier 1
Perceptron - layer 1, tier 2
Ridge Regression - layer 1, tier 1
Gauss-Markov Theorem - layer 2, tier 1
The Multivariate Normal Distribution - layer 0B, tier 1
Maximum A Posteriori (MAP) Estimation - layer 0B, tier 1
Bayesian Linear Regression - layer 2, tier 1
Conjugate Priors - layer 0B, tier 1
High-Dimensional Covariance Estimation - layer 3, tier 2
Matrix Concentration - layer 3, tier 1
Lasso Regression - layer 2, tier 1
NMF (Nonnegative Matrix Factorization) - layer 2, tier 3
Tensors and Tensor Operations - layer 0A, tier 1
Pandas and NumPy Fundamentals - layer 4, tier 3
Functional Analysis Core - layer 0B, tier 2
Hanson-Wright Inequality - layer 3, tier 2
Regularization Theory - layer 2, tier 2
Bias-Variance Tradeoff - layer 2, tier 2
Elastic Net - layer 2, tier 2
Generalized Additive Models - layer 2, tier 2
MARS (Multivariate Adaptive Regression Splines) - layer 2, tier 3
K-Nearest Neighbors - layer 1, tier 2
AdaBoost - layer 2, tier 2
Decision Trees and Ensembles - layer 2, tier 2
Gradient Boosting - layer 2, tier 1
Gradient Descent Variants - layer 1, tier 1
Cubist and Model Trees - layer 2, tier 3
Overfitting and Underfitting - layer 2, tier 1
XGBoost - layer 2, tier 2
Spectral Clustering - layer 2, tier 2
K-Means Clustering - layer 1, tier 1
Self-Organizing Maps - layer 2, tier 3
t-SNE and UMAP - layer 2, tier 2
PageRank Algorithm - layer 2, tier 2
SVM for RF Classification - layer 4, tier 3
Signals and Systems for ML - layer 1, tier 2
Time Series Forecasting Basics - layer 2, tier 2
Time Series Foundations - layer 2, tier 2
Gaussian Process Regression - layer 3, tier 2
Kernel Methods for Molecules - layer 4, tier 3
Kalman Filter - layer 2, tier 1
No-U-Turn Sampler and Neal's Funnel - layer 3, tier 2
Hamiltonian Monte Carlo - layer 3, tier 2
Metropolis-Hastings Algorithm - layer 2, tier 1
Markov Chain Monte Carlo - layer 2, tier 1
Markov Chains and Steady State - layer 1, tier 2
Monte Carlo Methods - layer 2, tier 1
Gibbs Sampling - layer 2, tier 1
Griddy Gibbs Sampling - layer 2, tier 3
Variance Reduction Techniques - layer 2, tier 2
Importance Sampling - layer 2, tier 1
Number Theory and Machine Learning - layer 4, tier 3
Differential Privacy - layer 3, tier 2
Federated Learning - layer 3, tier 2
Optimizer Theory: SGD, Adam, and Muon - layer 3, tier 1
Adam Optimizer - layer 2, tier 1
Stochastic Gradient Descent Convergence - layer 2, tier 1
Coordinate Descent - layer 2, tier 2
Mirror Descent and Frank-Wolfe - layer 3, tier 2
Online Convex Optimization - layer 3, tier 2
No-Regret Learning - layer 3, tier 2
Projected Gradient Descent - layer 2, tier 2
Proximal Gradient Methods - layer 2, tier 1
Quasi-Newton Methods - layer 2, tier 1
Newton's Method - layer 1, tier 1
Line Search Methods - layer 2, tier 2
Secant Method - layer 1, tier 2
Automatic Differentiation - layer 1, tier 1
Matrix Calculus - layer 1, tier 1
Information Geometry - layer 3, tier 3
Whitening and Decorrelation - layer 2, tier 2
Floating-Point Arithmetic - layer 0A, tier 1
Preconditioned Optimizers: Shampoo, K-FAC, and Natural Gradient - layer 3, tier 2
Conjugate Gradient Methods - layer 2, tier 2
Numerical Linear Algebra - layer 1, tier 2
Riemannian Optimization and Manifold Constraints - layer 3, tier 2
Equivariant Deep Learning - layer 4, tier 2
Convolutional Neural Networks - layer 3, tier 2
Feedforward Networks and Backpropagation - layer 2, tier 1
Activation Functions - layer 1, tier 1
Deep Learning (Goodfellow, Bengio, Courville) - layer 0B, tier 1
Fast Fourier Transform - layer 1, tier 2
Complex Numbers for Fourier - layer 0A, tier 2
Skip Connections and ResNets - layer 2, tier 1
Graph Neural Networks - layer 3, tier 2
Clustering for Gene Expression - layer 4, tier 3
Attention for Protein Structure: AlphaFold and Successors - layer 4, tier 3
Attention Mechanism Theory - layer 4, tier 2
Softmax and Numerical Stability - layer 1, tier 1
Linear Layer: Shapes, Bias, and Memory - layer 2, tier 1
Word Embeddings - layer 2, tier 2
Information Retrieval Foundations - layer 2, tier 1
Transformer Architecture - layer 4, tier 2
Attention Mechanisms History - layer 3, tier 2
Recurrent Neural Networks - layer 3, tier 2
Macroeconomic Time-Series Forecasting - layer 4, tier 3
Byte-Level Language Models - layer 4, tier 3
Tokenization and Information Theory - layer 4, tier 3
Distributional Semantics - layer 2, tier 2
NLP for Economic Text Analysis - layer 4, tier 3
Natural Language Processing Foundations - layer 2, tier 2
RNNs for Signal Sequences - layer 4, tier 3
Token Prediction and Language Modeling - layer 3, tier 2
Hyperbolic Embeddings for Graphs - layer 2, tier 2
Training Dynamics and Loss Landscapes - layer 4, tier 2
Stability and Optimization Dynamics - layer 2, tier 2
Peano Axioms - layer 0A, tier 2
Rejection Sampling - layer 1, tier 2
Squeezed Rejection Sampling - layer 2, tier 3
Burn-in and Convergence Diagnostics - layer 2, tier 2
Coupling Arguments and Mixing Time - layer 3, tier 3
MCMC for Markov Random Fields - layer 3, tier 3
Perfect Sampling - layer 3, tier 3
Slice Sampling - layer 2, tier 3
Multi-Armed Bandits Theory - layer 2, tier 2
Bayesian Optimization for Hyperparameters - layer 3, tier 2
Online Learning and Bandits - layer 3, tier 2
Test-Time Training and Adaptive Inference - layer 5, tier 2
Continuous Thought Machines - layer 5, tier 3
Neural ODEs and Continuous-Depth Networks - layer 4, tier 3
Classical ODEs: Existence, Stability, and Numerical Methods - layer 1, tier 1
Gradient Flow and Vanishing Gradients - layer 2, tier 1
Equilibrium and Implicit-Layer Models - layer 4, tier 2
Implicit Differentiation - layer 2, tier 2
Lyapunov-Based Machine Learning for Chaos - layer 4, tier 3
Nonlinear Dynamics and Chaos Fundamentals - layer 4, tier 3
Physics-Informed Neural Networks - layer 4, tier 2
Divergence, Curl, and Line Integrals - layer 0A, tier 2
Kolmogorov-Arnold Networks (KANs) - layer 4, tier 2
Universal Approximation Theorem - layer 2, tier 1
PDE Fundamentals for Machine Learning - layer 1, tier 2
Stochastic Differential Equations - layer 3, tier 2
Ito's Lemma - layer 3, tier 2
Stochastic Calculus for ML - layer 3, tier 3
Symbolic Regression and Equation Discovery - layer 4, tier 3
Sparse Recovery and Compressed Sensing - layer 4, tier 3
Q-Learning - layer 2, tier 1
Value Iteration and Policy Iteration - layer 2, tier 1
Bellman Equations - layer 2, tier 1
Stochastic Approximation Theory - layer 2, tier 2
Temporal Difference Learning - layer 2, tier 2
Actor-Critic Methods - layer 3, tier 2
Reward Systems and Reinforcement Learning Neuroscience - layer 4, tier 3
Fine-Tuning and Adaptation - layer 3, tier 1
Reinforcement Learning for Drug Discovery - layer 4, tier 3
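
The split between direct and transitively reachable prerequisites described above can be illustrated with a breadth-first traversal of a prerequisite graph. This is only a minimal sketch: the function name `split_prerequisites`, the dictionary layout, and the toy edges below are assumptions made for illustration, not the actual graph or code behind this page.

```python
from collections import deque


def split_prerequisites(graph, topic):
    """Return (direct, transitive) prerequisites of `topic`.

    `graph` maps a topic name to the list of topics it directly cites
    as prerequisites (a hypothetical layout chosen for this sketch).
    """
    direct = list(graph.get(topic, []))
    # Track everything already classified so each topic is listed once
    # and cycles in the graph cannot cause an infinite loop.
    seen = set(direct) | {topic}
    queue = deque(direct)
    transitive = []
    while queue:
        current = queue.popleft()
        for prereq in graph.get(current, []):
            if prereq not in seen:
                seen.add(prereq)
                transitive.append(prereq)
                queue.append(prereq)
    return direct, transitive


# Toy edges for illustration only; they are NOT the real prerequisite graph.
toy_graph = {
    "RLHF": ["Policy Gradient Theorem", "Reward Design"],
    "Policy Gradient Theorem": ["Markov Decision Processes", "Reward Design"],
    "Reward Design": ["Markov Decision Processes"],
    "Markov Decision Processes": ["Random Variables"],
}

direct, transitive = split_prerequisites(toy_graph, "RLHF")
print(direct)
print(transitive)
```

A topic that is both cited directly and reachable through another prerequisite is reported only in the direct list, which matches the description of the transitive set as topics "not directly cited as prerequisites".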