Prerequisite chain
Prerequisites for GPT Series Evolution
Topics you need before working through GPT Series Evolution. Direct prerequisites are listed first; transitive prerequisites (the chain reachable through them) follow.
Direct prerequisites (8)
- Transformer Architecture (layer 4, tier 2)
- Attention Mechanism Theory (layer 4, tier 2)
- Scaling Laws (layer 4, tier 1)
- BERT and the Pretrain-Finetune Paradigm (layer 4, tier 2)
- Tokenization and Information Theory (layer 4, tier 3)
- Post-Training Overview (layer 5, tier 2)
- Prompt Engineering and In-Context Learning (layer 5, tier 2)
- RLHF and Alignment (layer 4, tier 2)
Reachable through the chain (24)
These topics are not directly cited as prerequisites but are reached transitively by following the chain upward. Working through the direct prerequisites pulls these in.
- Matrix Operations and Properties (layer 0A, tier 1)
- Sets, Functions, and Relations (layer 0A, tier 1)
- Basic Logic and Proof Techniques (layer 0A, tier 2)
- Softmax and Numerical Stability (layer 1, tier 1)
- Feedforward Networks and Backpropagation (layer 2, tier 1)
- Differentiation in Rⁿ (layer 0A, tier 1)
- Vectors, Matrices, and Linear Maps (layer 0A, tier 1)
- Continuity in Rⁿ (layer 0A, tier 1)
- Metric Spaces, Convergence, and Completeness (layer 0A, tier 1)
- Matrix Calculus (layer 1, tier 1)
- The Jacobian Matrix (layer 0A, tier 1)
- The Hessian Matrix (layer 0A, tier 1)
- Eigenvalues and Eigenvectors (layer 0A, tier 1)
- Activation Functions (layer 1, tier 1)
- Convex Optimization Basics (layer 1, tier 1)
- Token Prediction and Language Modeling (layer 3, tier 2)
- Information Theory Foundations (layer 0B, tier 2)
- Common Probability Distributions (layer 0A, tier 1)
- Policy Gradient Theorem (layer 3, tier 1)
- Markov Decision Processes (layer 2, tier 1)
- Concentration Inequalities (layer 1, tier 1)
- Expectation, Variance, Covariance, and Moments (layer 0A, tier 1)
- Random Variables (layer 0A, tier 1)
- Kolmogorov Probability Axioms (layer 0A, tier 1)