Prerequisite chain
Prerequisites for Distributed Training Theory
Topics you need before working through Distributed Training Theory. Direct prerequisites are listed first; transitive prerequisites (the chain reachable through them) follow.
Direct prerequisites (2)
- Optimizer Theory: SGD, Adam, and Muon (layer 3, tier 1)
- Parallel Processing Fundamentals (layer 5, tier 2)
Reachable through the chain (16)
These topics are not directly cited as prerequisites but are reached transitively by following the chain upward; working through the direct prerequisites pulls these in. A sketch of how this reachable set is computed follows the list.
- Convex Optimization Basics (layer 1, tier 1)
- Differentiation in Rⁿ (layer 0A, tier 1)
- Sets, Functions, and Relations (layer 0A, tier 1)
- Basic Logic and Proof Techniques (layer 0A, tier 2)
- Vectors, Matrices, and Linear Maps (layer 0A, tier 1)
- Continuity in Rⁿ (layer 0A, tier 1)
- Metric Spaces, Convergence, and Completeness (layer 0A, tier 1)
- Matrix Operations and Properties (layer 0A, tier 1)
- Adam Optimizer (layer 2, tier 1)
- Gradient Descent Variants (layer 1, tier 1)
- Stochastic Gradient Descent Convergence (layer 2, tier 1)
- Concentration Inequalities (layer 1, tier 1)
- Common Probability Distributions (layer 0A, tier 1)
- Expectation, Variance, Covariance, and Moments (layer 0A, tier 1)
- Random Variables (layer 0A, tier 1)
- Kolmogorov Probability Axioms (layer 0A, tier 1)
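For readers curious how the transitive list above is derived, here is a minimal Python sketch of the upward reachability walk described above. The `PREREQS` mapping, the `transitive_prereqs` helper, and the topic strings are illustrative stand-ins, not the actual data model behind this page.

```python
from collections import deque

# Hypothetical prerequisite graph: topic -> its direct prerequisites.
# Only a few edges are shown; the shape of the data is what matters.
PREREQS = {
    "Distributed Training Theory": [
        "Optimizer Theory: SGD, Adam, and Muon",
        "Parallel Processing Fundamentals",
    ],
    "Optimizer Theory: SGD, Adam, and Muon": [
        "Adam Optimizer",
        "Stochastic Gradient Descent Convergence",
    ],
    "Adam Optimizer": ["Gradient Descent Variants"],
    "Gradient Descent Variants": ["Convex Optimization Basics"],
}

def transitive_prereqs(topic: str) -> set[str]:
    """Walk the prerequisite chain upward, breadth-first.

    Returns every topic reachable from `topic`, excluding the
    direct prerequisites themselves (they are listed separately).
    """
    direct = set(PREREQS.get(topic, []))
    reached: set[str] = set()
    queue = deque(direct)
    while queue:
        current = queue.popleft()
        for prereq in PREREQS.get(current, []):
            if prereq not in reached and prereq not in direct:
                reached.add(prereq)
                queue.append(prereq)
    return reached

print(sorted(transitive_prereqs("Distributed Training Theory")))
```

A breadth-first walk with a visited set ensures each topic is counted once even when several chains converge on a shared foundation (as many of the layer 0A topics above do).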