Where this topic leads
Topics that build on Transformer Architecture
Once you understand Transformer Architecture, these are the topics that cite it as a prerequisite. Choose by tier and by the area you want to explore next.
Core flagship topics (7)
- Attention Is All You Need (Paper) · layer 4 · llm-construction
- Chain-of-Thought and Reasoning · layer 5 · llm-construction
- DeepSeek Models · layer 5 · model-timeline
- Hallucination Theory · layer 4 · llm-construction
- Mechanistic Interpretability: Features, Circuits, and Causal Faithfulness · layer 4 · ai-safety
- Tabular Foundation Models as Bayesian Inference Engines · layer 3 · bayesian-ml-frontier
- Vision Transformer Lineage: ViT, DeiT, Swin, MAE, DINOv2, SAM · layer 4 · beyond-llms
Standard topics (18)
- BERT and the Pretrain-Finetune Paradigm · layer 4 · llm-construction
- Claude Model Family · layer 5 · model-timeline
- Cohere Models · layer 4 · model-timeline
- Forgetting Transformer (FoX) · layer 4 · llm-construction
- Gemini and Google Models · layer 5 · model-timeline
- GPT Series Evolution · layer 5 · model-timeline
- Induction Heads · layer 4 · llm-construction
- LLaMA and Open Weight Models · layer 5 · model-timeline
- Mistral Models · layer 4 · model-timeline
- Mixture of Experts · layer 4 · llm-construction
- Model Comparison Table · layer 5 · model-timeline
- Multi-Token Prediction · layer 5 · llm-construction
- Post-Training Overview · layer 5 · llm-construction
- Prompt Engineering and In-Context Learning · layer 5 · llm-construction
- Residual Stream and Transformer Internals · layer 4 · llm-construction
- Speculative Decoding and Quantization · layer 5 · llm-construction
- Structured Output and Constrained Generation · layer 5 · llm-construction
- Tool-Augmented Reasoning · layer 5 · llm-construction
Advanced or specialty topics (6)
- Attention for Protein Structure: AlphaFold and Successors · layer 4 · applied-ml
- Audio Language Models · layer 5 · beyond-llms
- Donut and OCR-Free Document Understanding · layer 5 · llm-construction
- Model Merging and Weight Averaging · layer 5 · llm-construction
- Plan-then-Generate · layer 5 · llm-construction
- Qwen and Chinese Models · layer 5 · model-timeline