AI Labs Landscape
Factual reference on the major AI research labs and companies: what they build, key technical contributions, and research focus areas. Current landscape as of April 2026.
Why This Matters
Knowing who builds what is not gossip. It determines which papers to read, which APIs to evaluate, which open-weight models to fine-tune, and where the field's technical bets are concentrated. This page is a factual reference, not a ranking or endorsement.
Frontier Labs (Closed-Weight)
OpenAI
Founded: 2015 (as nonprofit), restructured 2019 (capped-profit). Headquarters: San Francisco.
Key models: GPT-2 (2019), GPT-3 (2020), GPT-4 (2023), GPT-4o (2024), o1 (2024), o3 (2025), GPT-5 (2025). Post-GPT-5 iterations through 2026 continue the o-series reasoning line and multimodal GPT family; confirm current versions at openai.com/models before citing specific numbers.
Technical contributions:
- Scaling laws for neural language models (Kaplan et al., 2020; see the numeric sketch after this list)
- In-context learning via large-scale pretraining (GPT-3)
- RLHF pipeline for alignment (InstructGPT, Ouyang et al., 2022)
- Reasoning via reinforcement learning (o-series models)
- Multimodal input/output (GPT-4o: text, image, audio, video)
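The scaling-law result is concrete enough to compute with. A minimal numeric sketch of the parameter scaling law, using the approximate constants reported in the Kaplan et al. paper (treat the exact values as worth re-checking against the paper):

```python
# Parameter-count scaling law from Kaplan et al. (2020):
#   L(N) = (N_c / N) ** alpha_N
# valid when data and compute are not the bottleneck.
# Constants below are the paper's approximate fits.
ALPHA_N = 0.076     # power-law exponent for parameters
N_C = 8.8e13        # critical parameter count (non-embedding)

def predicted_loss(n_params: float) -> float:
    """Predicted cross-entropy test loss in nats/token."""
    return (N_C / n_params) ** ALPHA_N

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss ~{predicted_loss(n):.2f}")
```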
Research focus (2025-2026): Reasoning models (o-series), test-time compute scaling, agentic capabilities, multimodal generation. The o-series models represent a distinct approach: training models to produce extended chain-of-thought reasoning via RL, trading inference compute for accuracy.
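The cheapest way to see the compute-for-accuracy trade is self-consistency sampling (Wang et al., 2022): sample several reasoning chains and majority-vote the final answers. This is not OpenAI's training recipe, just an illustration of the same test-time-compute axis; `sample_chain` below is a hypothetical callable standing in for one sampled model completion.

```python
from collections import Counter

def self_consistency(sample_chain, question: str, k: int = 16) -> str:
    """Spend k inference calls on one question, majority-vote the answers.

    sample_chain: hypothetical callable (question -> final answer string)
    that runs the model once with sampling (temperature > 0) enabled.
    """
    answers = [sample_chain(question) for _ in range(k)]
    best_answer, _count = Counter(answers).most_common(1)[0]
    return best_answer
```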
Business model: API access, ChatGPT subscription products. Closed weights for frontier models.
Anthropic
Founded: 2021 by Dario Amodei, Daniela Amodei, and others (ex-OpenAI). Headquarters: San Francisco.
Key models: Claude 1 (2023), Claude 2 (2023), Claude 3 family (2024: Haiku, Sonnet, Opus), Claude 3.5 Sonnet (June 2024), Claude 3.7 Sonnet (Feb 2025, introducing extended thinking), Claude 4 family (May 2025), Claude 4.5 family and Haiku 4.5 (late 2025), Claude 4.6 family (early 2026), Claude 4.7 Opus (April 2026).
Technical contributions:
- Constitutional AI (Bai et al., 2022): alignment via written principles rather than purely human preference labels
- Mechanistic interpretability: circuit analysis, dictionary learning, and sparse autoencoders for understanding model internals (see the sketch after this list)
- Scaling monosemanticity (Templeton et al., 2024): extracting interpretable features from large models
- Extended thinking: visible chain-of-thought reasoning, introduced with Claude 3.7 Sonnet and carried forward in later Claude models
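The dictionary-learning setup is compact enough to sketch. A minimal sparse autoencoder in PyTorch: reconstruct residual-stream activations through an overcomplete, sparsely activating feature basis. Dimensions and the L1 coefficient are illustrative, not Anthropic's actual configuration.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 512, n_features: int = 4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)  # overcomplete basis
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, acts: torch.Tensor):
        features = torch.relu(self.encoder(acts))  # sparse feature activations
        recon = self.decoder(features)
        return recon, features

def sae_loss(recon, acts, features, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that pushes features toward sparsity.
    mse = (recon - acts).pow(2).mean()
    sparsity = features.abs().sum(dim=-1).mean()
    return mse + l1_coeff * sparsity
```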
Research focus (2025-2026): Interpretability at scale, AI safety evaluations, responsible scaling policies, alignment science. Anthropic publishes more interpretability research than any other frontier lab.
Business model: API access, Claude consumer product. Closed weights.
Google DeepMind
Founded: DeepMind (2010, acquired by Google 2014), Google Brain (2011). Merged into Google DeepMind (2023). Headquarters: London and Mountain View.
Key models: AlphaGo (2016), AlphaFold (2020, 2024), PaLM (2022), Gemini 1.0 (2023), Gemini 1.5 (2024), Gemini 2.0 (late 2024), Gemini 2.5 (2025), Gemma open models (2024-2025, including Gemma 3 and the on-device Gemma 3n).
Technical contributions:
- AlphaFold: solved protein structure prediction (2024 Nobel Prize in Chemistry for Hassabis and Jumper)
- Transformer (Vaswani et al., 2017, at Google Brain)
- Natively multimodal training (Gemini: text, image, audio, video from scratch)
- Long context (Gemini 1.5: 1M+ token context window)
- Mixture-of-Experts architectures at scale (routing sketched after this list)
- Gemma: open-weight model family (roughly 1B-27B across generations) for research and on-device use
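The routing pattern behind MoE layers is simple to sketch, even though production systems add load balancing and parallelism. A toy top-k router in PyTorch, illustrating the general technique rather than Gemini's internals:

```python
import torch
import torch.nn.functional as F

def moe_layer(x, experts, router, k: int = 2):
    """Toy top-k mixture-of-experts routing.

    x: (tokens, d_model); experts: list of per-expert modules;
    router: nn.Linear(d_model, len(experts)).
    """
    weights, idx = torch.topk(F.softmax(router(x), dim=-1), k, dim=-1)
    weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over top-k
    out = torch.zeros_like(x)
    for slot in range(k):
        for e, expert in enumerate(experts):
            mask = idx[:, slot] == e  # tokens whose slot-th choice is expert e
            if mask.any():
                w = weights[mask, slot].unsqueeze(-1)
                out[mask] += w * expert(x[mask])
    return out
```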
Research focus (2025-2026): Multimodal reasoning, science applications (materials, genomics), long context, efficient architectures, agentic systems. The broadest research portfolio of any AI lab, spanning pure science (AlphaFold) to consumer products (Gemini in Google services).
Business model: Integrated into Google products (Search, Workspace, Cloud). Gemini API. Gemma models released open-weight.
xAI
Founded: 2023 by Elon Musk. Headquarters: Austin, Texas.
Key models: Grok-1 (2023, 314B MoE; weights released March 2024), Grok-2 (2024), Grok-3 (2025).
Technical contributions: Large-scale MoE training. Grok-1 was one of the first frontier-scale models released with open weights. Grok-3 trained on one of the largest GPU clusters assembled (100K+ H100s).
Research focus (2025-2026): Scaling compute, real-time information integration (via X/Twitter data), reasoning capabilities.
Business model: Integrated into X platform, API access.
Open-Weight Labs
Meta AI (FAIR)
Founded: Facebook AI Research (FAIR) in 2013. Headquarters: Menlo Park.
Key models: Llama 1 (2023), Llama 2 (2023), Llama 3 (2024), Llama 3.1 (2024, up to 405B), Llama 4 family (2025): Scout, Maverick, and Behemoth, introducing Meta's first frontier MoE models.
Technical contributions:
- Llama family: established the open-weight model ecosystem
- Self-supervised vision: DINO, DINOv2 (self-distillation with no labels; see the sketch after this list)
- JEPA (Joint Embedding Predictive Architecture): Yann LeCun's proposed alternative to autoregressive and generative approaches
- Segment Anything (SAM): foundation model for image segmentation
- No Language Left Behind (NLLB): multilingual translation
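The core trick in DINO-style self-distillation fits in a few lines: the teacher is an exponential moving average (EMA) of the student, and the student learns to match the teacher's outputs across augmentations. A sketch of the EMA update (the general pattern, not Meta's training code):

```python
import torch

@torch.no_grad()
def ema_update(student: torch.nn.Module, teacher: torch.nn.Module,
               momentum: float = 0.996):
    """Teacher weights track a slow exponential moving average of the
    student's; the student is then trained to match teacher outputs
    on different augmentations of the same image (no labels needed)."""
    for p_s, p_t in zip(student.parameters(), teacher.parameters()):
        p_t.mul_(momentum).add_(p_s, alpha=1.0 - momentum)
```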
Research focus (2025-2026): Open-weight frontier models, self-supervised learning, world models (V-JEPA), embodied AI, multilingual coverage. Meta's strategy is to release open-weight models that become ecosystem standards, driving adoption of Meta's infrastructure.
Business model: Models released under permissive licenses. Revenue from Meta platforms (ads), not model APIs.
Mistral AI
Founded: 2023. Headquarters: Paris, France.
Key models: Mistral 7B (2023), Mixtral 8x7B (2023, MoE), Mistral Large (2024), Mistral Medium, Mistral Small, Codestral (code).
Technical contributions:
- Efficient open-weight models: Mistral 7B outperformed Llama 2 13B across standard benchmarks
- MoE at accessible scale: Mixtral 8x7B demonstrated strong MoE results with modest hardware requirements
- Sliding window attention for efficient long-context processing
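Sliding-window attention cuts attention cost from O(n^2) toward O(n * w) by letting each token attend only to a fixed window of recent positions (stacked layers still propagate information further back). A minimal mask construction; the window below is toy-sized, where Mistral 7B used 4096:

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask: True where attention is allowed.
    Each query attends to itself and the window-1 previous positions."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions
    j = torch.arange(seq_len).unsqueeze(0)  # key positions
    return (j <= i) & (j > i - window)      # causal + windowed

print(sliding_window_mask(6, 3).int())
```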
Research focus (2025-2026): Efficient architectures, multilingual European models, enterprise deployment. Mistral occupies a strategic position as the leading European AI lab, with implications for EU AI regulation and data sovereignty.
Business model: Open-weight base models, commercial API, enterprise licenses for larger models.
DeepSeek
Founded: 2023 as an AI subsidiary of High-Flyer (Chinese quantitative hedge fund). Headquarters: Hangzhou, China.
Key models: DeepSeek-Coder (2023), DeepSeek-V2 (2024, 236B MoE), DeepSeek-V3 (Dec 2024, 671B MoE with 37B active), DeepSeek-R1 (Jan 2025; the R1-Zero variant was trained with pure RL and no SFT, while R1 added a small cold-start SFT stage), DeepSeek V3.1 and R2 (2025-2026, continued MoE scaling and improved reasoning chains).
Technical contributions:
- Highly efficient MoE training at frontier scale
- Auxiliary-loss-free load balancing for MoE routing
- Multi-head Latent Attention (MLA) for KV-cache compression
- DeepSeek-R1: demonstrated that long-chain reasoning can emerge from RL alone, without supervised reasoning traces, using rule-based rewards on math and code (sketched after this list)
- Open release of full model weights and detailed technical reports, narrowing the gap between closed frontier labs and open releases
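The rule-based reward idea is worth seeing concretely: no learned reward model, just a verifiable check on the final answer. A sketch assuming a \boxed{} final-answer convention (the exact formats R1 used are described in its report; this is illustrative):

```python
import re

def math_reward(completion: str, ground_truth: str) -> float:
    """Rule-based accuracy reward: 1.0 if the completion's final
    \\boxed{} answer exactly matches the reference, else 0.0."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0  # no parseable final answer -> no reward
    return 1.0 if match.group(1).strip() == ground_truth.strip() else 0.0
```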
Research focus (2025-2026): Efficient pretraining, RL-driven reasoning, open-weight frontier models. DeepSeek has had outsized impact relative to compute budget by publishing detailed engineering decisions.
Business model: Open-weight models plus a commercial API with aggressively low pricing.
Alibaba (Qwen)
Founded: Qwen team within Alibaba Cloud, first public release 2023. Headquarters: Hangzhou, China.
Key models: Qwen (2023), Qwen2 and Qwen2.5 (2024), Qwen3 family (2025-2026: dense models from 0.6B to 32B, Qwen3 MoE variants up to 235B total parameters, and reasoning variants like QwQ).
Technical contributions:
- Among the strongest open-weight Chinese-English bilingual models through 2025-2026
- Qwen2.5-Math and Qwen2.5-Coder: specialized reasoning-and-code variants that competed with closed models on math and programming benchmarks
- Qwen3: scaled reasoning-RL recipes similar to DeepSeek-R1 with open weights across a wide parameter range
Research focus (2025-2026): Open-weight scaling, multilingual coverage, specialized reasoning and code variants, long-context extensions.
Business model: Permissive open-weight releases plus Alibaba Cloud API integration.
Infrastructure and Ecosystem
Cohere
Founded: 2019 by Aidan Gomez (Transformer co-author), Ivan Zhang, Nick Frosst. Headquarters: Toronto.
Key models: Command family (Command, Command R, Command R+, Command A).
Technical contributions:
- Retrieval-Augmented Generation (RAG) as a first-class product feature (pattern sketched after this list)
- Enterprise-focused model design: grounded generation with citations
- Multilingual embeddings (Cohere Embed)
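Vendor APIs aside, the grounded-generation pattern is: retrieve the highest-scoring snippets, then prompt the model to answer only from them, citing snippet numbers. A sketch with hypothetical `embed` and `generate` callables standing in for an embedding model and an LLM call:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def grounded_answer(query: str, documents: list[str], embed, generate) -> str:
    """Minimal RAG with citations (the general pattern, not Cohere's API).
    embed: text -> vector; generate: prompt -> completion (both hypothetical)."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: dot(q, embed(d)), reverse=True)
    snippets = ranked[:3]  # keep the top-3 most similar documents
    prompt = "Answer using only the snippets below; cite them as [n].\n"
    prompt += "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(snippets))
    prompt += f"\n\nQuestion: {query}"
    return generate(prompt)
```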
Research focus (2025-2026): Enterprise NLP, grounded generation with source attribution, domain-specific fine-tuning, efficient deployment.
Business model: Enterprise API, on-premise deployment options.
Together AI
Founded: 2022. Headquarters: San Francisco.
Key focus: Open-source AI infrastructure. Provides APIs for running open-weight models (Llama, Mistral, etc.) at scale. Contributes to open model training and fine-tuning tooling.
Technical contributions:
- FlashAttention integration and optimized inference infrastructure (see the client sketch after this list)
- Open model training runs (RedPajama dataset, Together model series)
- Efficient fine-tuning APIs for open models
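In practice this looks like pointing an OpenAI-compatible client at a hosted open-weight model. Together's API follows that convention; treat the base URL and model ID below as assumptions to verify against current provider docs:

```python
from openai import OpenAI

# OpenAI-compatible endpoint serving open-weight models.
# Base URL and model name are illustrative assumptions.
client = OpenAI(
    base_url="https://api.together.xyz/v1",
    api_key="YOUR_API_KEY",
)
resp = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[{"role": "user",
               "content": "Summarize FlashAttention in one sentence."}],
)
print(resp.choices[0].message.content)
```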
Research focus (2025-2026): Inference optimization, open model ecosystem, making frontier-quality open models accessible.
Business model: Cloud API for open models, enterprise infrastructure.
Hugging Face
Founded: 2016. Headquarters: New York and Paris.
Key focus: The central hub for the open ML ecosystem. Hosts models, datasets, and demo applications (Spaces).
Technical contributions:
- Transformers library: the standard Python interface for pretrained models (usage sketched after this list)
- Hub: hosts 1M+ models and 200K+ datasets as of 2025
- Tokenizers, Datasets, Accelerate, PEFT, TRL libraries
- Democratized access to model weights, training, and evaluation
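Getting from a Hub model card to running inference takes a few lines with Transformers. The model ID below is illustrative; any text-generation checkpoint on the Hub works the same way:

```python
from transformers import pipeline

# Downloads weights and tokenizer from the Hub on first run.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
out = generator("Open-weight models matter because", max_new_tokens=40)
print(out[0]["generated_text"])
```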
Role in 2025-2026: Hugging Face is not a model developer in the frontier sense. It is infrastructure. When Llama, Mistral, or DeepSeek release weights, they land on Hugging Face. When researchers share fine-tunes, adapters, or datasets, they go to the Hub. Understanding the Hugging Face ecosystem is a practical requirement for working with open models.
Safe Superintelligence Inc. (SSI)
Founded: 2024 by Ilya Sutskever (ex-OpenAI Chief Scientist), Daniel Gross, and Daniel Levy. Headquarters: Palo Alto and Tel Aviv.
Key focus: Building safe superintelligence as a single, focused goal. No products, no API, no revenue pressure. Pure research toward superintelligent AI with safety guarantees.
Technical contributions: Not yet public as of early 2026. The significance of SSI is primarily about who is involved (Sutskever co-authored the sequence-to-sequence paper, co-led GPT-2/3/4 development, and was OpenAI's Chief Scientist) and the organizational thesis: that safety and capability research must be unified from the start, not bolted on after.
Research focus: Alignment, scalable oversight, and capability research toward superintelligence. Details are not public.
Structural Observations
The lab landscape in 2026 has several clear patterns:
Compute concentration. Frontier model training requires tens of thousands of GPUs for months. Only a handful of organizations can afford this: OpenAI (Microsoft-backed), Google DeepMind, Meta, Anthropic (Amazon/Google-backed), xAI. This creates a natural oligopoly at the frontier.
Open-weight competition. Meta, Mistral, DeepSeek, Alibaba (Qwen), and others release competitive open-weight models. This compresses the gap between closed and open models, typically to 6-12 months.
Specialization. Labs increasingly differentiate by research focus: Anthropic on interpretability and safety, Google DeepMind on science applications and multimodal, Meta on open-weight ecosystem, OpenAI on reasoning and agentic systems.
Geography. The US dominates (OpenAI, Anthropic, Meta, Google, xAI) but significant capability exists in China (DeepSeek, Qwen, Kimi), Europe (Mistral), and Canada (Cohere).
Common Confusions
Frontier lab does not mean best at everything
Each lab has specific strengths. Google DeepMind leads in science applications (AlphaFold). Anthropic leads in interpretability research. Meta leads in open-weight ecosystem impact. OpenAI leads in reasoning model architecture. No single lab dominates all dimensions.
Open-weight does not mean reproducible
When Meta releases Llama weights, you can run inference and fine-tune. You cannot reproduce the training run. The training data, data processing pipeline, RLHF preference data, and training infrastructure details are not released. "Open-weight" and "open-source" are different claims.
Summary
- OpenAI: GPT series, o-series reasoning models, RLHF pipeline
- Anthropic: Claude, Constitutional AI, mechanistic interpretability
- Google DeepMind: Gemini (multimodal), AlphaFold (science), Gemma (open)
- Meta AI: Llama (open-weight ecosystem), DINO/JEPA (self-supervised vision)
- xAI: Grok, large-scale compute
- Mistral: efficient European open-weight models, MoE
- DeepSeek: efficient MoE, RL-trained reasoning, open frontier weights
- Alibaba (Qwen): bilingual open-weight scaling, reasoning variants
- Cohere: enterprise NLP, RAG, grounded generation
- Together AI: open model infrastructure and inference
- Hugging Face: central hub for models, datasets, and ML tooling
- SSI: Sutskever-led safety-focused superintelligence research
Exercises
Problem
Name one technical contribution unique to each of the following labs: OpenAI, Anthropic, Google DeepMind, Meta AI. Do not repeat the same type of contribution for different labs.
Problem
Explain the strategic difference between Meta's and OpenAI's approach to model release. What economic incentives drive each strategy? What are the implications for the research community?
References
Canonical:
- Kaplan et al., "Scaling Laws for Neural Language Models" (OpenAI, 2020)
- Bai et al., "Constitutional AI: Harmlessness from AI Feedback" (Anthropic, 2022)
- Touvron et al., "LLaMA: Open and Efficient Foundation Language Models" (Meta, 2023)
- Jumper et al., "Highly Accurate Protein Structure Prediction with AlphaFold" (DeepMind, 2021)
Current:
- DeepSeek-AI, "DeepSeek-V3 Technical Report" (2024)
- DeepSeek-AI, "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" (2025)
- Gemini Team, "Gemini: A Family of Highly Capable Multimodal Models" (Google, 2023)
- Jiang et al., "Mistral 7B" (2023)
- Qwen Team, Qwen2.5 and Qwen3 technical reports (Alibaba, 2024-2026)
- Meta AI, Llama 4 technical report (2025)
Next Topics
- Key researchers and ideas: who did what, and what still matters
- Model timeline: chronological reference for major model releases
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Model Timeline (Layer 5)