
Systems

CAP Theorem

In a distributed system, you cannot simultaneously guarantee consistency, availability, and partition tolerance. Brewer's conjecture, the Gilbert-Lynch proof, PACELC, and practical implications for ML infrastructure.


Why This Matters

Any distributed system that stores state (a database, a parameter server, a feature store, a model registry) must choose how to behave when network partitions occur. The CAP theorem proves that no system can simultaneously provide strong consistency, high availability, and partition tolerance. This is not a design limitation; it is a mathematical impossibility result.

For ML practitioners, CAP matters because modern ML infrastructure is distributed. Training clusters use parameter servers or all-reduce protocols. Feature stores serve real-time features across data centers. Model serving systems replicate models for low latency. In each case, the system designer must decide: when a network partition occurs, do you serve stale data (sacrifice consistency) or refuse to serve (sacrifice availability)?

Formal Definitions

The CAP theorem requires precise definitions of its three properties. Informal statements ("pick two of three") obscure the actual result.

Definition

Consistency (Linearizability)

A distributed system provides consistency (in the CAP sense) if it is linearizable: every read operation returns the value of the most recent completed write, as if all operations were executed on a single copy of the data in some sequential order consistent with real-time ordering.

Formally: there exists a total order on all operations such that (1) the order is consistent with real-time precedence (if operation $A$ completes before operation $B$ starts, then $A$ precedes $B$ in the order), and (2) every read of object $X$ returns the value written by the most recent preceding write to $X$ in this order.

This is the strongest standard consistency model. It is strictly stronger than sequential consistency, causal consistency, or eventual consistency.

Definition

Availability

A distributed system provides availability if every request received by a non-failing node eventually receives a response, with no bound on how long the response may take.

In the Gilbert-Lynch formalization: every request to a non-failing node must result in a response. A system that responds with an error message or timeout is not available under this definition. A system that queues requests indefinitely is not available either.

Definition

Partition Tolerance

A distributed system provides partition tolerance if it continues to operate correctly (satisfying its consistency and availability guarantees) despite arbitrary message loss or delay between nodes.

A network partition divides the nodes into two or more groups that cannot communicate with each other. Partition tolerance means the system must handle this scenario; it cannot assume the network is reliable.

Watch Out

Partition tolerance is not optional in real systems

In a distributed system, network partitions will occur. Hardware fails, switches malfunction, cables get cut, cloud availability zones lose connectivity. A system that does not tolerate partitions is a system that runs on a single node (or assumes a perfectly reliable network). The practical version of CAP is not "pick 2 of 3" but rather "given that P is required, choose between C and A during partitions."

The Impossibility Result

Theorem

CAP Impossibility Theorem

Statement

In an asynchronous distributed system, it is impossible to implement a read/write data object that simultaneously guarantees all three of the following properties:

  1. Consistency (linearizability)
  2. Availability (every request to a non-failing node receives a response)
  3. Partition tolerance (the system operates correctly despite arbitrary network partitions)

Any distributed system providing partition tolerance must sacrifice either consistency or availability during a network partition.

Intuition

Suppose two nodes $N_1$ and $N_2$ each hold a copy of variable $X$, initially $X = v_0$. A network partition separates them. A client writes $X = v_1$ to $N_1$. Another client reads $X$ from $N_2$.

Node $N_2$ cannot communicate with $N_1$ (partition), so it does not know about the write. It has two choices:

  • Return the stale value $v_0$. This violates consistency (the read should return $v_1$).
  • Refuse to respond until the partition heals. This violates availability.

There is no third option. The partition makes it impossible to both return the correct value and return any value at all.
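The two choices can be sketched in a few lines of Python. This is an illustration of the dilemma, not a real replication protocol; the `Node`, `write`, and `read` names are invented for this sketch.

```python
# A minimal sketch (not a real protocol) of the two-node scenario above:
# during a partition, a read at N2 must either return a possibly stale
# value (AP behavior) or refuse to answer (CP behavior).

class Node:
    def __init__(self, value):
        self.value = value
        self.reachable = True   # can replication messages reach this node?

def write(node, peers, value):
    """Write locally, then replicate to whichever peers are reachable."""
    node.value = value
    for peer in peers:
        if peer.reachable:      # replication fails silently under a partition
            peer.value = value

def read(node, mode):
    if mode == "CP" and not node.reachable:
        # The node cannot rule out a newer write on the other side of the
        # partition, so it refuses to answer: availability is sacrificed.
        raise TimeoutError("partitioned: refusing possibly stale read")
    return node.value           # AP: answer with the local (maybe stale) copy

n1, n2 = Node("v0"), Node("v0")
n2.reachable = False            # partition separates N1 from N2
write(n1, [n2], "v1")           # client A writes v1 at N1

print(read(n2, "AP"))           # prints v0: stale read, consistency violated
try:
    read(n2, "CP")
except TimeoutError as err:
    print(err)                  # refusal instead: availability violated
```

Either branch loses one of the two properties; there is no code path that returns $v_1$ from $N_2$ while the partition holds.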

Proof Sketch

(Following Gilbert and Lynch, 2002.)

Assume for contradiction that a system provides all three properties: consistency (linearizability), availability, and partition tolerance.

Consider two nodes $N_1$ and $N_2$ connected by a network that may partition. Initialize the shared variable $X = v_0$ at both nodes.

Step 1. Partition the network so that $N_1$ and $N_2$ cannot communicate.

Step 2. Client $A$ sends a write request $X \leftarrow v_1$ to $N_1$. By availability, $N_1$ must acknowledge the write. Since the partition prevents communication, $N_2$ does not learn about this write.

Step 3. Client $B$ sends a read request for $X$ to $N_2$. By availability, $N_2$ must respond. The only value $N_2$ knows is $v_0$ (the partition prevents it from learning $v_1$). So $N_2$ returns $v_0$.

Step 4. The write of $v_1$ completed before the read started (in real time). By linearizability, the read must return $v_1$. But $N_2$ returned $v_0$. Contradiction.

Therefore, no system can provide all three properties simultaneously.

For the partially synchronous model (bounded message delay), Gilbert and Lynch show a similar result: if the partition lasts longer than the message delay bound, the same impossibility holds.

Why It Matters

The CAP theorem is one of the few impossibility results in distributed systems with direct engineering consequences. It forces every distributed storage system to make an explicit choice:

  • CP systems (consistency + partition tolerance): during a partition, some requests are rejected. Examples: ZooKeeper, etcd, HBase in strict mode. These systems are suitable when correctness matters more than uptime (configuration stores, lock services, leader election).

  • AP systems (availability + partition tolerance): during a partition, all nodes respond but may return stale data. Examples: Cassandra, DynamoDB (default settings), CouchDB. These systems are suitable when uptime matters more than perfect consistency (shopping carts, social media feeds, caching layers).

  • CA systems (consistency + availability, no partition tolerance): only possible on a single node or with a perfectly reliable network. Traditional single-node relational databases are CA. In practice, any system deployed across a network must handle partitions, so CA is not achievable in a distributed setting.

Failure Mode

The theorem says nothing about what happens when there is no partition. During normal operation (no partition), a system can provide both consistency and availability. The trade-off only activates during partitions. Many practitioners incorrectly interpret CAP as "you can only ever have two properties," when the correct reading is "during a partition, you must choose between C and A."

The theorem also uses the strongest form of consistency (linearizability). Weaker consistency models (causal consistency, eventual consistency) are achievable alongside availability and partition tolerance. The theorem does not say all consistency is impossible under partitions; it says the strongest form is.

The PACELC Extension

CAP only describes behavior during partitions. Abadi (2012) proposed PACELC to also characterize the trade-off during normal operation.

Definition

PACELC

The PACELC framework extends CAP:

  • If there is a Partition, the system trades off between Availability and Consistency (this is the CAP trade-off).
  • Else (during normal operation, no partition), the system trades off between Latency and Consistency.

Even without partitions, enforcing strong consistency requires coordination between replicas (synchronous replication, consensus protocols), which adds latency. A system can serve reads from local replicas with low latency but risk staleness, or coordinate with the primary to ensure freshness at the cost of higher latency.
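A toy sketch of this "else" branch, with invented latency numbers (no real system's API is modeled): a consistent read pays a round trip to the leader, while a local read is fast but may lag by the replication delay.

```python
# Illustrative numbers only: the EL/EC trade-off during normal operation.
LEADER_RTT_MS = 80        # assumed cross-region round trip to the leader
LOCAL_RTT_MS = 1          # assumed round trip to a same-zone replica
REPLICATION_LAG_MS = 150  # assumed asynchronous replication lag

def read_cost(choice):
    """Return (latency, worst-case staleness) for a read strategy."""
    if choice == "consistent":   # the EC choice: coordinate for freshness
        return {"latency_ms": LEADER_RTT_MS, "max_staleness_ms": 0}
    if choice == "local":        # the EL choice: fast but possibly stale
        return {"latency_ms": LOCAL_RTT_MS,
                "max_staleness_ms": REPLICATION_LAG_MS}
    raise ValueError(choice)

print(read_cost("consistent"))  # {'latency_ms': 80, 'max_staleness_ms': 0}
print(read_cost("local"))       # {'latency_ms': 1, 'max_staleness_ms': 150}
```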

PACELC classifications of common systems:

| System | During Partition | Normal Operation |
| --- | --- | --- |
| ZooKeeper | PC (refuse some requests) | EC (pay latency for consistency) |
| Cassandra | PA (serve stale data) | EL (low latency, eventual consistency) |
| DynamoDB (default) | PA | EL |
| Spanner | PC | EC (uses TrueTime for consistency) |
| PostgreSQL (single-node) | CA (no partition tolerance) | EC |

Practical Trade-offs

Eventual Consistency

Most AP systems provide eventual consistency: if no new writes occur, all replicas will eventually converge to the same value. This is formalized as:

Definition

Eventual Consistency

A replicated data store is eventually consistent if, whenever no new updates are made to a given data item, all replicas of that item will eventually return the same value. There is no bound on how long convergence takes.

Formally: for every execution, if there exists a time $T$ after which no writes occur, then there exists a time $T' \geq T$ after which all reads return the same value.

Eventual consistency is weak. It says nothing about what happens during the convergence period. Reads may return arbitrarily old values, different replicas may return different values simultaneously, and there is no bound on convergence time. Stronger models (causal consistency, read-your-writes, monotonic reads) provide better guarantees while remaining achievable under partitions.

Consensus Protocols

CP systems typically use distributed consensus protocols (Paxos, Raft, ZAB) to maintain consistency across replicas. These protocols guarantee that a majority of nodes agree on every write before it is committed. During a partition, the minority partition cannot form a majority and therefore cannot serve writes (sacrificing availability for the minority side).

The cost: every write requires communication with a majority of replicas. For a system with $2f + 1$ replicas tolerating $f$ failures, each write requires $f + 1$ acknowledgments. This adds latency proportional to the round-trip time to the slowest node in the fastest-responding majority.
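The latency claim can be made concrete: with $2f + 1$ replicas, a write commits once the $(f+1)$-th fastest replica acknowledges it. A small sketch, with made-up replica round-trip times:

```python
def quorum_write_latency(rtt_ms):
    """Latency of a majority-quorum write: the (f+1)-th fastest ack,
    for a replica group of odd size n = 2f + 1."""
    n = len(rtt_ms)
    assert n % 2 == 1, "use an odd replica count 2f + 1"
    f = n // 2
    quorum = f + 1
    # The write commits when the quorum-th fastest replica responds;
    # the f slowest replicas never delay it.
    return sorted(rtt_ms)[quorum - 1]

# 5 replicas (f = 2): the two slowest replicas (120 ms, 300 ms) do not
# delay the write, but the third-fastest one does.
print(quorum_write_latency([2, 5, 40, 120, 300]))  # prints 40
```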

Relevance to ML Infrastructure

Distributed training. AllReduce-based training requires all workers to synchronize gradients at each step. This is a CP-style design: if any worker becomes unreachable (partition), training stalls (sacrifices availability). Asynchronous SGD is an AP-style design: workers use stale gradients (sacrifices consistency) but training continues. The Hogwild! result shows that for sparse problems, asynchronous updates with stale gradients still converge, but the convergence rate degrades with staleness.
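The qualitative effect of staleness can be shown on a toy 1-D problem: gradient descent on $f(x) = x^2$ still converges when each step uses a gradient computed several iterations earlier, just more slowly and with oscillation. This illustrates the general point, not the Hogwild! analysis itself.

```python
def sgd_with_stale_gradients(staleness, steps=300, lr=0.1, x0=10.0):
    """Minimize f(x) = x^2 using gradients delayed by `staleness` steps."""
    history = [x0]
    x = x0
    for t in range(steps):
        x_stale = history[max(0, t - staleness)]  # parameters the worker saw
        grad = 2 * x_stale                        # f'(x) = 2x, evaluated late
        x = x - lr * grad
        history.append(x)
    return x

print(abs(sgd_with_stale_gradients(0)) < 1e-6)   # True: synchronous, fast
print(abs(sgd_with_stale_gradients(5)) < 1e-2)   # True: stale, still converges
```

With larger staleness or a larger step size the delayed updates oscillate more and eventually diverge, which is why staleness bounds matter in asynchronous training.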

Feature stores. A feature store serving real-time features for model inference faces CAP directly. If the feature store is partitioned from its source of truth, it can serve cached (stale) features (AP) or return errors (CP). Most production feature stores choose AP with bounded staleness: serve cached features if the cache is less than $T$ seconds old, fail otherwise.
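A sketch of the bounded-staleness pattern: the `FeatureCache` class and its parameters are invented for illustration; real feature stores expose different APIs.

```python
import time

class FeatureCache:
    """Serve a cached feature only if it is fresher than max_staleness_s."""
    def __init__(self, max_staleness_s):
        self.max_staleness_s = max_staleness_s
        self._store = {}                    # key -> (value, write_timestamp)

    def put(self, key, value, now=None):
        self._store[key] = (value, time.time() if now is None else now)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        value, written = self._store[key]
        if now - written > self.max_staleness_s:
            # Too stale: behave like a CP system and fail the request.
            raise TimeoutError(f"{key!r} older than {self.max_staleness_s}s")
        return value                        # fresh enough: serve AP-style

cache = FeatureCache(max_staleness_s=30)
cache.put("user_clicks_7d", 42, now=1000.0)
print(cache.get("user_clicks_7d", now=1010.0))   # prints 42 (10 s old)
try:
    cache.get("user_clicks_7d", now=1100.0)      # 100 s old: rejected
except TimeoutError as err:
    print(err)
```

The `now` parameter is only there to make the example deterministic; a real cache would use the clock directly.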

Model serving. Replicated model servers are typically AP: each replica serves its local copy of the model. During a model update, some replicas may briefly serve the old version while others serve the new version. This is usually acceptable because model updates are infrequent and the old model is not "wrong," just slightly outdated.

Parameter servers. The original parameter server architecture (Li et al., 2014) supports both synchronous (CP-like) and asynchronous (AP-like) updates. The choice depends on the model: large sparse models (recommendation systems) tolerate asynchronous updates well; dense models (language models) are more sensitive to staleness.

Common Confusions

Watch Out

CAP does not mean pick any two and get a working system

The "pick 2 of 3" framing suggests three equally viable combinations. In practice, partition tolerance is required for any distributed system (partitions happen). The real choice is between CP and AP during partitions. A "CA" system is a single-node system or one that crashes on partition, which is not useful for distributed deployments.

Watch Out

CAP says nothing about performance

The theorem is about possibility, not performance. A CP system might have terrible latency even without partitions (due to consensus overhead). An AP system might have poor consistency even without partitions (due to replication lag). CAP describes the fundamental trade-off boundary, not the engineering quality of a specific system. PACELC addresses this gap.

Watch Out

Eventual consistency is not the same as no consistency

An eventually consistent system guarantees convergence in the absence of new writes. This is much stronger than "anything goes." Conflict resolution mechanisms (last-writer-wins, vector clocks, CRDTs) ensure that convergence happens deterministically. The weakness is that there is no time bound on convergence and no guarantee about intermediate states.
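Last-writer-wins is the simplest of these mechanisms. A sketch of why it converges deterministically: each write carries a (timestamp, node id) tag, and merging keeps the largest tag, so every replica reaches the same value regardless of delivery order. Tags and values here are invented.

```python
def lww_merge(a, b):
    """Keep the write with the larger (timestamp, node_id) tag.
    The node id breaks timestamp ties deterministically."""
    return max(a, b, key=lambda write: (write[0], write[1]))

# (timestamp, node_id, value) triples produced on two sides of a partition
w1 = (10, "n1", "cart=[book]")
w2 = (12, "n2", "cart=[book, pen]")

start = (0, "", "cart=[]")
replica_a = lww_merge(lww_merge(start, w1), w2)  # sees w1 first, then w2
replica_b = lww_merge(lww_merge(start, w2), w1)  # opposite delivery order

print(replica_a == replica_b)   # True: both replicas converge
print(replica_a[2])             # cart=[book, pen]  (the later write wins)
```

The convergence is deterministic, but note what is lost: the write tagged `(10, "n1")` is silently discarded, which is exactly why LWW is acceptable for a cache and dangerous for a ledger.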

Key Takeaways

  • CAP theorem: no distributed system can provide linearizability, availability, and partition tolerance simultaneously
  • The proof is by contradiction: during a partition, a node must either return stale data (violating C) or not respond (violating A)
  • In practice, partition tolerance is required, so the choice is CP or AP during partitions
  • PACELC extends CAP to also characterize latency vs. consistency trade-offs during normal operation
  • Eventual consistency guarantees convergence but provides no time bound or intermediate-state guarantees
  • CP systems (ZooKeeper, etcd) use consensus protocols; AP systems (Cassandra, DynamoDB) use eventual consistency
  • ML infrastructure (parameter servers, feature stores, model serving) faces these trade-offs directly

Exercises

Exercise (Core)

Problem

A distributed key-value store has three nodes: $N_1$, $N_2$, $N_3$. A network partition separates $\{N_1\}$ from $\{N_2, N_3\}$. The system uses majority quorum for writes (a write succeeds if acknowledged by at least 2 of 3 nodes).

(a) Can a client connected to $N_1$ successfully write a value? Why or why not?
(b) Can a client connected to $N_2$ successfully write a value? Why or why not?
(c) Is this system CP or AP? Explain.

Exercise (Core)

Problem

Classify each of the following systems as CP, AP, or CA under the PACELC framework, and state the normal-operation trade-off (EL or EC):

(a) A single PostgreSQL server with no replication.
(b) A Cassandra cluster with replication factor 3 and consistency level ONE (reads and writes require only 1 replica).
(c) A ZooKeeper ensemble with 5 nodes.

Exercise (Advanced)

Problem

The CAP theorem assumes the strongest consistency model (linearizability). Show that if we weaken consistency to causal consistency, the impossibility disappears: it is possible to build a system that is causally consistent, available, and partition-tolerant.

Describe the key idea of how such a system works (you do not need a full protocol, but describe the mechanism that avoids the CAP impossibility argument).

References

Canonical:

  • Brewer, "Towards Robust Distributed Systems" (PODC 2000 keynote), the original CAP conjecture
  • Gilbert & Lynch, "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services" (2002), the formal proof, Sections 2-4
  • Gilbert & Lynch, "Perspectives on the CAP Theorem" (2012), clarifications and extensions

Extensions and Corrections:

  • Abadi, "Consistency Tradeoffs in Modern Distributed Database System Design" (2012), the PACELC framework
  • Brewer, "CAP Twelve Years Later: How the 'Rules' Have Changed" (2012), Brewer's own retrospective and corrections to common misinterpretations

Distributed Systems Context:

  • Kleppmann, Designing Data-Intensive Applications (2017), Chapter 9, consistency and consensus with practical examples
  • Lynch, Distributed Algorithms (1996), Chapters 13, 17, formal models of distributed computing and impossibility results

Last reviewed: April 2026
