CAP Theorem: The “Pick Two” Dilemma
The Theorem
In a distributed data store, you can guarantee at most two of the following three properties at the same time:
- Consistency (Linearizability): Every read receives the most recent write or an error. (Note: This is not the C in ACID, which is about transactions preserving database invariants.)
- Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
- Partition Tolerance: The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.
The “CA” Myth
You will often hear: “Pick CA, CP, or AP.” This is misleading. In a distributed system over a network (like the Internet), Partitions (P) are inevitable. Cables get cut, routers crash, AWS regions go down. You MUST tolerate P. So your real choice is what to give up while a partition lasts: CP vs AP.
- CP (Consistency > Availability): “If the network breaks, refuse the requests we cannot answer correctly so we don’t return old data.” (e.g., Banking, Inventory).
- AP (Availability > Consistency): “If the network breaks, keep serving data, even if it’s slightly old.” (e.g., Twitter/X Timeline, Reddit Comments).
Deep Dive: Google Spanner & TrueTime
Google Spanner is often described as a CA system. How? It uses TrueTime (Atomic Clocks + GPS) to keep clock uncertainty across data centers very small (< 7ms). TrueTime does not report an exact time. It reports an interval that is guaranteed to contain the true time, and Spanner waits out that uncertainty before making a commit visible, so transaction timestamps match the real-time order of commits. That gives Consistency. The high Availability comes mostly from Google’s private, redundant network, which makes partitions rare. Technically, when a partition does happen, Spanner chooses Consistency: the side that cannot reach a majority of replicas stops accepting writes, making it CP. But these pauses are so rare (99.999% availability) that it feels like CA.
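The "wait out the uncertainty" idea can be sketched in a few lines. This is an illustration only, not Spanner's implementation: `FakeTrueTime`, `TTInterval`, and `commit` are hypothetical names, the clock is the local system clock, and the uncertainty `epsilon_s` is a fixed value assumed for the demo. Only the shape of the `now()` / `after()` calls follows the interface described for TrueTime.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # true time is assumed to be no earlier than this
    latest: float    # true time is assumed to be no later than this


class FakeTrueTime:
    """Hypothetical stand-in for TrueTime, backed by the local clock."""

    def __init__(self, epsilon_s: float = 0.007):
        self.epsilon_s = epsilon_s  # assumed fixed clock uncertainty

    def now(self) -> TTInterval:
        t = time.time()
        return TTInterval(t - self.epsilon_s, t + self.epsilon_s)

    def after(self, timestamp: float) -> bool:
        # True only once `timestamp` has definitely passed.
        return self.now().earliest > timestamp


def commit(tt: FakeTrueTime) -> float:
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    commit_ts = tt.now().latest
    while not tt.after(commit_ts):  # the "commit wait"
        time.sleep(0.001)
    return commit_ts


if __name__ == "__main__":
    tt = FakeTrueTime()
    start = time.time()
    ts = commit(tt)
    print(f"waited {time.time() - start:.4f}s before exposing commit at {ts:.3f}")
```

With these assumptions the wait is roughly twice the uncertainty, which is why a small error bound matters: the tighter the clocks, the less latency each commit pays for consistency.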
Interactive Demo: The Split Brain Simulator
Control the Network. See the trade-off.
- Write Data: Update the value on Node A (USA).
- Cut the Network: Create a Partition.
- Read Data: Try to read from Node B (Asia).
- CP Mode: Node B says “I can’t talk to A, so I won’t answer.” (Error 503).
- AP Mode: Node B says “I can’t talk to A, but here is what I have.” (Stale Data).
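The same experiment can be approximated in code. The sketch below is a toy model under stated assumptions, not a real replication protocol: `Cluster`, `Unavailable`, and `run` are hypothetical names, there are only two nodes, and replication is synchronous while the network is healthy. It also writes once more after the cut so that Node B's copy is actually stale when it is read.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """A node refused to answer (the demo's Error 503)."""


@dataclass
class Cluster:
    mode: str  # "CP" or "AP"
    values: dict = field(default_factory=lambda: {"A": "v0", "B": "v0"})
    partitioned: bool = False

    def cut_network(self) -> None:
        self.partitioned = True

    def heal_network(self) -> None:
        # Simplification: on heal, B just copies A's value.
        self.partitioned = False
        self.values["B"] = self.values["A"]

    def write_to_a(self, value: str) -> None:
        if self.partitioned and self.mode == "CP":
            raise Unavailable("A cannot replicate to B, so it rejects the write")
        self.values["A"] = value
        if not self.partitioned:
            self.values["B"] = value  # synchronous replication while healthy

    def read_from_b(self) -> str:
        if self.partitioned and self.mode == "CP":
            raise Unavailable("B cannot reach A, so it will not answer")
        return self.values["B"]


def run(mode: str) -> str:
    cluster = Cluster(mode)
    cluster.write_to_a("v1")      # replicated to B while the network is up
    cluster.cut_network()
    try:
        cluster.write_to_a("v2")  # AP: accepted on A only; CP: rejected
    except Unavailable:
        pass
    try:
        return cluster.read_from_b()
    except Unavailable as err:
        return f"503: {err}"


if __name__ == "__main__":
    print("CP:", run("CP"))
    print("AP:", run("AP"))
```

In AP mode Node B answers with "v1" even though Node A already holds "v2". In CP mode the read fails instead. Neither outcome is a bug: each is the trade-off the mode was chosen for.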
Decision Matrix: When to pick which?
| Scenario | Choice | Why? |
|---|---|---|
| Banking / ATM | CP | You cannot allow a user to withdraw money they don’t have. Show an error instead of a wrong balance. |
| Social Media Feed | AP | It’s okay if a user sees a post 5 seconds late. It’s NOT okay if the feed is blank (Error). |
| Shopping Cart | AP | Never stop a user from adding items. Merge the cart later (Dynamo style). |
| Ticket Booking | CP | You cannot sell the same seat twice. Locking is required. |
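To make the Shopping Cart row concrete, here is a minimal sketch of merging two cart copies that diverged during a partition. It assumes a cart is a mapping from item name to quantity and resolves conflicts by keeping the larger quantity per item. `merge_carts` is a hypothetical helper, and this rule is one simple choice rather than the algorithm any particular store uses. Production systems typically track versions to detect conflicts first and leave the merge rule to the application.

```python
from typing import Dict


def merge_carts(a: Dict[str, int], b: Dict[str, int]) -> Dict[str, int]:
    """Union two cart replicas, keeping the larger quantity for each item."""
    merged = dict(a)
    for item, qty in b.items():
        merged[item] = max(merged.get(item, 0), qty)
    return merged


if __name__ == "__main__":
    cart_on_node_a = {"book": 1, "pen": 2}
    cart_on_node_b = {"pen": 3, "mug": 1}
    print(merge_carts(cart_on_node_a, cart_on_node_b))
```

The merge never loses an added item, which is the property the AP choice is protecting. The cost is that a removal made on only one replica can be undone by the merge, so a deleted item may reappear.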
Summary Comparison
| Feature | CP (Consistency-First) | AP (Availability-First) |
|---|---|---|
| Philosophy | “Better to fail than to lie.” | “Better to give an outdated answer than no answer.” |
| Behavior during P | Returns Error (503). | Returns Stale Data. |
| Ideal For | Banking, Billing, Inventory. | Social Feeds, Comments, Likes. |
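| Examples (typical defaults; many are tunable) | MongoDB, HBase. Redis is often listed here, but its asynchronous replication can lose acknowledged writes. | Cassandra, DynamoDB, CouchDB. |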
Next Step: PACELC
You might ask: “What happens when there is NO Partition?” CAP is silent on this. That’s why we have the PACELC Theorem, which extends CAP: if there is a Partition (P), choose between Availability (A) and Consistency (C); Else (E), when the network is healthy, choose between Latency (L) and Consistency (C).