PACELC Theorem: The Missing Piece
Why CAP is Boring
The CAP Theorem is great, but it only talks about what happens during a Partition (P). Yet 99% of the time, your network is fine. Does that mean you have no trade-offs to make? No.
The PACELC Formula
Formulated by Daniel Abadi (Yale, 2010), it asks:
If there is a Partition (P), how does the system trade off Availability and Consistency (A vs C)? Else (E), when the system is running normally, how does it trade off Latency and Consistency (L vs C)?
P A C E L C
- PAC: If Partition, pick A or C. (Standard CAP).
- ELC: Else (Healthy), pick Latency or Consistency.
The “ELC” Trade-off
When the network is healthy, you still have to choose between two replication strategies (sketched in code after this list):
- Low Latency (L): Use Asynchronous Replication. The user writes to Master and gets an immediate “Success!”. The Master replicates in the background.
- Risk: Data Loss if Master crashes before replicating.
- High Consistency (C): Use Synchronous Replication. The user writes to Master, Master writes to Slave, Slave confirms, Master confirms to User.
- Risk: High Latency. The user waits for the round-trip to the slave.
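To make the trade-off concrete, here is a minimal in-memory sketch in Python. All class and method names are hypothetical (this is not any real database's API); the point is that the async write acknowledges before replication finishes, while the sync write blocks until every replica confirms.

```python
import threading
import time

class Replica:
    def __init__(self, name: str, delay: float = 0.05):
        self.name = name
        self.delay = delay              # simulated network round-trip
        self.data: dict[str, str] = {}

    def apply(self, key: str, value: str) -> None:
        time.sleep(self.delay)          # network + disk latency
        self.data[key] = value

class Master:
    def __init__(self, replicas: list[Replica]):
        self.data: dict[str, str] = {}
        self.replicas = replicas

    def write_async(self, key: str, value: str) -> None:
        # Low Latency (EL): ack immediately, replicate in the background.
        # Risk: the write is lost if the Master dies before replicating.
        self.data[key] = value
        for r in self.replicas:
            threading.Thread(target=r.apply, args=(key, value)).start()

    def write_sync(self, key: str, value: str) -> None:
        # High Consistency (EC): ack only after every replica confirms.
        # Cost: the caller waits for the slowest replica's round-trip.
        self.data[key] = value
        for r in self.replicas:
            r.apply(key, value)         # blocks until confirmed

master = Master([Replica("slave-1"), Replica("slave-2")])

start = time.perf_counter()
master.write_async("cart:42", "3 items")
print(f"async ack after {time.perf_counter() - start:.3f}s")  # ~0.000s

start = time.perf_counter()
master.write_sync("cart:42", "4 items")
print(f"sync ack after  {time.perf_counter() - start:.3f}s")  # ~0.100s
```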
Examples
- DynamoDB / Cassandra (PA/EL):
- PA: If partition, choose Availability.
- EL: If healthy, choose Low Latency (Async replication).
- MongoDB / HBase (PC/EC):
- PC: If partition, choose Consistency (Stop writes).
- EC: If healthy, choose Consistency (Sync replication to majority).
Deep Dive: Healing Stale Data
In AP (Eventual Consistency) systems, data will drift. How do we fix it?
1. Read Repair (Active)
When a client reads data, it contacts multiple nodes (e.g., A, B, C).
- A says: “v1”
- B says: “v2” (Newer timestamp)
- C says: “v1”
The database sees that B has the newer data, returns “v2” to the user, AND simultaneously updates A and C with “v2”. The read operation itself triggers the repair.
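A minimal sketch of the idea, assuming last-write-wins timestamps (node names and record layout are hypothetical; Cassandra, for example, compares per-cell write timestamps):

```python
# Each replica stores the value plus the timestamp of its last write.
nodes = {
    "A": {"value": "v1", "ts": 100},
    "B": {"value": "v2", "ts": 200},   # newest write wins
    "C": {"value": "v1", "ts": 100},
}

def read_with_repair(replicas: dict) -> str:
    # 1. Collect the record from every contacted replica.
    newest = max(replicas.values(), key=lambda rec: rec["ts"])
    # 2. Push the newest version to any replica holding an older one.
    for name, rec in replicas.items():
        if rec["ts"] < newest["ts"]:
            replicas[name] = dict(newest)
            print(f"repaired node {name} -> {newest['value']}")
    # 3. Return the newest value to the client.
    return newest["value"]

print(read_with_repair(nodes))   # repairs A and C, returns "v2"
```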
2. Hinted Handoff (Temporary)
If Node A is down, Node B accepts the write on its behalf. Node B stores a “hint”: “This data belongs to A”. When A comes back online, B pushes the data to A. This ensures Availability during short outages.
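A sketch of the mechanism, with hypothetical node and method names:

```python
class Node:
    def __init__(self, name: str, up: bool = True):
        self.name, self.up = name, up
        self.data: dict = {}
        self.hints: list = []          # writes held for other nodes

    def write(self, key, value, on_behalf_of=None):
        if on_behalf_of:               # owner unreachable: store a hint
            self.hints.append((on_behalf_of, key, value))
        else:
            self.data[key] = value

    def replay_hints(self, cluster: dict):
        remaining = []
        for owner, key, value in self.hints:
            if cluster[owner].up:
                cluster[owner].data[key] = value   # deliver late
            else:
                remaining.append((owner, key, value))
        self.hints = remaining         # keep hints for still-down nodes

cluster = {"A": Node("A", up=False), "B": Node("B")}
cluster["B"].write("user:1", "alice", on_behalf_of="A")  # A is down
cluster["A"].up = True                                   # A recovers
cluster["B"].replay_hints(cluster)
print(cluster["A"].data)   # {'user:1': 'alice'}
```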
3. Anti-Entropy (Background)
A background process (like Merkle Trees in Cassandra) constantly compares data between nodes and syncs differences.
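A flat, two-node approximation of the idea (a real Merkle tree is hierarchical, so a mismatch can be narrowed down level by level instead of hashing every range; all names here are hypothetical):

```python
import hashlib

def bucket_hash(items: dict) -> str:
    # Hash an entire key range so two nodes can compare it cheaply.
    return hashlib.sha256(repr(sorted(items.items())).encode()).hexdigest()

def buckets(store: dict, n: int = 4) -> list:
    # Partition the keyspace into n ranges (leaf buckets).
    out = [dict() for _ in range(n)]
    for k, v in store.items():
        out[hash(k) % n][k] = v
    return out

node_a = {"k1": "v1", "k2": "v2", "k3": "old"}
node_b = {"k1": "v1", "k2": "v2", "k3": "new"}

# Only ranges whose hashes disagree need their actual keys exchanged.
for i, (ba, bb) in enumerate(zip(buckets(node_a), buckets(node_b))):
    if bucket_hash(ba) != bucket_hash(bb):
        print(f"bucket {i} differs -> sync keys {sorted(set(ba) | set(bb))}")
```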
Real-World System Classification
Understanding how production systems implement PACELC helps you make better design decisions.
| System | Classification | Partition Behavior | Normal Behavior | Use Case |
|---|---|---|---|---|
| DynamoDB | PA/EL | Availability (eventual consistency) | Low Latency (async replication) | Shopping carts, session storage |
| Cassandra | PA/EL | Availability (tunable) | Low Latency (configurable) | Time-series, IoT, messaging |
| MongoDB | PC/EC | Consistency (primary required) | Consistency (majority writes) | User profiles, transactions |
| HBase | PC/EC | Consistency (region unavailable) | Consistency (WAL + sync) | Analytics, large tables |
| CockroachDB | PC/EC | Consistency (Spanner-like) | Consistency (Raft consensus) | Financial data, geo-distributed |
| Redis | PC/– | Consistency (single master) | In-memory (no replication delay) | Cache, real-time leaderboards |
Deep Dive: DynamoDB Consistency
DynamoDB allows you to choose your consistency model per request, giving you fine-grained control over the “L” vs “C” trade-off (a boto3 sketch follows the list below).
- Eventually Consistent Read (Default)
- Mechanism: The request is routed to any of the 3 replicas. It might return data from a replica that hasn’t received the latest write yet.
- Latency: Lowest (p99 < 10ms).
- Cost: 0.5 Read Capacity Units (RCU) per 4KB.
- Use Case: User profiles, Comments, Recommendations.
- Strongly Consistent Read
- Mechanism: The request is routed to the Leader node, which is guaranteed to hold the latest acknowledged write (DynamoDB maintains a leader per partition via Multi-Paxos), so it can answer authoritatively.
- Latency: Higher (Network hop to Leader).
- Cost: 1.0 RCU per 4KB (Double the cost!).
- Use Case: Billing, Inventory Count, Game State.
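With boto3, this choice is literally one request parameter (ConsistentRead on GetItem); the table and key names below are hypothetical:

```python
import boto3

table = boto3.resource("dynamodb").Table("orders")  # hypothetical table

# Eventually consistent read (default): cheapest and fastest,
# but may briefly return stale data.
fast = table.get_item(Key={"order_id": "42"})

# Strongly consistent read: always reflects the latest acknowledged
# write, at double the read-capacity cost.
safe = table.get_item(Key={"order_id": "42"}, ConsistentRead=True)
```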
Deep Dive: Cassandra Tunable Consistency
Cassandra lets you choose consistency per query:
Consistency Levels:
- ONE: Write to 1 node, return immediately (fastest, riskiest)
- QUORUM: Write to a majority (N/2 + 1), then return (balanced)
- ALL: Write to all replicas, then return (slowest, safest)
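With the DataStax Python driver (cassandra-driver), the consistency level is set per statement; the keyspace, table, and data below are hypothetical:

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect("shop")  # hypothetical keyspace

# QUORUM write: with N = 3 replicas, waits for 2 acknowledgements.
write = SimpleStatement(
    "INSERT INTO carts (user_id, item) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(write, ("u42", "book"))

# ONE read: fastest, but may miss a write that hasn't reached this node.
read = SimpleStatement(
    "SELECT item FROM carts WHERE user_id = %s",
    consistency_level=ConsistencyLevel.ONE,
)
rows = session.execute(read, ("u42",))
```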
Formula: W + R > N guarantees strong consistency
- W = replicas that must acknowledge a write
- R = replicas that must respond to a read
- N = Total replicas (typically 3)
Example:
- N = 3, W = 2, R = 2 → 2 + 2 = 4 > 3 ✅ (Consistent)
- N = 3, W = 1, R = 1 → 1 + 1 = 2 < 3 ❌ (Eventual)
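The guarantee is a pigeonhole argument: the W nodes that acknowledged the write and the R nodes consulted by the read must overlap in at least W + R − N nodes, so whenever W + R > N every read touches at least one replica holding the latest write. A tiny sketch:

```python
def quorum_guarantee(w: int, r: int, n: int) -> str:
    # Minimum guaranteed overlap between the write set and the read set.
    overlap = w + r - n
    return "strong" if overlap >= 1 else "eventual"

for w, r in [(1, 1), (2, 2), (3, 1), (1, 3)]:
    print(f"N=3 W={w} R={r} -> {quorum_guarantee(w, r, 3)}")
# N=3 W=1 R=1 -> eventual
# N=3 W=2 R=2 -> strong
# N=3 W=3 R=1 -> strong
# N=3 W=1 R=3 -> strong
```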
Trade-off Table:
| W | R | Consistency | Write Latency | Read Latency | Use Case |
|---|---|---|---|---|---|
| 1 | 1 | Eventual | Low | Low | Logs, metrics (AP/EL) |
| 2 | 2 | Strong | Medium | Medium | User data (balanced) |
| 3 | 1 | Strong | High | Low | Read-heavy (cheap reads, expensive writes) |
| 1 | 3 | Strong | Low | High | Write-heavy (cheap writes, expensive reads) |
Production Pattern: Use QUORUM for both reads and writes (W=2, R=2, N=3) to get strong consistency with reasonable latency.
> [!TIP]
> Interview Insight: When asked “How would you design a globally distributed database?”, mention PACELC and explain the latency vs consistency trade-off during normal operation, not just during failures. This shows depth beyond basic CAP knowledge.