Beyond CAP

If you’ve studied system design for 5 minutes, you’ve heard of the CAP Theorem: Consistency, Availability, Partition Tolerance. Pick two!

But in the real world, CAP is too binary. It only describes what happens when the system is broken (during a network partition).

Staff engineers care more about what happens when the system is running normally. This is where PACELC comes in.


1. PACELC: The Real World Trade-off

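PACELC extends CAP with a second question. If there is a Partition (P), you choose between Availability (A) and Consistency (C). Else (E), during normal operation, you choose between Latency (L) and Consistency (C).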

In a healthy system, you are always trading speed (Latency) for accuracy (Consistency).

The “Latency vs Consistency” Choice

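  • Strong Consistency: Wait for all replicas (or, in quorum-based systems, a majority) to acknowledge a write before responding.
    • Result: Safe data, but the response is only as fast as the slowest replica you wait for.
  • Eventual Consistency: Return a “success” response as soon as one node has the data, and replicate to the others in the background.
    • Result: Blazing fast, but a user might read stale data until replication catches up (often milliseconds, longer under load or failures).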
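To make the trade-off concrete, here is a minimal sketch of how the number of acknowledgements you wait for sets the latency the client sees. The function name `write_latency_ms` and the per-replica timings are made up for illustration, and the sketch assumes all replicas are written in parallel.

```python
def write_latency_ms(replica_latencies_ms, acks_required):
    """Latency seen by the client when it waits for `acks_required` acks.

    Replicas are written in parallel, so the client waits for the
    k-th fastest acknowledgement, where k = acks_required.
    """
    if not 1 <= acks_required <= len(replica_latencies_ms):
        raise ValueError("acks_required must be between 1 and the replica count")
    return sorted(replica_latencies_ms)[acks_required - 1]


# Hypothetical ack times for one write, in milliseconds.
latencies = [4, 9, 120]

strong = write_latency_ms(latencies, acks_required=len(latencies))  # wait for all
eventual = write_latency_ms(latencies, acks_required=1)             # wait for one

print(f"wait for all replicas: {strong} ms")
print(f"wait for one replica:  {eventual} ms")
```

With these numbers, one slow replica drags the "wait for all" write to 120 ms, while the "wait for one" write returns in 4 ms.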

2. Interactive: Latency vs Consistency Calculator

See how replication strategy affects p99 response times.

[Interactive widget: a Client writing to replicas N1 and N2, with a calculated p99 latency readout.]
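A rough offline stand-in for this kind of calculation is a small Monte Carlo simulation. The latency model below (2 to 10 ms normally, a 2% chance of a 50 to 200 ms stall) is an assumption chosen purely for illustration, as is the function name `simulate_p99_ms`; real systems have their own distributions.

```python
import random
import statistics


def simulate_p99_ms(num_replicas, acks_required, trials=20_000, seed=42):
    """Estimate p99 write latency when waiting for `acks_required` acks.

    Assumed latency model (illustrative only): each replica acks in
    2-10 ms, but 2% of the time it stalls and takes 50-200 ms.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        acks = []
        for _ in range(num_replicas):
            if rng.random() < 0.02:
                acks.append(rng.uniform(50, 200))  # slow outlier
            else:
                acks.append(rng.uniform(2, 10))    # normal case
        acks.sort()
        samples.append(acks[acks_required - 1])    # k-th fastest ack
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    return statistics.quantiles(samples, n=100)[98]


p99_one = simulate_p99_ms(num_replicas=2, acks_required=1)
p99_all = simulate_p99_ms(num_replicas=2, acks_required=2)
print(f"wait for 1 of 2 replicas: p99 = {p99_one:.1f} ms")
print(f"wait for 2 of 2 replicas: p99 = {p99_all:.1f} ms")
```

Under this model, waiting for both replicas means roughly 4% of writes hit at least one stall, so the p99 lands in the stall range. Waiting for one replica only stalls when both are slow at once, so the p99 stays in the normal range.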

3. Harvest vs Yield

This mental model comes from Armando Fox and Eric Brewer's 1999 paper “Harvest, Yield, and Scalable Tolerant Systems”, and the same idea is now applied at companies like Google.

  • Yield: The fraction of requests that complete successfully (What % of requests succeeded?).
  • Harvest: The fraction of the full data set reflected in a response (How much of the data did we see?).
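Both metrics are simple ratios. The sketch below uses hypothetical helper names and made-up counts, and it assumes data is spread evenly across shards so that "shards answered" is a fair proxy for "data seen".

```python
def yield_fraction(requests_completed, requests_received):
    """Yield: share of incoming requests that got an answer at all."""
    return requests_completed / requests_received


def harvest_fraction(shards_answered, shards_total):
    """Harvest: share of the data reflected in one answer.

    Assumes data is evenly distributed across shards.
    """
    return shards_answered / shards_total


# Made-up numbers for illustration.
y = yield_fraction(requests_completed=9_990, requests_received=10_000)
h = harvest_fraction(shards_answered=999, shards_total=1_000)
print(f"yield = {y:.1%}, harvest = {h:.1%}")
```

The two can move independently: a system can answer every request (100% yield) while each answer covers only part of the data (less than 100% harvest).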

The Staff Insight:

In a 1,000-node cluster, at almost any given moment at least one node will be slow (garbage collection, a noisy neighbor, a failing disk).
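A quick back-of-the-envelope check shows why fan-out makes a slow node almost inevitable. The sketch assumes each node is independently slow on 1% of requests; both the 1% figure and the independence assumption are illustrative simplifications.

```python
def p_any_slow(num_nodes, p_slow_per_node):
    """Probability that at least one of `num_nodes` is slow on a request,
    assuming nodes are slow independently of each other."""
    return 1 - (1 - p_slow_per_node) ** num_nodes


for n in (1, 10, 100, 1000):
    print(f"{n:>5} nodes: {p_any_slow(n, 0.01):.4%} chance at least one is slow")
```

At 1,000 nodes the probability is above 99.99%, so a request that waits for every node is, in practice, waiting for a straggler.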

  • Low-Level Thinking: Wait for that 1 node to respond so the result is “correct”.
  • Staff-Level Thinking: If that node takes more than 100 ms, ignore it! Give the user a 99.9% complete result now rather than a 100.0% complete result in 5 seconds.

[!TIP] This is called Graceful Degradation. It is better to show “99 Friends” instead of 100 than to show a “504 Gateway Timeout” error.
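One way to implement this is a scatter-gather call with a deadline: query every shard in parallel, return whatever has arrived when the deadline passes, and report the harvest. This is a minimal sketch using `asyncio`. `query_shard` is a hypothetical stand-in that sleeps instead of calling a real backend, and the delays and deadline are made up.

```python
import asyncio


async def query_shard(shard_id, delay_s):
    """Hypothetical shard call: sleeps to simulate work, then returns hits."""
    await asyncio.sleep(delay_s)
    return [f"shard{shard_id}-hit"]


async def scatter_gather(shard_delays_s, deadline_s):
    """Query every shard in parallel; keep whatever arrives before the deadline."""
    tasks = [
        asyncio.create_task(query_shard(i, d))
        for i, d in enumerate(shard_delays_s)
    ]
    done, pending = await asyncio.wait(tasks, timeout=deadline_s)
    for task in pending:
        task.cancel()  # stop waiting on stragglers
    results = [hit for task in done for hit in task.result()]
    harvest = len(done) / len(tasks)
    return results, harvest


# Nine fast shards and one straggler that would take 2 seconds.
delays = [0.01] * 9 + [2.0]
results, harvest = asyncio.run(scatter_gather(delays, deadline_s=0.1))
print(f"returned {len(results)} hits, harvest = {harvest:.0%}")
```

The caller gets a partial answer after roughly 100 ms instead of a complete one after 2 seconds. Returning the harvest alongside the results lets the UI decide whether to flag the response as incomplete.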


4. Summary: Choosing the Right Trade-off

| Scenario | Priority | Logic |
| --- | --- | --- |
| Bank Transfer | Consistency | PACELC: (C) Always. I’d rather wait 5 seconds than lose money. |
| Twitter Like | Latency | PACELC: (L) Always. No one cares if a “Like” takes 500 ms to show up. |
| Search Engine | Yield over Harvest | Ignore the slow nodes (Tail Latency) and give up a little Harvest to keep Yield high. |