BASE & Eventual Consistency: The Speed Trade-off

If ACID is the “Strict Bank Teller”, BASE is the “Social Media Feed”. It doesn’t matter if your friend sees your post 2 seconds later than you do. What matters is that the app loads instantly.

1. The Philosophy: Optimism

  • ACID (SQL): Pessimistic. “I will lock this row until I am 100% sure the data is safe everywhere.”
  • BASE (NoSQL): Optimistic. “I will accept this write immediately and figure out the synchronization details later.”

The Acronym

  • BA (Basically Available): The system guarantees availability. If a node goes down, the system still replies (possibly with stale data).
  • S (Soft State): The state of the system may change over time, even without new input (due to background syncing).
  • E (Eventual Consistency): If writes stop, all replicas will eventually agree on the same value.

2. Distributed Consistency Mechanics

How do we actually achieve this? We use Replication.

Quorums (N, R, W)

In a distributed system (like Cassandra or DynamoDB), we don’t just write to one hard drive. We write to N replicas.

  • N: Replication Factor (Total copies). Usually 3.
  • W: Write Quorum. How many nodes must confirm the write before we say “Success” to the user?
  • R: Read Quorum. How many nodes must we ask to get data?

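The Golden Rule: R + W > N. If this inequality holds, every read quorum overlaps every write quorum in at least one replica, so a read always reaches a copy of the latest acknowledged write: Strong Consistency. (Example: N = 3, W = 2, R = 2.) If R + W <= N, a read can land entirely on replicas that missed the write, so you fall back to Eventual Consistency and may read old data.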
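To see why the rule works, here is a minimal Python sketch. It checks the inequality and then brute-forces every possible read set and write set to confirm they overlap. The function names are made up for illustration and do not come from any database client.

```python
from itertools import combinations


def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """The Golden Rule: read and write quorums must overlap."""
    return r + w > n


def quorums_always_overlap(n: int, r: int, w: int) -> bool:
    """Brute force: does every R-node read set share a node with every W-node write set?"""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


for n, r, w in [(3, 2, 2), (3, 1, 1), (3, 1, 3), (5, 2, 3)]:
    print(n, r, w, is_strongly_consistent(n, r, w), quorums_always_overlap(n, r, w))
# 3 2 2 True True
# 3 1 1 False False
# 3 1 3 True True
# 5 2 3 False False
```

The two columns always agree: the inequality is just a shortcut for "the read set and the write set cannot miss each other".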

Anti-Entropy (Fixing the Mess)

Since nodes can be out of sync, how do they agree?

  1. Read Repair: When you read data, the DB asks multiple nodes. If Node A says “v1” and Node B says “v2” (newer), the DB returns “v2” and silently updates Node A.
  2. Hinted Handoff: If a node is down, the DB writes the data to a neighbor with a note: “Give this to Node A when it comes back online.”
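  3. Merkle Trees: Background processes build a hash tree over each replica’s key ranges and exchange the hashes. Comparing hashes from the root down pinpoints the ranges that differ, so only those ranges are transferred instead of the whole dataset.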
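Two of these mechanisms can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not how any particular database implements them: replicas are plain dicts, `Versioned.version` is a hypothetical integer where higher means newer, and the "Merkle" part is flattened to a single level of per-bucket digests (a real tree would also hash the digests upward so that identical replicas match with one root comparison).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Versioned:
    value: str
    version: int  # hypothetical: a higher number means a newer write


def read_with_repair(replicas: list, key: str) -> Optional[Versioned]:
    """Read from every replica, return the newest copy, and fix stale replicas."""
    answers = [replica.get(key) for replica in replicas]
    known = [answer for answer in answers if answer is not None]
    if not known:
        return None
    newest = max(known, key=lambda item: item.version)
    for replica, answer in zip(replicas, answers):
        if answer is None or answer.version < newest.version:
            replica[key] = newest  # the silent repair
    return newest


def bucket_of(key: str, buckets: int) -> int:
    """Assign a key to a bucket (a stand-in for a key range)."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % buckets


def bucket_digests(store: dict, buckets: int = 4) -> list:
    """Hash each bucket's contents; these play the role of Merkle tree leaves."""
    groups = [[] for _ in range(buckets)]
    for key in sorted(store):
        groups[bucket_of(key, buckets)].append(f"{key}={store[key]}")
    return [hashlib.sha256("|".join(group).encode()).hexdigest() for group in groups]


def differing_buckets(a: dict, b: dict, buckets: int = 4) -> list:
    """Return the bucket indexes whose digests disagree between two replicas."""
    pairs = zip(bucket_digests(a, buckets), bucket_digests(b, buckets))
    return [index for index, (left, right) in enumerate(pairs) if left != right]


# Read Repair: Node A is stale, Node B has the newer value.
node_a = {"cart": Versioned("v1", 1)}
node_b = {"cart": Versioned("v2", 2)}
latest = read_with_repair([node_a, node_b], "cart")
print(latest.value, node_a["cart"].value)  # v2 v2

# Digest comparison: only the bucket holding "k2" needs to be transferred.
store_a = {"k1": "x", "k2": "old", "k3": "z"}
store_b = {"k1": "x", "k2": "new", "k3": "z"}
print(differing_buckets(store_a, store_b) == [bucket_of("k2", 4)])  # True
```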

3. Interactive Demo: Quorum Configurator (N, R, W)

Visualize how tuning Read (R) and Write (W) quorums affects consistency.

  • Strong Consistency: When R + W > N, a read is guaranteed to see the latest write.
  • Eventual Consistency: When R + W <= N, you might read stale data.

4. Deep Dive: Conflict Resolution (Vector Clocks)

What happens if two users update the same data at the exact same time on different nodes? User A adds “Apple” to cart. User B adds “Banana” to cart (same account). Who wins?

A. Last Write Wins (LWW)

The lazy approach. We look at the timestamp.

  • User A: 12:00:01 PM
  • User B: 12:00:02 PM
  • Winner: User B.
  • Result: Cart has “Banana”. “Apple” is lost. Data Loss!
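The same scenario as a small Python sketch. The `Write` class and the numeric timestamps are illustrative assumptions. Real systems compare wall-clock timestamps from different machines, which adds clock-skew problems on top of the data loss shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    cart: frozenset
    timestamp: float  # assumed comparable across nodes, which real clocks rarely are


def last_write_wins(a: Write, b: Write) -> Write:
    """Keep whichever write has the later timestamp (ties go to b)."""
    return a if a.timestamp > b.timestamp else b


user_a = Write(frozenset({"Apple"}), timestamp=1.0)   # 12:00:01 PM
user_b = Write(frozenset({"Banana"}), timestamp=2.0)  # 12:00:02 PM

winner = last_write_wins(user_a, user_b)
print(sorted(winner.cart))  # ['Banana']
```

Nothing in `last_write_wins` looks at the contents of the losing write, which is exactly why “Apple” disappears.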

B. Vector Clocks (Amazon Dynamo Style)

The smart approach. We track causality, not just wall-clock time. Every piece of data carries a version history: [NodeA: 1, NodeB: 0].

The Shopping Cart Scenario:

  1. Initial: [A:0, B:0] Cart: {}
  2. Node A adds Apple: [A:1, B:0] Cart: {Apple}
  3. Node B adds Banana: [A:0, B:1] Cart: {Banana} (Note: Node B hasn’t seen Node A’s update yet).
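  4. Sync: The DB compares [1, 0] and [0, 1].
    • Is every counter in [1, 0] >= its counterpart in [0, 1]? No (B: 0 < 1). Is every counter in [0, 1] >= its counterpart in [1, 0]? No (A: 0 < 1).
    • Neither version descends from the other, so neither is “newer”. This is a Conflict.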
  5. Resolution: The DB saves both versions: {Apple} AND {Banana}.
  6. Client Repair: The next time the user reads the cart, the app gets both versions. It merges them to {Apple, Banana} and writes back [A:1, B:1].
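The comparison and merge steps of the scenario can be sketched in Python as follows. Clocks are plain dicts and the function names are invented for this example. It follows the simplified walkthrough above rather than any specific database’s implementation.

```python
def dominates(a: dict, b: dict) -> bool:
    """True if clock a has seen everything clock b has (a >= b for every node)."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify two vector clocks as equal, ordered, or concurrent."""
    a_ge, b_ge = dominates(a, b), dominates(b, a)
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a_newer"
    if b_ge:
        return "b_newer"
    return "concurrent"  # neither descends from the other: a conflict


def merge_clocks(a: dict, b: dict) -> dict:
    """Take the per-node maximum, so the result descends from both inputs."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


clock_a, cart_a = {"A": 1, "B": 0}, {"Apple"}
clock_b, cart_b = {"A": 0, "B": 1}, {"Banana"}

print(compare(clock_a, clock_b))  # concurrent

# Client repair: the app merges the sibling carts and writes back a merged clock.
merged_cart = cart_a | cart_b
merged_clock = merge_clocks(clock_a, clock_b)
print(sorted(merged_cart), sorted(merged_clock.items()))
# ['Apple', 'Banana'] [('A', 1), ('B', 1)]
```

Note that the vector clock only detects the conflict. The union of the two carts is an application-level decision that happens to suit shopping carts.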

5. Case Study: Social Media Likes

Let’s apply BASE to a real feature: Instagram Likes.

Requirement

  • Millions of users liking posts simultaneously.
  • Latency: Must be < 100ms.
  • Consistency: If I like a post, it’s okay if my friend in Japan sees the count update 5 seconds later.

Architecture

  1. Write Path:
    • User taps “Like”.
    • App sends request to closest Edge Server.
    • Server writes to local Redis/Cassandra node.
    • Returns “Success” immediately. (Basically Available).
  2. Background Sync:
    • The local node asynchronously pushes the update to other data centers. (Eventual Consistency).
  3. Read Path:
    • Users read the Like Count from their local replica.
    • The count might be 100 in US and 95 in EU. This is Soft State.

Conflict Resolution

What if two people like at the exact same millisecond?

  • Actually, “Likes” are Commutative.
  • Count = Count + 1.
  • Order doesn’t matter. 1 + 1 = 2.
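  • We use CRDTs (Conflict-free Replicated Data Types) to merge counters automatically: each replica tracks its own increments and merging takes the per-replica maximum, so no Vector Clock comparison or client-side repair is needed.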
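A grow-only counter (G-Counter) is one of the simplest CRDTs and fits the like count well. The sketch below is a minimal in-memory version with hypothetical replica names ("us", "eu"). It only counts upward, so supporting “unlike” would need a different type, such as a PN-Counter.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by per-replica maximum."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts = {}  # replica id -> number of increments seen from that replica

    def increment(self, amount: int = 1) -> None:
        # A replica only ever bumps its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # max() makes merging commutative, associative, and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


us, eu = GCounter("us"), GCounter("eu")
us.increment()  # a like lands in the US data center
eu.increment()  # another like lands in the EU at the same millisecond

us.merge(eu)
eu.merge(us)
us.merge(eu)  # merging again changes nothing
print(us.value(), eu.value())  # 2 2
```

Because each replica writes only to its own slot, two simultaneous likes never overwrite each other, and replicas can exchange state in any order or more than once.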

6. Summary

  • BASE prioritizes Availability over immediate Consistency.
  • Quorums (R + W > N) allow you to tune the trade-off.
  • Conflict Resolution: Use LWW for simplicity (accepting lost updates), Vector Clocks to detect concurrent updates that need an application-level merge (Shopping Carts).
  • Anti-Entropy: Merkle Trees and Read Repair keep nodes in sync eventually.