Consensus (Paxos & Raft)

In the previous chapter, we elected a Leader. Now, that Leader has a job: Replicate Data.

In a distributed system, how do we get multiple nodes to agree on a sequence of values?

  • Problem: Nodes crash. Packets vanish. The network partitions.
  • Goal: All honest nodes agree on the same log of commands (Safety) and eventually decide (Liveness).

1. The Trinity: Paxos vs Raft vs ZAB

While Paxos is the academic holy grail, Raft is the engineer’s choice.

| Feature | Paxos (1989) | Raft (2014) | ZAB (ZooKeeper) |
| --- | --- | --- | --- |
| Philosophy | Theoretical purity. No specific leader required. | Understandability. Strong Leader is mandatory. | Primary-Backup ordering (FIFO). |
| Complexity | Extreme (Multi-Paxos is hard to implement). | Moderate. Decomposed into Election & Log Replication. | High. |
| Use Case | Google Spanner, Cassandra (LWT). | Kubernetes (etcd), Consul, CockroachDB. | Kafka (old controller), Hadoop. |

[!TIP] Interview Strategy: Explain Raft. It has a linear narrative: Leader Election → Log Replication → Commit. Paxos is a mesh of proposers and acceptors that is easy to get wrong.


2. Raft Log Replication

Once a Leader is elected, it handles all Client requests. The goal: Replicate SET X=5 to a majority of followers.

The Flow Diagram

  1. Client → Leader: `SET X=5`.
  2. Leader appends to its own log: `[X=5]` (Uncommitted).
  3. Leader → Followers (Quorum): `AppendEntries` RPC; each Follower appends `[X=5]` (Uncommitted).
  4. Followers → Leader: Success (ACK).
  5. Once a majority has ACKed, the Leader marks `[X=5]` as COMMITTED.
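The flow above can be condensed into a few lines of code. This is a minimal sketch of the commit rule only, not a full Raft implementation: the class names and the simplified in-process "RPC" are illustrative, and real Raft also carries terms, log indices, and consistency checks in each `AppendEntries`.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.log = []          # replicated entries
        self.commit_index = 0  # highest committed entry (1-based)

class Leader(Node):
    def replicate(self, entry, followers):
        self.log.append(entry)                 # 1. append locally (uncommitted)
        acks = 1                               # the leader counts toward the quorum
        for f in followers:
            f.log.append(entry)                # 2. AppendEntries (simplified: always succeeds)
            acks += 1                          # 3. follower ACKs
        cluster_size = len(followers) + 1
        if acks > cluster_size // 2:           # 4. majority reached?
            self.commit_index = len(self.log)  #    -> the entry is committed
        return self.commit_index

leader = Leader("leader")
followers = [Node("f1"), Node("f2")]
leader.replicate("SET X=5", followers)
print(leader.commit_index)  # → 1
```

Note the asymmetry: followers append blindly here, but only the Leader advances the commit index, and only after counting a majority.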

3. Interactive Demo: The Replication Loop

Cyberpunk Mode: Visualize the commit flow.

  • Mission: Replicate SET X=99 to the cluster.
  • Obstacle: Network Partition.
  • Visuals: Watch the logs turn from Yellow (Uncommitted) to Green (Committed).

[!TIP] Try it yourself:

  1. Click “🚀 Send SET X=99”. Watch the Leader replicate to Followers, get ACKs, and Commit.
  2. Click “✂️ Network” to cut the connection.
  3. Try sending again. Watch the Leader retry endlessly because it can’t reach a Quorum.
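The partition scenario in the demo reduces to one predicate. The sketch below (the function name and `reachable` flags are illustrative) shows why the Leader retries forever: appending locally is always possible, but committing requires ACKs from a majority of the 3-node cluster.

```python
def try_commit(followers_reachable, cluster_size=3):
    """Return True if the entry can be committed: leader + reachable followers form a quorum."""
    acks = 1 + sum(followers_reachable)  # the leader always ACKs its own entry
    return acks > cluster_size // 2      # need 2 of 3

# Healthy cluster: both followers ACK.
print(try_commit([True, True]))    # → True
# Network cut: the entry stays uncommitted, and the leader retries.
print(try_commit([False, False]))  # → False
```

One reachable follower is still enough (2 of 3), which is exactly why a 3-node cluster tolerates one failure but not two.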

4. Deep Dive: Linearizability vs Serializability

This is one of the most commonly confused distinctions in distributed systems.

  • Linearizability (Strong Consistency): A real-time guarantee. If Operation A completes at time T, any Operation B that starts after T MUST see A. It makes the system behave like a Single Copy of Data. Raft provides this via Quorum reads.
  • Serializability (Isolation): A transaction guarantee. If Transactions A and B run concurrently, the result must be as if they ran one after the other (A then B, or B then A). It does not care about real-time.

[!TIP] Raft provides Linearizability. Databases (like PostgreSQL) provide Serializability. Spanner provides BOTH (Strict Serializability) using TrueTime.
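To make the "Leader checks with a quorum" part concrete, here is a hedged sketch of a ReadIndex-style linearizable read (simplified; the function and parameter names are illustrative): before answering, the leader records its commit index and confirms it is still leader by hearing from a quorum, which rules out serving stale data after being deposed.

```python
def linearizable_read(state, commit_index, heartbeat_acks, cluster_size):
    """Serve a read only if a quorum still recognizes this node as leader."""
    quorum = cluster_size // 2 + 1
    if heartbeat_acks < quorum:
        # A newer leader may exist with newer writes; refuse rather than lie.
        raise RuntimeError("not leader: cannot guarantee linearizability")
    # (Elided: wait until the state machine has applied up to commit_index.)
    return state

print(linearizable_read({"X": 5}, commit_index=1, heartbeat_acks=3, cluster_size=3))  # → {'X': 5}
```

Without that quorum round-trip, a partitioned ex-leader could happily answer reads with stale data, which would violate the real-time guarantee above.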

5. Summary

  • Leader orders the log.
  • Followers replicate the log.
  • Commit happens only when a Majority acknowledges.
  • Linearizability is guaranteed by reading from the Leader (who checks with a quorum).