Consensus (Paxos & Raft)

In the previous chapter, we elected a Leader. Now, that Leader has a job: Replicate Data.

In a distributed system, how do we get multiple nodes to agree on a sequence of values?

  • Problem: Nodes crash. Packets vanish. The network partitions.
  • Goal: All honest nodes agree on the same log of commands (Safety) and eventually decide (Liveness).

1. The Trinity: Paxos vs Raft vs ZAB

While Paxos is the academic holy grail, Raft is the engineer’s choice.

| Feature | Paxos (1989) | Raft (2014) | ZAB (ZooKeeper) |
| --- | --- | --- | --- |
| Philosophy | Theoretical purity. No specific leader required. | Understandability. Strong Leader is mandatory. | Primary-Backup ordering (FIFO). |
| Complexity | Extreme (Multi-Paxos is hard to implement). | Moderate. Decomposed into Election & Log Replication. | High. |
| Use Case | Google Spanner, Cassandra (LWT). | Kubernetes (etcd), Consul, CockroachDB. | Kafka (old controller), Hadoop. |

[!TIP] Interview Strategy: Explain Raft. It has a linear narrative: Leader Election → Log Replication → Commit. Paxos is a mesh of proposers and acceptors that is easy to get wrong.


2. Raft Log Replication

Once a Leader is elected, it handles all Client requests. The goal: Replicate SET X=5 to a majority of followers.

The Flow Diagram

  1. Client → Leader: `SET X=5`.
  2. Leader appends to its own log: `[X=5]` (Uncommitted).
  3. Leader → Followers (Quorum): `AppendEntries` RPC; each Follower appends `[X=5]` (Uncommitted).
  4. Followers → Leader: Success (ACK).
  5. Once a majority has ACKed, the Leader marks `[X=5]` as COMMITTED.
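The flow above can be condensed into a few lines of code. This is a minimal sketch of the commit rule only, not a full Raft implementation: the class names and the simplified in-process "RPC" are illustrative, and real Raft also carries terms, log indices, and consistency checks in each `AppendEntries`.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.log = []          # replicated entries
        self.commit_index = 0  # highest committed entry (1-based)

class Leader(Node):
    def replicate(self, entry, followers):
        self.log.append(entry)                 # 1. append locally (uncommitted)
        acks = 1                               # the leader counts toward the quorum
        for f in followers:
            f.log.append(entry)                # 2. AppendEntries (simplified: always succeeds)
            acks += 1                          # 3. follower ACKs
        cluster_size = len(followers) + 1
        if acks > cluster_size // 2:           # 4. majority reached?
            self.commit_index = len(self.log)  #    -> the entry is committed
        return self.commit_index

leader = Leader("leader")
followers = [Node("f1"), Node("f2")]
leader.replicate("SET X=5", followers)
print(leader.commit_index)  # → 1
```

Note the asymmetry: followers append blindly here, but only the Leader advances the commit index, and only after counting a majority.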

3. Interactive Demo: The Replication Loop

Cyberpunk Mode: Visualize the commit flow.

  • Mission: Replicate SET X=99 to the cluster.
  • Obstacle: Network Partition.
  • Visuals: Watch the logs turn from Yellow (Uncommitted) to Green (Committed).

[!TIP] Try it yourself:

  1. Click “🚀 Send SET X=99”. Watch the Leader replicate to Followers, get ACKs, and Commit.
  2. Click “✂️ Network” to cut the connection.
  3. Try sending again. Watch the Leader retry endlessly because it can’t reach a Quorum.
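The partition scenario in the demo reduces to one predicate. The sketch below (the function name and `reachable` flags are illustrative) shows why the Leader retries forever: appending locally is always possible, but committing requires ACKs from a majority of the 3-node cluster.

```python
def try_commit(followers_reachable, cluster_size=3):
    """Return True if the entry can be committed: leader + reachable followers form a quorum."""
    acks = 1 + sum(followers_reachable)  # the leader always ACKs its own entry
    return acks > cluster_size // 2      # need 2 of 3

# Healthy cluster: both followers ACK.
print(try_commit([True, True]))    # → True
# Network cut: the entry stays uncommitted, and the leader retries.
print(try_commit([False, False]))  # → False
```

One reachable follower is still enough (2 of 3), which is exactly why a 3-node cluster tolerates one failure but not two.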

4. Deep Dive: Linearizability vs Serializability

This is one of the most commonly confused distinctions in distributed systems.

  • Linearizability (Strong Consistency): A real-time guarantee. If Operation A completes at time T, any Operation B that starts after T MUST see A. It makes the system behave like a Single Copy of Data. Raft provides this via Quorum reads.
  • Serializability (Isolation): A transaction guarantee. If Transactions A and B run concurrently, the result must be as if they ran one after the other (A then B, or B then A). It does not care about real-time.

[!TIP] Raft provides Linearizability. Databases (like PostgreSQL) provide Serializability. Spanner provides BOTH (Strict Serializability) using TrueTime.
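To make the "Leader checks with a quorum" part concrete, here is a hedged sketch of a ReadIndex-style linearizable read (simplified; the function and parameter names are illustrative): before answering, the leader records its commit index and confirms it is still leader by hearing from a quorum, which rules out serving stale data after being deposed.

```python
def linearizable_read(state, commit_index, heartbeat_acks, cluster_size):
    """Serve a read only if a quorum still recognizes this node as leader."""
    quorum = cluster_size // 2 + 1
    if heartbeat_acks < quorum:
        # A newer leader may exist with newer writes; refuse rather than lie.
        raise RuntimeError("not leader: cannot guarantee linearizability")
    # (Elided: wait until the state machine has applied up to commit_index.)
    return state

print(linearizable_read({"X": 5}, commit_index=1, heartbeat_acks=3, cluster_size=3))  # → {'X': 5}
```

Without that quorum round-trip, a partitioned ex-leader could happily answer reads with stale data, which would violate the real-time guarantee above.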

5. Summary

  • Leader orders the log.
  • Followers replicate the log.
  • Commit happens only when a Majority acknowledges.
  • Linearizability is guaranteed by reading from the Leader (who checks with a quorum).