Etcd: The Single Source of Truth

[!WARNING] Back up Etcd! If you lose your Etcd data, you lose your entire cluster. The API Server is stateless. The Scheduler is stateless. Only Etcd holds the state.
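As a rough illustration of what taking a backup can look like, the sketch below builds and runs an `etcdctl snapshot save` command. The endpoint and certificate paths are assumptions (they match a typical kubeadm layout) and the helper names `snapshot_command` and `take_snapshot` are hypothetical; adjust everything to your own cluster.

```python
import subprocess
from typing import List


def snapshot_command(
    snapshot_path: str,
    endpoint: str = "https://127.0.0.1:2379",            # assumed local Etcd endpoint
    cacert: str = "/etc/kubernetes/pki/etcd/ca.crt",      # assumed kubeadm-style path
    cert: str = "/etc/kubernetes/pki/etcd/server.crt",    # assumed kubeadm-style path
    key: str = "/etc/kubernetes/pki/etcd/server.key",     # assumed kubeadm-style path
) -> List[str]:
    """Build the argument list for `etcdctl snapshot save`."""
    return [
        "etcdctl",
        f"--endpoints={endpoint}",
        f"--cacert={cacert}",
        f"--cert={cert}",
        f"--key={key}",
        "snapshot",
        "save",
        snapshot_path,
    ]


def take_snapshot(snapshot_path: str) -> None:
    """Run the backup; raises CalledProcessError if etcdctl exits non-zero."""
    subprocess.run(snapshot_command(snapshot_path), check=True)
```

Calling `take_snapshot("/var/backups/etcd-snapshot.db")` on a control-plane node would write a point-in-time copy of the data to that file (the path is only an example). Store the snapshot somewhere other than the Etcd machines themselves.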

1. What is Etcd?

Etcd is a distributed, consistent key-value store.

  • Distributed: It runs on multiple servers (nodes) for high availability.
  • Consistent: By default, every read returns the latest committed write (linearizable reads). It prioritizes Consistency over Availability (CP system in CAP theorem).
  • Key-Value: Data is stored as simple keys (like /registry/pods/default/mypod) and values (JSON/Protobuf).
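To make the key-value idea concrete, here is a minimal in-memory sketch. It is not the Etcd client API: `ToyKV` and `registry_key` are hypothetical names, and the key shape simply follows the `/registry/pods/default/mypod` example above. It shows that "listing the Pods in a namespace" amounts to fetching every key that shares a prefix.

```python
from typing import Dict, List, Optional, Tuple


def registry_key(resource: str, namespace: str, name: str) -> str:
    """Build a key shaped like /registry/pods/default/mypod."""
    return f"/registry/{resource}/{namespace}/{name}"


class ToyKV:
    """A flat key-value map; 'directories' are just shared key prefixes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def get_prefix(self, prefix: str) -> List[Tuple[str, bytes]]:
        """Return every (key, value) whose key starts with prefix, in key order."""
        return [(k, self._data[k]) for k in sorted(self._data) if k.startswith(prefix)]


kv = ToyKV()
kv.put(registry_key("pods", "default", "mypod"), b'{"kind": "Pod"}')
kv.put(registry_key("pods", "default", "other"), b'{"kind": "Pod"}')
kv.put(registry_key("pods", "kube-system", "dns"), b'{"kind": "Pod"}')
kv.put(registry_key("services", "default", "web"), b'{"kind": "Service"}')

# Listing the Pods in the "default" namespace is a prefix query.
default_pods = [key for key, _ in kv.get_prefix("/registry/pods/default/")]
print(default_pods)
# ['/registry/pods/default/mypod', '/registry/pods/default/other']
```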

2. Why Etcd?

Why not MySQL or Postgres?

  1. Watch Mechanism: Kubernetes controllers rely on being notified immediately when a change happens. Etcd is designed for this “watch” pattern.
  2. Simple Data Model: Kubernetes resources (Pods, Services) are self-contained documents organized by type, namespace, and name. This maps naturally to a key-value store whose keys share path-like prefixes, which behaves like a directory structure.
  3. Speed: Optimized for fast writes and very fast reads of small keys.
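The "watch" pattern from the first point can be sketched with a toy in-memory store. This is an illustration of the idea only, not how Etcd implements it and not its client API: `WatchableStore` is a hypothetical class in which each write bumps a revision counter and every watcher registered on a matching key prefix is called with the change.

```python
from typing import Callable, Dict, List, Tuple

Event = Tuple[int, str, bytes]  # (revision, key, new value)


class WatchableStore:
    """Toy store: every put notifies the watchers of matching prefixes."""

    def __init__(self) -> None:
        self.revision = 0
        self._data: Dict[str, bytes] = {}
        self._watchers: List[Tuple[str, Callable[[Event], None]]] = []

    def watch(self, prefix: str, callback: Callable[[Event], None]) -> None:
        """Register a callback for all future changes under prefix."""
        self._watchers.append((prefix, callback))

    def put(self, key: str, value: bytes) -> int:
        """Store the value, bump the revision, and notify interested watchers."""
        self.revision += 1
        self._data[key] = value
        event = (self.revision, key, value)
        for prefix, callback in self._watchers:
            if key.startswith(prefix):
                callback(event)
        return self.revision


store = WatchableStore()
seen: List[Event] = []
store.watch("/registry/pods/", seen.append)  # a "controller" interested in Pods

store.put("/registry/pods/default/mypod", b"Pending")
store.put("/registry/services/default/web", b"ClusterIP")  # not under the watched prefix
store.put("/registry/pods/default/mypod", b"Running")

print(seen)
# [(1, '/registry/pods/default/mypod', b'Pending'), (3, '/registry/pods/default/mypod', b'Running')]
```

The watcher never polls: it is told about both Pod changes and never hears about the Service write.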

3. How Etcd Works: The Raft Consensus Algorithm

In a distributed system, how do you ensure all nodes agree on the data? Etcd uses Raft.

The Problem: Split Brain

Imagine you have 3 Etcd nodes. If the network splits, you don’t want two different leaders accepting conflicting writes.

The Solution: Quorum

Raft requires a Quorum (majority) to commit a write.

  • 3 Nodes: Need 2 to agree. Can survive 1 failure.
  • 5 Nodes: Need 3 to agree. Can survive 2 failures.
  • Even Numbers?: Bad idea. 4 nodes need 3 to agree, so they survive only 1 failure, the same as 3 nodes. And if the network splits 2 vs 2, neither side has a majority (3), so the cluster stops accepting writes.
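The arithmetic behind these numbers is small enough to check directly. The sketch below uses hypothetical helper names (`quorum`, `fault_tolerance`, `sides_with_quorum`) and only encodes the majority rule described above.

```python
from typing import List


def quorum(members: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many nodes can fail while a majority remains."""
    return members - quorum(members)


def sides_with_quorum(members: int, partition: List[int]) -> List[bool]:
    """For a network split into groups of the given sizes, say which groups can still commit."""
    if sum(partition) != members:
        raise ValueError("partition sizes must add up to the cluster size")
    return [size >= quorum(members) for size in partition]


for n in range(1, 8):
    print(f"{n} nodes: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")

print(sides_with_quorum(3, [2, 1]))  # [True, False]: the side with 2 nodes keeps working
print(sides_with_quorum(4, [2, 2]))  # [False, False]: nobody can commit
```

Going from 3 to 4 nodes raises the quorum from 2 to 3 without raising the number of failures you can survive, which is why odd cluster sizes are preferred.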

Leader Election

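One node is elected Leader. All writes MUST go through the Leader; a Follower that receives a write forwards it to the Leader. The Leader replicates the data to the Followers. Once a majority of nodes (counting the Leader itself) acknowledges the write, the Leader commits it.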
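The replicate-then-commit step can be sketched as a toy simulation. This is a deliberately simplified model, not real Raft: there are no terms, elections, retries, or log repair, and the `Node` and `Leader` classes are hypothetical. It only shows that a write commits when a majority of the cluster has stored it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    reachable: bool = True
    log: List[str] = field(default_factory=list)

    def append(self, entry: str) -> bool:
        """Store the entry and acknowledge it, unless this node is unreachable."""
        if not self.reachable:
            return False
        self.log.append(entry)
        return True


@dataclass
class Leader:
    node: Node
    followers: List[Node]
    committed: List[str] = field(default_factory=list)

    def write(self, entry: str) -> bool:
        """Replicate the entry; commit only if a majority of the cluster has it."""
        cluster_size = 1 + len(self.followers)
        majority = cluster_size // 2 + 1
        self.node.log.append(entry)  # the Leader's own copy counts as one ack
        acks = 1 + sum(follower.append(entry) for follower in self.followers)
        if acks >= majority:
            self.committed.append(entry)
            return True
        return False  # stays uncommitted in the Leader's log


node2, node3 = Node("node2"), Node("node3")
leader = Leader(Node("node1"), [node2, node3])

print(leader.write("x=1"))  # True: 3 of 3 nodes have the entry
node3.reachable = False
print(leader.write("x=2"))  # True: 2 of 3 is still a majority
node2.reachable = False
print(leader.write("x=3"))  # False: 1 of 3 is not a majority
print(leader.committed)     # ['x=1', 'x=2']
```

In the last step the entry sits in the Leader's log but is never committed, which matches the rule that a cluster without quorum stops accepting writes.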


4. Interactive: Raft Consensus Visualizer

[Interactive widget: a simulated 3-node Etcd cluster that shows how writes are replicated. Initial state: Node 1 is the Leader, the other two nodes are Followers, every log is empty, and the cluster is healthy.]

5. Summary

Etcd is the only stateful part of the Control Plane. It uses the Raft algorithm to keep your cluster configuration safe and consistent, and to keep it available when a machine fails, as long as a majority of Etcd nodes is still running.

In the next chapter, we explore the Declarative Model, which relies entirely on Etcd’s ability to store the “Desired State”.