Shards, Replicas, and The Cluster

[!NOTE] This module explores the core principles of Shards, Replicas, and The Cluster, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. The Hook: Infinite Scalability?

Relational DBs traditionally scale Vertically (a bigger machine). Elasticsearch scales Horizontally (more machines). How? By chopping your data into pieces called Shards.


2. The Hierarchy

  1. Cluster: The collection of all servers.
  2. Node: A single server (JVM process).
  3. Index: A logical namespace (e.g., users).
  4. Shard: A self-contained Search Engine (a full Lucene index).
  5. Segment: An immutable file on disk holding the Inverted Index.

Key Insight: An “Index” is just a logical grouping. The Shard is where the work happens.


3. Shards & Replicas

Primary Shards (The Scale)

  • You split users into 5 Primary Shards (P0, P1, P2, P3, P4).
  • Each shard holds 20% of the data.
  • You can put each shard on a different node.
  • Result: 5x Parallelism.
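The split above works because every document is deterministically routed to exactly one primary. A minimal sketch of the routing rule (Elasticsearch actually uses a murmur3 hash of the document's `_routing` value, which defaults to `_id`; Python's `zlib.crc32` stands in here as a deterministic hash):

```python
# Sketch of Elasticsearch's shard-routing rule. The real implementation
# hashes the _routing value with murmur3; crc32 is a stand-in.
import zlib

NUM_PRIMARIES = 5  # fixed at index creation


def route(doc_id: str) -> int:
    """Map a document ID to one of the primary shards P0..P4."""
    return zlib.crc32(doc_id.encode()) % NUM_PRIMARIES


# The same ID always routes to the same shard. This is exactly why the
# primary count cannot change later: changing the modulus would send
# existing documents to the wrong shard.
shard_for_user_42 = route("user-42")
```

Because the formula depends on `NUM_PRIMARIES`, resharding means reindexing: every stored document would have to be re-routed.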

Replica Shards (The Safety)

  • Each Primary gets 1 Replica (R0, R1 …).
  • R0 is an exact copy of P0.
  • Purpose:
    1. High Availability: If Node 1 (holding P0) dies, Node 2 (holding R0) promotes R0 → P0. Zero downtime.
    2. Read Throughput: A search can hit P0 OR R0, so adding one replica per primary roughly doubles read capacity.
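In the REST API, this layout is declared once at index creation. A sketch of the settings body you would send with `PUT /users` (of these, `number_of_shards` is the one that cannot be changed later; `number_of_replicas` can be adjusted at any time):

```python
# Request body for `PUT /users`: 5 primaries (P0..P4), 1 replica each.
create_users_index = {
    "settings": {
        "number_of_shards": 5,    # primaries -- fixed forever
        "number_of_replicas": 1,  # copies per primary -- adjustable live
    }
}

# Total shards the cluster must place: primaries * (1 + replicas)
total_shards = 5 * (1 + 1)  # 10 shards spread across the nodes
```

Note that 5 primaries with 1 replica each means the cluster is allocating and balancing 10 shards, not 5.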

4. Interactive: Cluster Visualizer

Scenario: A 3-Node Cluster with Index logs (3 Primaries, 1 Replica each). Kill a node to see Self-Healing. Add a node to see Rebalancing.

Status: GREEN | Unassigned Shards: 0
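The visualizer's self-healing behaviour can be approximated with a toy model (the names and allocation logic here are simplified illustrations, not the real allocator):

```python
# Toy cluster: each node holds (shard_number, is_primary) tuples.
# Kill a node and watch an orphaned primary's replica get promoted.
from copy import deepcopy


def cluster_health(assignments, num_primaries):
    """GREEN: every primary and replica assigned. YELLOW: all primaries
    alive but replicas missing. RED: a primary is lost entirely."""
    live = [s for shards in assignments.values() for s in shards]
    primaries = {n for (n, is_primary) in live if is_primary}
    replicas = {n for (n, is_primary) in live if not is_primary}
    if len(primaries) < num_primaries:
        return "RED"
    return "GREEN" if len(replicas) == num_primaries else "YELLOW"


def kill_node(assignments, node):
    """Remove a node; promote a surviving replica for each lost primary."""
    assignments = deepcopy(assignments)
    lost = assignments.pop(node)
    for shard_num, was_primary in lost:
        if was_primary:
            # Find the replica copy on a surviving node and promote it.
            for shards in assignments.values():
                for i, (n, is_primary) in enumerate(shards):
                    if n == shard_num and not is_primary:
                        shards[i] = (n, True)
                        break
    return assignments


# 3 nodes, 3 primaries (P0..P2), 1 replica each (R0..R2).
cluster = {
    "node-1": [(0, True), (1, False)],
    "node-2": [(1, True), (2, False)],
    "node-3": [(2, True), (0, False)],
}
```

Killing `node-1` loses P0 and R1: R0 on `node-3` is promoted to primary, so every primary survives, but the missing replica leaves the cluster YELLOW. Killing a second node loses a primary with no surviving copy: RED.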

5. The Hardware Reality

A Shard is not an abstract concept. It is a directory on disk. Inside that directory are Lucene Segments.

  • Segment: A mini-inverted index.
  • Search: To search a Shard, we query every segment and merge the results.
  • Merge: Background processes constantly fold small segments into bigger ones (fewer files to open, less per-segment overhead on every search).

Latency Tip: Too many small segments = Slow Search. Force Merge is your friend (but only for cold, read-only data).
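The per-segment search and the background merge can be sketched with plain dicts standing in for Lucene's on-disk structures (a simplified illustration, not Lucene's actual format):

```python
# Each segment is a mini inverted index: term -> sorted list of doc IDs.
segments = [
    {"error": [1, 4], "timeout": [4]},     # oldest flush
    {"error": [7], "disk": [9]},           # newer flush
    {"error": [12], "timeout": [10, 12]},  # newest flush
]


def search_shard(segs, term):
    """Shard-level search: query every segment, merge the postings."""
    return sorted(doc for seg in segs for doc in seg.get(term, []))


def merge_segments(segs):
    """Background merge: fold many small segments into one big one."""
    merged = {}
    for seg in segs:
        for term, docs in seg.items():
            merged.setdefault(term, []).extend(docs)
    return {term: sorted(docs) for term, docs in merged.items()}


# After merging, the same query touches one structure instead of three --
# the same results with far fewer files to open per search.
one_segment = merge_segments(segments)
```

This is why segment count drives latency: every query pays a fixed cost per segment, and merging (or force merging cold data) collapses that cost.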