The Life of a Write: From RAM to Disk

[!NOTE] This module explores the core principles of The Life of a Write: From RAM to Disk, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. The Hook: “Near Real-Time” (NRT)

You write a document. You search for it immediately. It’s not there. One second later, it appears. Why? Disk I/O is expensive, so Elasticsearch cheats: it makes new documents searchable in periodic batches rather than on every write.


The Restaurant Analogy (Mnemonic)

Think of Elasticsearch as a busy restaurant kitchen:

  • Memory Buffer (The Order Pad): The waiter scribbles down your order. It’s fast, but if the waiter drops the pad (server crash), your order is lost. It’s also not yet being cooked (not searchable).
  • Translog (The Carbon Copy): As soon as the waiter writes your order, a carbon copy is instantly sent to the manager’s safe. If the waiter drops their pad, the manager can recover the order from the safe.
  • Refresh (The Kitchen Prep): Every 1 second, the kitchen takes all orders from the pad, chops the veggies, and puts them in the pan (Lucene Segment). Now the food is actually cooking (searchable).
  • Flush (Serving the Dish): Every 30 minutes, the cooked food is finally served to the customer (fsync to disk), and the carbon copies in the manager’s safe are thrown away (Translog cleared).

2. The Write Path (Step-by-Step)

Step 1: The Memory Buffer

When you POST /index/_doc/1, the document is written to the In-Memory Buffer.

  • It is NOT yet searchable.
  • It is NOT yet safe (if power fails, it’s gone).

Step 2: The Translog (Safety)

Simultaneously, the document is appended to the Translog (Transaction Log) on disk.

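  • fsync: By default (index.translog.durability set to request), the Translog is fsynced before each write request is acknowledged. Only when durability is set to async is it fsynced on a timer instead, every 5 seconds by default (index.translog.sync_interval).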
  • Purpose: Recovery after crash.
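
The append-then-fsync idea behind a transaction log can be sketched in a few lines of Python. This is a minimal illustration, not Elasticsearch’s actual on-disk format: ToyTranslog, replay, demo_translog, and the JSON-lines layout are all hypothetical names chosen for the example.

```python
import json
import os
import tempfile


class ToyTranslog:
    """Append-only operation log (hypothetical; not Elasticsearch's real format)."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "a", encoding="utf-8")

    def append(self, doc_id, source, durable=True):
        # One JSON operation per line, appended sequentially.
        self._fh.write(json.dumps({"id": doc_id, "source": source}) + "\n")
        if durable:
            self.sync()

    def sync(self):
        # flush() hands Python's buffered data to the OS;
        # os.fsync() asks the OS to push it to the storage device.
        self._fh.flush()
        os.fsync(self._fh.fileno())

    def close(self):
        self._fh.close()


def replay(path):
    """Read the logged operations back, as a recovery would after a crash."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def demo_translog():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "translog.jsonl")
        log = ToyTranslog(path)
        log.append("1", {"title": "hello"})
        log.append("2", {"title": "world"})
        log.close()  # pretend the process died here and the memory buffer is gone
        return replay(path)
```

Calling append with durable=True loosely mirrors syncing on every request, while durable=False followed by a periodic sync() mirrors the timer-based mode: faster, but operations written since the last sync can be lost.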

Step 3: Refresh (Searchability)

Every 1 second (default), the Memory Buffer is cleared and written to a new Lucene Segment in the Filesystem Cache.

  • Now it is searchable.
  • Creating a segment is cheaper than fsync, so we do it often.
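
Why a document is invisible until the next refresh can be shown with a toy model. ToyShard is a hypothetical class for illustration only: real segments are Lucene inverted indexes, not tuples, and the substring-free word match below is a stand-in for real query execution.

```python
class ToyShard:
    """Toy model of refresh: search sees segments, never the buffer."""

    def __init__(self):
        self.buffer = []    # indexed, but not yet searchable
        self.segments = []  # immutable batches of documents, searchable

    def index(self, doc_id, text):
        self.buffer.append((doc_id, text))

    def refresh(self):
        # Turn the current buffer into a new immutable segment, then empty it.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def search(self, term):
        # Only segments are consulted; buffered documents are ignored.
        return [
            doc_id
            for segment in self.segments
            for doc_id, text in segment
            if term in text.split()
        ]
```

Indexing a document and searching straight away returns nothing; the same search after refresh() finds it, which is the one-second gap described in the hook.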

Step 4: Flush (Persistence)

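Every 30 minutes (or when the Translog reaches its size threshold, index.translog.flush_threshold_size), a Flush happens. The 30-minute timer comes from older releases; recent Elasticsearch versions trigger flushes mainly by Translog size.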

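  1. All data in the Filesystem Cache is fsynced to disk (a Lucene commit).
  2. The Translog is cleared, because the operations it recorded are now durable in the committed segments.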
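
The four steps can be tied together in a small simulation that shows what survives a crash. ToyEngine and its fields are hypothetical; plain Python lists stand in for RAM, the filesystem cache, and disk, and the Translog is assumed to be fsynced on every write.

```python
class ToyEngine:
    """Toy write path: buffer -> cached segment -> durable segment, plus a translog."""

    def __init__(self):
        self.buffer = []            # RAM: lost on crash
        self.cached_segments = []   # filesystem cache: searchable, lost on power failure
        self.durable_segments = []  # disk: survives a crash
        self.translog = []          # disk: survives a crash (assumed fsynced per write)

    def index(self, doc):
        # Steps 1 and 2: write to the buffer and append to the translog.
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Step 3: the buffer becomes a new searchable segment in the cache.
        if self.buffer:
            self.cached_segments.append(list(self.buffer))
            self.buffer = []

    def flush(self):
        # Step 4: persist cached segments, then clear the translog.
        self.refresh()
        self.durable_segments.extend(self.cached_segments)
        self.cached_segments = []
        self.translog = []

    def crash_and_recover(self):
        # Volatile state is dropped; unflushed operations are replayed from the translog.
        self.buffer = list(self.translog)
        self.cached_segments = []
        self.refresh()

    def searchable(self):
        segments = self.durable_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]
```

In this model, a document indexed after the last flush survives a crash only because it is still in the translog, which is why the translog can be cleared safely only once the flush has completed.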

3. Visualizing the Path

[Interactive diagram: Client, Memory Buffer, Segment (In Cache), Translog (Disk)]

4. Tuning for Performance

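  • Heavy Indexing? Increase refresh_interval from 1s to 30s. New documents can take up to 30s to become searchable, but indexing throughput improves because fewer, larger segments are created and less merging is needed.
  • Faster Writes? Change index.translog.durability from request (the default) to async. Writes no longer wait for an fsync, but a crash can lose up to 5s of acknowledged data. If data safety matters more than speed, keep the default.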
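
As a rough sketch, both settings can be changed through the index settings API (PUT /<index>/_settings). The helpers below only build the request and do not send it; bulk_load_settings, settings_request, and the index name "logs" are hypothetical names for this example.

```python
import json


def bulk_load_settings(refresh_interval="30s", async_translog=False):
    """Build an index settings body tuned for heavy indexing."""
    settings = {"index": {"refresh_interval": refresh_interval}}
    if async_translog:
        # Trades durability for speed: the translog is then fsynced on a timer
        # (5s by default) instead of before each request is acknowledged.
        settings["index"]["translog"] = {"durability": "async"}
    return settings


def settings_request(index_name, settings):
    """Return (method, path, body) for the update-settings call."""
    return "PUT", f"/{index_name}/_settings", json.dumps(settings)


method, path, body = settings_request("logs", bulk_load_settings(async_translog=True))
```

Once a bulk load finishes, sending the same request with the original values (for example a refresh_interval of 1s and a durability of request) restores near real-time search and per-request durability.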