
“There are only two hard things in Computer Science…”
…Naming things and Cache Invalidation.
Most tutorials teach you basic Cache-Aside:
- Read Miss? Fetch from DB → Write to Cache.
- Write? Write to DB → Delete from Cache.
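As a minimal sketch of those two rules, assuming plain Python dicts stand in for the cache and the database (the `CacheAside` class and its method names are illustrative, not from any particular library):

```python
class CacheAside:
    """Cache-aside over two dict-like stores (stand-ins for a real cache and DB)."""

    def __init__(self, cache, db):
        self.cache = cache
        self.db = db

    def read(self, key):
        value = self.cache.get(key)
        if value is None:              # read miss
            value = self.db[key]       # fetch from DB
            self.cache[key] = value    # write to cache
        return value

    def write(self, key, value):
        self.db[key] = value           # write to DB first
        self.cache.pop(key, None)      # then delete from cache


db = {"user:101": "v1"}
cache = {}
store = CacheAside(cache, db)

first = store.read("user:101")     # miss: loads from the DB and fills the cache
store.write("user:101", "v2")      # updates the DB, deletes the cached entry
second = store.read("user:101")    # miss again: picks up the new value
```

Run sequentially like this, the pattern behaves correctly. The problems only appear when reads and writes interleave.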
At scale, this simple logic falls apart. It introduces Race Conditions that can leave your cache permanently inconsistent with your database, and Thundering Herds that can take down your DB when keys expire.
1. The Invalidation Race Condition
Imagine two processes, Reader A and Writer B, operating on the same key user:101.
- Reader A reads DB: v1 (Cache Miss).
- Writer B updates DB to v2.
- Writer B deletes Cache (Invalidate).
- Reader A writes v1 to Cache (stale set).
Now your cache has v1 (stale) and your DB has v2 (fresh). Because Reader A’s network call took longer, its write landed after Writer B’s delete, and the cache stays poisoned until the TTL expires, or indefinitely if the key has no TTL.
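The interleaving can be replayed deterministically in a few lines. This is only an illustration, with dicts standing in for the cache and DB and the four steps executed by hand in the problematic order:

```python
key = "user:101"
db = {key: "v1"}
cache = {}

# 1. Reader A misses the cache and reads the DB.
assert key not in cache
reader_a_value = db[key]           # Reader A now holds "v1"

# 2. Writer B updates the DB.
db[key] = "v2"

# 3. Writer B invalidates the cache (a no-op: nothing is cached yet).
cache.pop(key, None)

# 4. Reader A, delayed, finally writes what it read in step 1.
cache[key] = reader_a_value

stale = cache[key] != db[key]
print(cache[key], db[key], stale)  # prints: v1 v2 True
```

Note that Writer B did everything right. The delete simply ran before the stale value arrived, so there was nothing left to invalidate it.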
The Fix: Leases (e.g., Facebook’s Memcached)
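A Lease is a 64-bit token given to a client when it experiences a cache miss. The client must present the token on set, and the cache rejects the set if the key was invalidated after the token was issued, so a value read before a write can never land after that write's invalidation.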
- Reader A gets Miss + Lease 1 from Cache.
- Writer B updates DB + Invalidates Cache (this cancels Lease 1).
- Reader A tries to SET with Lease 1.
- Cache rejects it (Lease 1 is invalid).
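The four steps above can be sketched with a toy in-memory cache. This is an assumption-laden illustration of the idea, not Facebook's implementation: `LeaseCache`, `set_with_lease`, and the simple counter used for tokens are all hypothetical names and choices.

```python
import itertools


class LeaseCache:
    """Toy cache that hands out a lease token on every miss."""

    def __init__(self):
        self._data = {}
        self._leases = {}                  # key -> currently valid lease token
        self._tokens = itertools.count(1)  # stand-in for a 64-bit token source

    def get(self, key):
        """Return (value, None) on a hit, or (None, lease_token) on a miss."""
        if key in self._data:
            return self._data[key], None
        token = next(self._tokens)
        self._leases[key] = token          # a newer miss replaces any older lease
        return None, token

    def set_with_lease(self, key, value, token):
        """Store value only if token is still the valid lease for key."""
        if self._leases.get(key) != token:
            return False                   # lease was cancelled or superseded
        del self._leases[key]
        self._data[key] = value
        return True

    def delete(self, key):
        """Invalidate the key and cancel any outstanding lease on it."""
        self._data.pop(key, None)
        self._leases.pop(key, None)


db = {"user:101": "v1"}
cache = LeaseCache()

_, lease = cache.get("user:101")   # Reader A: miss + lease
value_a = db["user:101"]           # Reader A reads v1 from the DB
db["user:101"] = "v2"              # Writer B updates the DB
cache.delete("user:101")           # Writer B invalidates, cancelling the lease
accepted = cache.set_with_lease("user:101", value_a, lease)  # rejected
```

The stale v1 never reaches the cache. The next reader misses, gets a fresh lease, and loads v2.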
2. Thundering Herds & Stampedes
The second major issue Leases solve is the Thundering Herd.
Without leases, if a hot key expires, 10,000 requests all miss and all hit the DB. With Leases, the cache hands the lease token to the first requester only and returns a “Hot Miss” (or a special token) to everyone else.
- Requester 1: Gets the Lease. Goes to DB.
- Requesters 2-10,000: See that a lease is active. They wait briefly and retry, or use the Stale Value (if configured), until Requester 1 fills the cache.
This reduces DB load from 10,000 queries to 1 query.
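A single-threaded simulation makes the arithmetic concrete. The sketch below assumes a hypothetical `HerdCache` whose `get` returns one of three statuses. A real system would also need lease expiry, so that a crashed lease holder does not block the key forever.

```python
class HerdCache:
    """Toy cache that lets only one requester per key go to the DB."""

    def __init__(self):
        self._data = {}
        self._leased = set()       # keys with a fill currently in progress

    def get(self, key):
        """Return ("hit", value), ("lease", None) or ("hot_miss", None)."""
        if key in self._data:
            return "hit", self._data[key]
        if key in self._leased:
            return "hot_miss", None    # someone else is already filling
        self._leased.add(key)
        return "lease", None           # caller is responsible for the fill

    def fill(self, key, value):
        self._leased.discard(key)
        self._data[key] = value


db_calls = []                      # records every query that reaches the "DB"

def query_db(key):
    db_calls.append(key)
    return "v2"


cache = HerdCache()
key = "user:101"
waiting = []

for requester in range(10_000):    # the hot key has just expired
    status, _ = cache.get(key)
    if status == "lease":
        leader = requester         # only this requester may query the DB
    else:
        waiting.append(requester)  # hot miss: back off and retry later

cache.fill(key, query_db(key))     # the leader fills the cache
served = [cache.get(key) for _ in waiting]  # the retries are all hits

print(len(db_calls), len(waiting))  # prints: 1 9999
```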
3. Reliable Invalidation: Change Data Capture (CDC)
Relying on your application code (e.g., after_save hooks) to invalidate cache is brittle. What if the server crashes after the DB write but before the Cache delete?
The Transaction Log Pattern
Leading companies (Airbnb, Facebook, DoorDash) decouple invalidation using the Database Transaction Log (WAL).
- App writes to DB (Commit).
- DB writes to Replication Log (Binlog/WAL).
- CDC Service (Debezium/Maxwell) tails the log.
- CDC Service sees the change and invalidates the cache.
This guarantees that if the data is in the DB, the cache WILL be invalidated, eventually.
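The consumer side of that pipeline can be sketched as a loop over change events. The event shape here (`table`, `pk`, `op`) is a simplified assumption, not the actual Debezium or Maxwell envelope, and `invalidate_from_log` is a hypothetical helper. A Python list stands in for the log.

```python
def invalidate_from_log(events, cache, start_offset=0):
    """Apply change events in log order; return the offset to resume from."""
    offset = start_offset
    for event in events[start_offset:]:
        key = f"{event['table']}:{event['pk']}"
        cache.pop(key, None)   # deleting a missing key is a harmless no-op
        offset += 1            # advance only after the invalidation ran
    return offset


log = [
    {"table": "user", "pk": 101, "op": "update"},
    {"table": "user", "pk": 102, "op": "delete"},
]
cache = {"user:101": "v1", "user:102": "v1", "user:103": "v1"}

next_offset = invalidate_from_log(log, cache)

# After a crash the consumer restarts from its last saved offset. Replaying
# events it already handled is safe because invalidation is idempotent.
replayed_offset = invalidate_from_log(log, cache, start_offset=0)
```

Because the offset only advances after the delete has run, a crash can cause an event to be processed twice but never skipped, which is where the "eventually" in the guarantee comes from.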
Summary of Patterns
| Pattern | Pro | Con | Use Case |
|---|---|---|---|
| Simple Cache-Aside | Easy to implement | Race conditions, Thundering Herds | Low scale, non-critical data |
| Leases (Memcached) | Solves Races & Herds | Client complexity | High-scale read-heavy (Facebook) |
| CDC Invalidation | Reliable consistency | Infrastructure complexity (Kafka/Debezium) | Mission-critical consistency (Airbnb) |