Distributed Locking: The Race
In the monolithic age, a simple Mutex (synchronized) kept order. But in a distributed system, with 50 microservices competing for the same resource, in-process locks no longer help: each one only protects the memory of its own process.
You need a Distributed Lock. A traffic light for your cluster.
1. The Anomaly: Double Booking
Imagine a Ticketmaster clone.
- User A checks: “Seat 1A Available?” → YES.
- User B checks: “Seat 1A Available?” → YES.
- User A books Seat 1A.
- User B books Seat 1A.
Result: Collision. Data Corruption. Angry Users.
The Solution: Mutual Exclusion.
- User A acquires Lock for seat_1A.
- User B tries to acquire Lock → FAILS (Wait).
- User A books seat → Releases Lock.
- User B acquires Lock → Checks Seat → “Sold Out”.
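The lock-then-check sequence above can be sketched inside a single process. This is an illustration only: threading.Lock stands in for a distributed lock service, and SeatMap, book, and run_demo are hypothetical names rather than part of any real booking system.

```python
import threading


class SeatMap:
    """In-memory stand-in for the booking database (hypothetical)."""

    def __init__(self):
        self._owner = {}                 # seat -> user who booked it
        self._locks = {}                 # seat -> lock (stand-in for a lock service)
        self._guard = threading.Lock()   # protects the _locks dict itself

    def _lock_for(self, seat):
        with self._guard:
            return self._locks.setdefault(seat, threading.Lock())

    def book(self, seat, user):
        # The check and the write both happen while holding the seat's lock,
        # so two users can never both see "available".
        with self._lock_for(seat):
            if seat in self._owner:
                return False             # "Sold Out"
            self._owner[seat] = user
            return True

    def owner(self, seat):
        return self._owner.get(seat)


def run_demo():
    """User A and User B race for seat_1A; exactly one should win."""
    seats = SeatMap()
    results = {}

    def attempt(user):
        results[user] = seats.book("seat_1A", user)

    threads = [threading.Thread(target=attempt, args=(u,)) for u in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seats, results
```

Whichever thread gets the lock first books the seat; the other finds it taken once it gets its turn. Across machines the same shape applies, but the lock has to live in a shared service rather than in process memory.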
2. Efficiency vs Correctness
Before you implement a lock, you must ask: “What happens if the lock fails?”
| Goal | Description | Consequence of Failure | Solution |
|---|---|---|---|
| Efficiency | Prevent doing the same work twice (e.g., sending email). | Minor annoyance (User gets 2 emails). | Redis (Redlock) |
| Correctness | Prevent data corruption (e.g., money transfer). | Catastrophic (Money lost). | Fencing Tokens (ZooKeeper/Etcd) |
[!WARNING] Redis is for Efficiency. If you need absolute safety (Correctness), do not rely solely on Redis. Use a consensus system like ZooKeeper or Etcd because Redis (even Redlock) makes assumptions about system clocks.
3. The Tool: Redis SETNX
The simplest distributed lock is a single atomic command in Redis.
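- Command: SET resource_name my_random_value NX PX 30000
- NX: Not Exists (only set the key if it doesn’t already exist).
- PX 30000: Expiry in milliseconds (auto-delete after 30s).
- my_random_value: A value unique to this client, so that only the owner can release the lock later.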
Why the TTL (Time To Live)?
If the client holding the lock crashes before releasing it, a lock without a TTL stays forever (Deadlock). The TTL ensures the lock auto-releases, acting as a Lease.
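A minimal sketch of these lease semantics, assuming an in-memory imitation rather than a real Redis connection. FakeClock, LeaseStore, and acquire are hypothetical names; with the redis-py client the acquire step would roughly correspond to `r.set(resource, token, nx=True, px=30000)`.

```python
import secrets


class FakeClock:
    """Manually advanced clock so expiry can be shown without sleeping."""

    def __init__(self):
        self.now_ms = 0

    def advance(self, ms):
        self.now_ms += ms


class LeaseStore:
    """In-memory imitation of SET key value NX PX ttl (not a Redis client)."""

    def __init__(self, clock):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at_ms)

    def _purge(self, key):
        entry = self._data.get(key)
        if entry and entry[1] <= self._clock.now_ms:
            del self._data[key]          # TTL elapsed: the lease is gone

    def set_nx_px(self, key, value, ttl_ms):
        self._purge(key)
        if key in self._data:
            return False                 # NX: someone else holds the lock
        self._data[key] = (value, self._clock.now_ms + ttl_ms)
        return True

    def delete_if_equal(self, key, value):
        # On real Redis this compare-and-delete must be atomic
        # (commonly done with a small Lua script).
        self._purge(key)
        entry = self._data.get(key)
        if entry and entry[0] == value:
            del self._data[key]
            return True
        return False


def acquire(store, resource, ttl_ms=30000):
    """Return a random owner token on success, None if the lock is held."""
    token = secrets.token_hex(16)
    return token if store.set_nx_px(resource, token, ttl_ms) else None
```

The random token matters on release: if Client A's lease has already expired and Client B now holds the lock, A's delete must not remove B's key.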
4. The Trap: The Ghost Writer (GC Pauses)
Here is how a simple Redis lock fails during a Garbage Collection (GC) Pause.
- Client A acquires Lock (TTL 5s).
- Client A freezes for 8s (GC Pause). Lock Expired.
- Client B acquires Lock. Writes to DB.
- Client A wakes up. Thinks it still holds the lock. Writes to DB.
Result: Last Write Wins. Client A overwrites Client B’s valid data.
Sequence Diagram: The Ghost Writer
The Fix: Fencing Tokens
To solve this, we need the Storage Layer to help.
- Lock Service returns a monotonic Token (1, 2, 3…).
- Client A gets Token 33.
- Client B gets Token 34 (after A expires).
- Client A wakes up, tries to write with 33.
- Database checks: “I’ve already seen 34. Reject 33.”
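The storage-side check can be sketched as follows. This is a toy model under stated assumptions: FencedStore, StaleTokenError, and replay_ghost_writer are hypothetical names, and a real database would have to perform the compare and the write as one atomic operation.

```python
class StaleTokenError(Exception):
    """Raised when a write carries a token older than one already seen."""


class FencedStore:
    """Toy storage layer that remembers the highest fencing token it has seen."""

    def __init__(self):
        self.value = None
        self.highest_token = 0

    def write(self, token, value):
        # A token equal to the highest one is the current holder writing again;
        # anything lower comes from a holder whose lease has been superseded.
        if token < self.highest_token:
            raise StaleTokenError(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.value = value


def replay_ghost_writer():
    """Replay the scenario above: B writes with 34, then A wakes up with 33."""
    store = FencedStore()
    store.write(34, "written by B")
    try:
        store.write(33, "written by A")
        rejected = False
    except StaleTokenError:
        rejected = True
    return store, rejected
```

The lock service alone cannot stop Client A, because A does not know its lease expired. The rejection has to happen where the write lands.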
5. Interactive Demo: Redlock & Time Travel
Cyberpunk Mode: Simulate the Race Condition.
- Mission: Acquire the lock and write to the Database.
- Weapon: “Freeze Ray” (Simulates GC Pause).
- Defense: Fencing Tokens (Visualized).
[!TIP] Try it yourself:
- Acquire Lock as Client A.
- Immediately hit “❄️ Freeze (GC)”. This pauses Client A for 6 seconds (longer than the 5s Lock TTL).
- Wait for the lock to expire (watch the red bar).
- Acquire Lock as Client B. Client B will write to the DB (Token 34).
- Watch Client A wake up and try to write with Token 33.
- Result: The Database triggers a “BLOCKED” shield because 33 < 34.
6. Redlock Algorithm (Multi-Master)
A single Redis instance is a Single Point of Failure. Redlock addresses this by using N independent Redis masters (5 in the typical setup).
- Client gets current timestamp.
- Tries to acquire lock in all 5 instances sequentially.
- If acquired in Majority (3/5) and time elapsed < TTL:
  - Lock Acquired.
- Else:
  - Unlock All.
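The steps above can be sketched against in-memory stand-ins for the Redis masters. This is a simplified model, not the reference implementation: Clock, Node, and redlock_acquire are hypothetical names, latency_ms is an assumed knob for simulating slow instances, and the clock-drift allowance that the published algorithm subtracts from the validity time is omitted.

```python
import secrets


class Clock:
    """Manually advanced clock shared by the simulation."""

    def __init__(self):
        self.now_ms = 0

    def advance(self, ms):
        self.now_ms += ms


class Node:
    """In-memory stand-in for one independent Redis master."""

    def __init__(self, clock, up=True, latency_ms=0):
        self.clock, self.up, self.latency_ms = clock, up, latency_ms
        self.data = {}  # key -> (value, expires_at_ms)

    def set_nx_px(self, key, value, ttl_ms):
        self.clock.advance(self.latency_ms)       # simulated round trip
        if not self.up:
            return False
        entry = self.data.get(key)
        if entry and entry[1] > self.clock.now_ms:
            return False                          # still held by someone
        self.data[key] = (value, self.clock.now_ms + ttl_ms)
        return True

    def delete_if_equal(self, key, value):
        entry = self.data.get(key)
        if entry and entry[0] == value:
            del self.data[key]


def redlock_acquire(nodes, clock, resource, ttl_ms):
    """Return (token, remaining_validity_ms) on success, otherwise None."""
    token = secrets.token_hex(16)
    start = clock.now_ms                          # 1. current timestamp
    acquired = sum(                               # 2. try every instance in turn
        1 for n in nodes if n.set_nx_px(resource, token, ttl_ms)
    )
    validity = ttl_ms - (clock.now_ms - start)
    majority = len(nodes) // 2 + 1
    if acquired >= majority and validity > 0:     # 3. majority and time left
        return token, validity
    for n in nodes:                               # 4. otherwise unlock all
        n.delete_if_equal(resource, token)
    return None
```

With two of five nodes down the lock can still be acquired (3/5). With three down, or when the round trips consume the whole TTL, the client releases whatever it managed to set and reports failure.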
The Controversy: Kleppmann vs Antirez
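Distributed systems researcher Martin Kleppmann famously critiqued Redlock, and Redis creator Salvatore Sanfilippo (antirez) published a rebuttal.
- The Issue: Redlock relies on Wall-Clock Time. If a Redis node’s clock jumps forward (e.g., NTP sync), that node might expire a lock prematurely, letting a second client reach a majority while the first still believes it holds the lock. Redlock also issues no fencing token, so it cannot stop a client that was paused past its TTL.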
- The Verdict:
- Use Redlock for Efficiency (preventing double-processing).
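- Use ZooKeeper/Etcd for Correctness (preventing data corruption). ZooKeeper assigns every change a monotonically increasing transaction ID (Zxid), a logical counter rather than a wall-clock reading, which can serve as a fencing token.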
Summary
- Distributed Locks are essential for Mutual Exclusion.
- TTL (Lease) prevents deadlocks, but a paused client can outlive its lease and still believe it holds the lock.
- Fencing Tokens are the shield against Zombie Leaders (GC Pauses).
- Redlock is great for efficiency, but not for financial safety.