Redis vs Memcached
In 2013, Stack Overflow served 560 million page views per month with just 25 servers — the secret was Memcached. Their entire hot path was 1 web server + 1 Memcached instance; 99.9% of requests never touched SQL. In 2016, Twitter’s Twemcache (a Memcached fork) was replaced by Redis because product teams needed sorted sets for trending topics and pub/sub for real-time notifications — features Memcached couldn’t provide. The choice between Redis and Memcached looks simple on the surface, but it determines your entire caching architecture.
[!TIP] Interview Insight: Don’t just list features. Explain architectural differences. “Redis is single-threaded, which avoids lock contention. Memcached is multi-threaded, which scales vertically on big CPUs.”
[!TIP] Analogy: Think of Memcached as a giant, highly-efficient whiteboard. It’s blazing fast, anyone can write on it, but when the cleaning crew comes (power failure), everything is wiped clean instantly.
Think of Redis as a smart filing cabinet with a dedicated clerk. The clerk (single thread) ensures no two people mess with the same file at once, can organize files by priority (Sorted Sets) or lists, and painstakingly photocopies the cabinet’s contents at night so nothing is ever lost (Persistence).
[!IMPORTANT] In this lesson, you will master:
- Single-Threaded Speed: Why avoiding “Lock Contention” and “Context Switching” allows Redis to handle 100k+ TPS.
- Persistence Trade-offs: Choosing between the speed of `RDB` snapshots and the durability of `AOF` logs.
- Hardware Intuition: Understanding CPU Cache Locality and the Linux `fork()` Copy-On-Write (COW) mechanism.
1. Memcached (The Simple Giant)
Memcached is a pure, in-memory key-value store. It is “dumb” by design.
- Architecture: Multi-threaded. It uses a master thread to accept connections and worker threads to handle requests.
- Memory Management: Uses Slab Allocation to reduce memory fragmentation. It pre-allocates chunks of memory (slabs) for different data sizes.
- Data Types: Only Strings (Blobs).
- Replication: None. If a node dies, data is gone.
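Slab allocation is easy to see in miniature. The sketch below (Python; the starting chunk size and the 1.25 growth factor are illustrative, close to Memcached’s defaults) builds the size classes and shows which class an item lands in:

```python
def slab_classes(min_chunk=96, max_chunk=1024 * 1024, growth=1.25):
    """Generate Memcached-style slab class chunk sizes.
    Each class is ~1.25x the previous one (illustrative defaults)."""
    sizes, size = [], min_chunk
    while size < max_chunk:
        sizes.append(size)
        size = int(size * growth)
    sizes.append(max_chunk)  # final class holds the largest items
    return sizes

def pick_class(sizes, item_size):
    """An item goes into the smallest chunk that fits it. The slack
    (chunk - item) is internal fragmentation, which slab allocation
    accepts in exchange for zero external fragmentation."""
    for chunk in sizes:
        if item_size <= chunk:
            return chunk
    raise ValueError("item exceeds the maximum chunk size")
```

A 100-byte item lands in the 120-byte class (96 × 1.25), wasting 20 bytes of slack — the price paid for never having to compact memory.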
2. Redis (The Smart Swiss Army Knife)
Redis (Remote Dictionary Server) is a data structure server.
A. Single-Threaded Core
- No Locks: You never need `mutex` locks for data access because only one command runs at a time.
[!NOTE] Hardware-First Intuition: By using a single thread, Redis achieves extreme CPU Cache Locality. Because the same thread processes every request, the OS likely keeps it “pinned” to one physical CPU core. This means the L1 and L2 caches remain “warm” with Redis code and data structures. In a multi-threaded system, threads constantly jump between cores, forcing the hardware to perform expensive Cache Invalidation cycles to synchronize memory across the CPU die.
B. Data Structures
- Lists: For Queues.
- Sets: For unique friends.
- Sorted Sets (ZSET): For Leaderboards.
- HyperLogLog: For counting unique items (DAU) with 99% accuracy in tiny memory.
C. Persistence (RDB vs AOF)
Redis is in-memory, but it saves to disk. How?
- RDB (Snapshot): Forks a child process to dump RAM to disk every X minutes. Fast startup, but potential data loss.
- AOF (Append Only File): Logs every write. Slower startup, but durable.
Deep Dive: How Redis Survives Crashes
RDB (Redis Database Snapshot):
- Trigger: Every `save 900 1` (save after 900s if at least 1 key changed) or a manual `BGSAVE`
- Mechanism: `fork()` creates a child process with Copy-On-Write memory
- Child: Writes the entire dataset to `dump.rdb` on disk
- Atomicity: The temp file is renamed to `dump.rdb` only when the dump is complete
Trade-offs:
- ✅ Fast recovery: Loading binary is fast (1M keys/sec)
- ✅ Small file size: Compressed binary format
- ❌ Data loss window: If Redis crashes 10min after last save, you lose 10min of writes
- ❌ Fork cost: On a 10GB dataset, `fork()` can pause Redis for ~1 second
AOF (Append-Only File):
- Logs: Every write (e.g., `SET user:1 Alice`) is appended to `appendonly.aof`
- Sync Policy:
  - `appendfsync always`: Sync after every write (slow, but ZERO data loss)
  - `appendfsync everysec`: Sync every 1s (default, balances speed vs safety)
  - `appendfsync no`: Let the OS decide (fast, but risky)
- Rewrite: AOF grows huge. Redis periodically rewrites it to only include final state
Example AOF:
```
*3
$3
SET
$6
user:1
$5
Alice
```
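That entry is in RESP (the Redis wire protocol): a `*3` header says “array of 3 bulk strings,” and each `$N` line declares the byte length of the argument that follows. A minimal decoder sketch (Python, illustrative) shows how replaying the AOF recovers the original command:

```python
def decode_resp_command(buf: str):
    """Decode one RESP array-of-bulk-strings entry, the format the AOF
    uses to log every write command."""
    lines = buf.split("\r\n")
    assert lines[0].startswith("*"), "AOF entries are RESP arrays"
    argc = int(lines[0][1:])       # "*3" -> 3 arguments follow
    args, i = [], 1
    for _ in range(argc):
        length = int(lines[i][1:])  # "$3" -> next payload is 3 bytes
        arg = lines[i + 1]
        assert len(arg) == length, "declared length must match payload"
        args.append(arg)
        i += 2
    return args

entry = "*3\r\n$3\r\nSET\r\n$6\r\nuser:1\r\n$5\r\nAlice\r\n"
decode_resp_command(entry)  # -> ["SET", "user:1", "Alice"]
```

Recovery is literally this loop run over the whole file, re-executing each decoded command — which is why AOF startup is slower than loading an RDB binary.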
Trade-offs:
- ✅ Durable: Max 1s of data loss (with `everysec`)
- ✅ Readable: Text format, can be edited manually
- ❌ Slow recovery: Must replay every command (slower than RDB)
- ❌ Larger files: Text is bigger than binary
Hybrid Persistence (Redis 4.0+)
Combines RDB + AOF:
- Use RDB for base snapshot (fast recovery)
- Use AOF for changes since last snapshot (low data loss)
- On restart: Load RDB, then replay AOF delta
Config:
```
# Enable both
save 900 1                # RDB every 15min if 1 key changed
appendonly yes            # AOF enabled
aof-use-rdb-preamble yes  # Hybrid mode
```
Recovery Scenarios
| Scenario | RDB Only | AOF Only | Hybrid (RDB+AOF) |
|---|---|---|---|
| Normal Restart | Fast (load binary) | Slow (replay log) | Fast (RDB) + Fast (small AOF) |
| Crash (10min since save) | ❌ Lost 10min | ✅ Lost 1s | ✅ Lost 1s |
| Corrupted AOF | ✅ Still have RDB | ❌ Cannot start | ✅ Fallback to RDB |
| No Disk Space | ⚠️ BGSAVE fails (by default Redis then rejects writes) | ⚠️ AOF sync fails | ⚠️ Both fail |
Interview Insight: Netflix uses AOF with `everysec`. They accept 1s of data loss for session caches, but need better durability than RDB snapshots alone.
Persistence Visualizer
See the difference in real-time.
- AOF (Log): Every write adds a line. Continuous IO.
- RDB (Snapshot): Periodically dumps the entire state. Burst IO.
[!TIP] Try it yourself: Click “Simulate Write” repeatedly. Watch AOF grow line-by-line, and RDB trigger only every 5 writes.
3. Distributed Caching: Sentinel vs Cluster
How do you scale Redis when one machine isn’t enough?
A. Redis Sentinel (High Availability)
- Goal: Keep the system alive if Master dies.
- Setup: 1 Master (Read/Write) + N Slaves (Read Only).
- Mechanism: Sentinels watch the Master. If it dies, they vote to promote a Slave to Master.
- Limitation: You cannot write more data than fits on one machine.
B. Redis Cluster (Sharding)
- Goal: Store MORE data than fits on one machine.
- Mechanism: Data is split into 16,384 Hash Slots.
- Distribution:
CRC16(key) % 16384. Each node owns a range of slots. - Smart Clients: Clients know which node holds which slot and connect directly.
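The slot formula fits in a few lines. The sketch below uses the CRC16 variant the Redis Cluster spec documents (CRC-16/XMODEM, poly `0x1021`, check value `0x31C3` for `"123456789"`), and also honors hash tags: a `{tag}` substring forces related keys into the same slot so multi-key operations can work:

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC-16/XMODEM, the checksum Redis Cluster uses for key slotting."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    """Map a key to one of the 16,384 cluster slots, honoring {hash tags}."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end != start + 1:  # non-empty tag: hash only the tag
            key = key[start + 1 : end]
    return crc16_xmodem(key.encode()) % 16384
```

This is why `{user:100}.followers` and `{user:100}.likes` always land on the same node: only the text inside the braces is hashed.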
C. Consistent Hashing (The Old Way - Memcached)
Before Redis Cluster, we used Consistent Hashing (e.g., in Memcached clients). See Consistent Hashing.
- Concept: Map both Nodes and Keys to a Hash Ring (0 to 2^32 - 1).
- Placement: A key belongs to the first Node found moving clockwise on the ring.
- Virtual Nodes: To balance load, each physical node is hashed hundreds of times (Node A_1, Node A_2…).
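A minimal ring sketch (Python, assuming MD5 as the point hash and 100 virtual nodes per physical node — both illustrative; real ketama-style clients differ in detail) makes the key property concrete: adding a node steals keys only from its clockwise neighbors, never reshuffles everything.

```python
import bisect
import hashlib

class HashRing:
    """Consistent hashing with virtual nodes, in the style of classic
    Memcached client libraries."""
    def __init__(self, vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted (hash_point, physical_node) pairs

    def _hash(self, s: str) -> int:
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add_node(self, node: str):
        # hash each physical node many times to spread its load evenly
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key: str) -> str:
        # the first hash point clockwise from the key owns it
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]

ring = HashRing()
for n in ("A", "B", "C"):
    ring.add_node(n)
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.add_node("D")  # every key that moves, moves to D -- nowhere else
```

Every relocated key lands on the new node; keys on unaffected arcs of the ring never move. Contrast this with naive `hash(key) % N`, where changing N remaps almost every key.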
Difference:
- Consistent Hashing: Dynamic. Good for systems where nodes join/leave frequently (P2P, Cassandra).
- Hash Slots (Redis): Fixed 16k slots. Deterministic. Better for controlled cluster management.
D. Gossip Protocol
How do Redis Cluster nodes know about each other? They Gossip. Every node connects to a few other random nodes and exchanges information (“I am alive”, “Node B is down”). This allows the cluster to detect failures without a central “Master” server.
Interactive Demo: Consistent Hashing vs Hash Slots
- Hash Slots: Fixed buckets. Adding a node requires moving specific buckets.
- Consistent Hashing: A continuous ring. Adding a node only affects its immediate neighbor.
[!TIP] Try it yourself: Add Nodes (Blue) and Keys (Yellow). Notice how keys map to the next node clockwise.
4. Interactive Demo: Hash Slot Sharding
Visualizing the Redis Cluster Ring.
- Enter a key to see its Hash Slot (0-16383).
- Add Node: Watch how the slots (and data) are Rebalanced from Node C to the new Node D.
[!TIP] Try it yourself: Type a key like “user:100” to see which node owns it. Then add Node D to see the rebalance.
5. Performance Trick: Pipelining vs Lua Scripting
Normally, every Redis command is a separate network Round Trip (RTT).
A. Pipelining (Batching)
- Concept: Send 100 commands at once. Read 100 replies at once.
- Benefit: Saves 99 RTTs. Massive throughput increase.
- Limitation: Not Atomic. If command 50 fails, command 51 still runs.
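On the wire, a pipeline is nothing special: the client just concatenates the RESP encodings of many commands into one buffer, sends it in a single write, and then reads all the replies back-to-back. A minimal encoder sketch (Python, illustrative):

```python
def encode_command(*args: str) -> bytes:
    """Encode one command in RESP: *<argc>, then $<len> + payload per arg."""
    out = f"*{len(args)}\r\n".encode()
    for a in args:
        b = a.encode()
        out += f"${len(b)}\r\n".encode() + b + b"\r\n"
    return out

def encode_pipeline(commands) -> bytes:
    """Concatenate encodings: one socket write, one round trip, N replies."""
    return b"".join(encode_command(*cmd) for cmd in commands)

buf = encode_pipeline([("SET", "k", "v"), ("GET", "k")])
# buf now holds both commands; a real client would send it in one write
```

With a 1ms RTT, 100 sequential commands cost ~100ms; pipelined, they cost ~1ms plus server processing time — hence the “saves 99 RTTs” claim above.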
B. Lua Scripting (Atomicity)
- Concept: Send a small script (code) to Redis. Redis executes it as a single transaction.
- Benefit: Atomic. No other command runs while the script is running. Great for complex logic (e.g., Rate Limiting “Check-and-Set”).
- Warning: If your script is slow (infinite loop), you block the entire server (Single Threaded!).
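To see why rate limiting needs this atomicity, here is the check-and-set logic as plain Python, with a dict standing in for Redis (a sketch of the logic a Lua script would run, not a real Redis client; the function name and fixed-window design are illustrative):

```python
def rate_limit(store: dict, key: str, limit: int, window_s: int, now: float) -> bool:
    """Fixed-window check-and-set. Inside Redis this whole body would be
    one Lua script, so no other command can interleave between the read
    (the check) and the write (the set)."""
    count, window_end = store.get(key, (0, now + window_s))
    if now >= window_end:                  # window expired: start fresh
        count, window_end = 0, now + window_s
    if count >= limit:
        return False                       # over budget: reject the request
    store[key] = (count + 1, window_end)   # the "set" half of check-and-set
    return True
```

Done as two separate commands (`GET` then `INCR`), two clients could both pass the check before either writes, exceeding the limit; wrapped in one Lua script, the single-threaded server makes that race impossible.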
6. Summary
| Feature | Memcached | Redis |
|---|---|---|
| Data Structures | Strings only | Strings, Lists, Sets, Hashes, ZSets |
| Architecture | Multi-threaded | Single-threaded (w/ IO threads) |
| Persistence | None (Volatile) | RDB (Snapshot) & AOF (Log) |
| Replication | Client-side | Native Master/Replica |
| Cluster Mode | Client-side hashing | Native Redis Cluster (Hash Slots) |
Staff Engineer Tip: When scaling Redis, the bottleneck is often Network IO Interrupts rather than CPU. To solve this, Redis 6.0 introduced I/O Threads. The core data access remains single-threaded (safe!), but reading from and writing to the networking stack is now multi-threaded. This allows Redis to saturate a 10Gbps or 40Gbps network link, which a single thread physically cannot do.
Mnemonic — “Memcached is Simple, Redis is Powerful”: Choose Memcached when you only need string key-value caching with maximum CPU vertical scale. Choose Redis when you need persistence (RDB/AOF), rich data types (ZSETs for leaderboards), Pub/Sub, or Lua scripting for atomicity. In 2024, almost every new system defaults to Redis — Memcached is a specialized choice for extreme simplicity at massive scale.
7. Case Study: Real-Time Gaming Leaderboard (PEDALS Framework)
To solidify these concepts, let’s design a real-time global leaderboard for a massively multiplayer online game using the PEDALS framework.
Process Requirements
- Functional: Players earn score points continuously. The system must display the Top 100 players globally, and any player’s absolute rank in real-time.
- Non-Functional: High throughput (100k+ writes/sec), extremely low latency for reads (<10ms), and durability (we cannot lose ranks if the cache restarts).
Estimate
- 10 million Daily Active Users (DAU).
- Each user updates their score on average 10 times per day.
- 100 million writes per day (~1,150 TPS average, up to 10k TPS peak).
Data Model
We cannot use a relational database with ORDER BY score DESC for 100k TPS—it will melt the CPU.
Instead, we use a Redis Sorted Set (ZSET).
- Key:
leaderboard:global - Score: The player’s score.
- Member: The player’s
user_id.
Architecture
- Write Path: Game servers receive score updates and issue a `ZINCRBY leaderboard:global <points> <user_id>` command to Redis.
- Read Path:
  - Top 100: Clients request `ZREVRANGE leaderboard:global 0 99 WITHSCORES`.
  - Player Rank: Clients request `ZREVRANK leaderboard:global <user_id>`.
- Persistence: We enable AOF with `everysec` because losing 1 second of score data during a catastrophic crash is acceptable, but losing the entire leaderboard (which would happen with Memcached or infrequent RDB snapshots) is not.
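The three ZSET commands behave like this in-process stand-in (Python; real Redis pairs a hash table for O(1) score lookups with a skip list for O(log n) rank queries, and breaks score ties lexicographically by member — this sketch sorts on demand and ignores tie-breaking):

```python
class Leaderboard:
    """Toy model of the ZSET operations the design relies on."""
    def __init__(self):
        self.scores = {}  # user_id -> score (the hash-table side of a ZSET)

    def zincrby(self, points: float, user_id: str) -> float:
        """ZINCRBY: atomically add points to a member's score."""
        self.scores[user_id] = self.scores.get(user_id, 0) + points
        return self.scores[user_id]

    def zrevrange(self, start: int, stop: int):
        """ZREVRANGE ... WITHSCORES: highest score first, inclusive stop."""
        ranked = sorted(self.scores.items(), key=lambda kv: -kv[1])
        return ranked[start : stop + 1]

    def zrevrank(self, user_id: str) -> int:
        """ZREVRANK: 0-based rank counted from the top."""
        ranked = sorted(self.scores, key=lambda u: -self.scores[u])
        return ranked.index(user_id)
```

The write path is one `zincrby` call per score event; “Top 100” is `zrevrange(0, 99)`; a player’s own rank is one `zrevrank` lookup — no `ORDER BY` scan anywhere.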
Localized Details
- Single-Threaded Advantage: Because Redis is single-threaded, `ZINCRBY` is atomic. If 50 game servers update scores simultaneously, Redis queues the commands and executes them one by one without lock contention.
- Memory Efficiency: A ZSET uses a Skip List and a Hash Table under the hood. For 10 million users, it consumes roughly 100MB of RAM, easily fitting in a single Redis node.
Scale
- What if we hit 100 million users and 100k TPS?
- A single Redis node can handle ~100k TPS. If we exceed this, the CPU becomes a bottleneck.
- Sharding (Redis Cluster): We cannot shard a single `leaderboard:global` key, because a ZSET must live entirely on one node.
- Solution: We shard the leaderboard into geographical regions (`leaderboard:na`, `leaderboard:eu`), or use a hybrid approach where edge nodes maintain local top-K lists and periodically flush to a master aggregator.