Case Study: The “YouTube View Counter”
Note: This case study follows the PEDALS framework and is optimized for interview preparation.
1. Process (Clarification)
Problem: Design a Distributed Counter service that tracks video views at massive scale. Real-World Example: “Despacito” on YouTube has 8 billion views; at its peak it was receiving > 50,000 views/second. The Challenge: a standard database row cannot handle 50k updates/second, because every increment transaction contends for the same row lock.
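To make the bottleneck concrete, here is a minimal sketch of the naive approach (the `videos` table and column names are hypothetical, using a sqlite3-style connection): every view runs one transactional UPDATE, so all writers queue behind the same row lock.

```python
# Naive design: one transactional UPDATE per view.
# All writers contend for the same row lock, so throughput
# collapses long before 50k updates/sec. Schema is illustrative.
def record_view_naive(db, video_id: int) -> None:
    with db:  # one transaction per view (sqlite3-style API)
        db.execute(
            "UPDATE videos SET view_count = view_count + 1 WHERE id = ?",
            (video_id,),
        )
```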
2. Estimate (Capacity)
- DAU: 1 Billion Users.
- Average Views: 5 videos / user / day = 5 Billion views / day.
- QPS (Average): 5 × 10⁹ / 86,400 ≈ 58,000 writes/sec.
- Peak QPS: Assume a 5× multiplier ≈ 300,000 writes/sec.
- Storage:
  - Video ID (8 bytes) + Count (8 bytes) = 16 bytes per video.
  - 1 Billion Videos = 10⁹ × 16 bytes = 16 GB (Hot data fits in Memory).
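These back-of-the-envelope numbers can be sanity-checked in a few lines (values reproduced from the estimates above):

```python
DAU = 1_000_000_000                  # 1 billion daily active users
daily_views = DAU * 5                # 5 views/user/day = 5e9 views/day
avg_qps = daily_views / 86_400       # ~57,870 -> ~58k writes/sec
peak_qps = avg_qps * 5               # ~289k -> plan for ~300k writes/sec
storage = 1_000_000_000 * (8 + 8)    # 1B videos x 16 bytes = 16 GB
print(f"{avg_qps:,.0f} avg, {peak_qps:,.0f} peak, {storage / 1e9:.0f} GB")
```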
3. Design (High-Level)
The architecture splits the “Write Path” (High Throughput) from the “Read Path” (High Availability).
- Client sends an `increment` request to the Load Balancer.
- API Service receives the request.
- Fast Path: API Service increments a counter in Redis Cluster.
- Slow Path: A background worker (“The flusher”) reads from Redis and updates the SQL Database every N seconds.
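A minimal sketch of the slow path, assuming a redis-py client and a DB-API-style connection; the function name `flush_counters` and the schema are illustrative. In production you would track dirty keys rather than scanning a fixed list.

```python
import time

def flush_counters(redis_client, db, video_ids, interval_s: int = 30) -> None:
    """Periodically persist Redis counters to the SQL source of truth."""
    while True:
        for vid in video_ids:
            count = int(redis_client.get(f"video:{vid}:views") or 0)
            db.execute(
                "UPDATE videos SET view_count = ? WHERE id = ?",
                (count, vid),
            )
        db.commit()
        time.sleep(interval_s)  # flush every N seconds
```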
4. Articulate (Deep Dive)
The “Hot Key” Problem
The naive approach uses a simple Redis INCR:
```
INCR video:123:views
```
Problem: Redis is single-threaded. A single key lives on a single Redis node. That node can handle ~50k - 100k RPS. If “Despacito” gets 300k RPS, the single Redis node will melt (CPU 100%).
Solution: Counter Sharding
We split the single counter into N sub-counters.
- Key: `video:123:views` -> Shards: `video:123:views:0` … `video:123:views:9`
- Write: Pick a random shard `i` and `INCR` it.
- Read: `GET` all shards and sum them up.
```python
# Sharded counter, assuming a redis-py client
import random

import redis

r = redis.Redis()
N = 10  # number of sub-counter shards

def increment_views(vid: str) -> None:
    # Write path (O(1)): bump one randomly chosen shard
    r.incr(f"video:{vid}:views:{random.randint(0, N - 1)}")

def get_views(vid: str) -> int:
    # Read path (O(N)): GET every shard and sum
    return sum(int(r.get(f"video:{vid}:views:{i}") or 0) for i in range(N))
```
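With this split, the hot write path stays O(1) no matter how popular the video is; only reads pay the O(N) fan-out (usage assumes the sketch above):

```python
increment_views("123")    # lands on exactly one shard
views = get_views("123")  # sums all N shards
```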
Write-Behind (Buffering)
At 300k RPS, even a sharded Redis tier is expensive: every view still costs a network round-trip. Strategy:
- Local Buffer: Each API Server keeps a local `Map<VideoID, Count>`.
- Batch: It accumulates writes for 5 seconds.
- Flush: Every 5 seconds, it sends one `INCRBY` command to Redis with the aggregated value (e.g., +50); see the sketch after this list.
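A minimal sketch of that buffer, assuming a redis-py client; the class name `BufferedCounter` and the 5-second interval are illustrative. Note the explicit trade-off: a crashed server loses up to one interval of buffered counts.

```python
import threading
import time
from collections import defaultdict

import redis

class BufferedCounter:
    """Write-behind: aggregate increments in memory, flush with INCRBY."""

    def __init__(self, redis_client: "redis.Redis", flush_interval_s: float = 5.0):
        self._redis = redis_client
        self._buffer = defaultdict(int)  # Map<VideoID, Count>
        self._lock = threading.Lock()
        self._interval = flush_interval_s
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def increment(self, vid: str) -> None:
        with self._lock:
            self._buffer[vid] += 1       # no network hop on the hot path

    def _flush_loop(self) -> None:
        while True:
            time.sleep(self._interval)
            with self._lock:             # swap the buffer out atomically
                pending, self._buffer = self._buffer, defaultdict(int)
            for vid, delta in pending.items():
                # one aggregated INCRBY instead of `delta` separate INCRs
                self._redis.incrby(f"video:{vid}:views", delta)
```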
5. List (Trade-offs)
Compare Global Lock vs Sharded Counters vs Write-Behind: a global lock is simple but serializes every write; sharded counters scale writes at the cost of fan-out reads; write-behind is fastest but can lose buffered counts on a crash. Monitor system health (error rate & latency) to verify that whichever approach you choose holds up at peak load.
6. Scale (Summary)
Data flow:
- Client -> LB -> API
- API -> Redis (Sharded)
- Worker -> SQL (Async)

Key trade-offs:
- **Write-Behind**: Speed vs Safety.
- **Sharding**: Scale vs Complexity.
- **Redis**: Latency vs Cost.