When simple replication and sharding aren’t enough to handle global consistency, we move into the domain of Advanced Distributed Patterns.
This module covers the “final boss” topics of system design: how to reach agreement in a cluster where nodes can fail, how to order events across the globe without perfectly synchronized clocks, and how to manage complex multi-step transactions across independent services.
Chapters
1. Consensus: Raft & Paxos
- Reaching Agreement: How etcd and CockroachDB stay consistent.
- Operational Nuances: Raft Pre-Vote and Joint Consensus for membership changes.
- Interactive: Raft Pre-Vote Election Simulator.
2. Distributed Time: HLC vs TrueTime
- The Order Problem: Why NTP-synchronized clocks alone aren’t accurate enough to order events for serializability.
- Modern Solutions: Hybrid Logical Clocks (HLC) and Google’s TrueTime (GPS and atomic clocks).
- Interactive: HLC Uncertainty Wait Simulator.
3. Distributed Transactions: TCC vs Sagas
- Atomicity Across Services: When to use Sagas vs TCC (Try-Confirm-Cancel).
- Trade-offs: Isolation levels vs operational complexity.
- Interactive: TCC Booking Flow vs Saga Rollback Comparison.
Module Chapters
Consensus: Raft & Paxos
A deep dive into Raft and Paxos: why 'Pre-Vote' is essential for production etcd clusters and how Joint Consensus handles membership changes.
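To make the Pre-Vote idea concrete, here is a minimal sketch in Python. It assumes the usual description of Pre-Vote: a would-be candidate first asks its peers whether they would vote for it, and only increments its term if a majority says yes. The names `Peer`, `would_grant_vote`, and `pre_vote_succeeds` are hypothetical and do not come from etcd or any Raft library; networking, timers, and persistence are left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Peer:
    """Simplified view of one Raft node (hypothetical model, not etcd's types)."""
    current_term: int
    last_log_term: int
    last_log_index: int
    leader_is_alive: bool  # True if a heartbeat arrived within the election timeout

    def would_grant_vote(self, proposed_term: int,
                         cand_last_log_term: int,
                         cand_last_log_index: int) -> bool:
        # A peer that still hears from a leader refuses, so a partitioned
        # node cannot disrupt a healthy cluster.
        if self.leader_is_alive:
            return False
        # The proposed term must be newer than the peer's own term.
        if proposed_term <= self.current_term:
            return False
        # The candidate's log must be at least as up to date as the peer's
        # (compare last term first, then last index).
        return (cand_last_log_term, cand_last_log_index) >= (
            self.last_log_term, self.last_log_index)


def pre_vote_succeeds(candidate: Peer, others: List[Peer]) -> bool:
    """Return True if a majority would vote for the candidate.

    The candidate's term is NOT incremented here; that only happens if the
    pre-vote round succeeds and a real election starts.
    """
    proposed_term = candidate.current_term + 1
    grants = 1  # the candidate counts its own vote
    for peer in others:
        if peer.would_grant_vote(proposed_term,
                                 candidate.last_log_term,
                                 candidate.last_log_index):
            grants += 1
    cluster_size = len(others) + 1
    return grants > cluster_size // 2
```

In this sketch, a node cut off from the cluster keeps failing the pre-vote round and never inflates its term, so it does not force the current leader to step down when the partition heals.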
Clock Sync & Distributed Time
Why clocks in distributed systems are never perfectly in sync. Compares Hybrid Logical Clocks (used by CockroachDB) with Google's TrueTime (GPS and atomic clocks).
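As a rough illustration of how a Hybrid Logical Clock orders events when physical clocks disagree, the sketch below follows the standard HLC update rules: keep the largest wall-clock value seen so far, and use a counter to break ties. The class name `HybridLogicalClock` and its methods are hypothetical, the physical clock is injected as a plain function so the example can be tested, and this is not CockroachDB's implementation.

```python
from typing import Callable, Tuple

# An HLC timestamp: (logical wall time, counter). Python compares tuples
# lexicographically, which matches HLC ordering.
Timestamp = Tuple[int, int]


class HybridLogicalClock:
    def __init__(self, physical_clock: Callable[[], int]):
        self._now = physical_clock  # e.g. a function returning epoch millis
        self.l = 0  # highest wall-clock value observed so far
        self.c = 0  # counter that orders events sharing the same l

    def tick(self) -> Timestamp:
        """Timestamp a local event or an outgoing message."""
        pt = self._now()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            # Physical clock stalled or went backwards: advance the counter.
            self.c += 1
        return (self.l, self.c)

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge the timestamp carried by an incoming message."""
        pt = self._now()
        remote_l, remote_c = remote
        new_l = max(self.l, remote_l, pt)
        if new_l == self.l and new_l == remote_l:
            self.c = max(self.c, remote_c) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == remote_l:
            self.c = remote_c + 1
        else:
            self.c = 0  # the local physical clock is ahead of everything seen
        self.l = new_l
        return (self.l, self.c)
```

Every timestamp returned by `tick` or `update` is strictly greater than the previous one on the same node and greater than any timestamp it has received, even if the local physical clock jumps backwards.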
Distributed Transactions: TCC & Sagas
How to maintain atomicity across service boundaries without a global lock. Compares Try-Confirm-Cancel (TCC) with the Saga pattern.
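The Saga side of that comparison can be sketched in a few lines of Python: run each step in order and, if one fails, run the compensations of the steps that already completed, in reverse order. The `run_saga` function and the `Step` shape are hypothetical names for illustration only; a real orchestrator would also persist progress and retry failed compensations.

```python
from typing import Callable, List, Tuple

# A step is (name, action, compensation). The compensation undoes the
# business effect of the action; it is not a database rollback.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> bool:
    """Run the steps in order; on failure, compensate completed steps.

    Returns True if every step succeeded, False if the saga was rolled back.
    """
    completed: List[Step] = []
    for step in steps:
        _name, action, _compensate = step
        try:
            action()
        except Exception:
            # Undo already-completed steps, most recent first.
            for _done_name, _done_action, undo in reversed(completed):
                undo()
            return False
        completed.append(step)
    return True
```

Unlike TCC, which reserves resources in a Try phase before confirming or cancelling them, each saga step here commits its effect immediately, so other services can observe intermediate state until the compensations run.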