Welcome to the Databases & Scaling module.
At the Staff level, we stop asking “Postgres or Mongo?” and start asking “How does this system handle Network Partitions?” and “What is the operational cost of Resharding?”
Module Structure
1. Replication Models
- Leader-Follower: Async vs Sync, and the “Monotonic Read” problem.
- Leaderless (Dynamo): Quorums (W + R > N), Read Repair, and Anti-Entropy.
- Interactive: Quorum Calculator.
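The quorum condition named above can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the module's Quorum Calculator: the function names are hypothetical, N is the number of replicas, W is the number of write acknowledgements required, and R is the number of replicas consulted on a read.

```python
def quorum_overlaps(n: int, w: int, r: int) -> bool:
    """Return True when every read set must share a replica with every write set."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must each be between 1 and n")
    # If W + R > N, any W replicas and any R replicas intersect (pigeonhole),
    # so a read always touches at least one replica holding the latest write.
    return w + r > n


def tolerated_failures(n: int, w: int, r: int) -> int:
    """Replicas that can be down while both reads and writes still reach quorum."""
    return n - max(w, r)


if __name__ == "__main__":
    for n, w, r in [(3, 2, 2), (3, 1, 1), (5, 3, 3), (5, 5, 1)]:
        print(
            f"N={n} W={w} R={r}: overlap={quorum_overlaps(n, w, r)}, "
            f"tolerates {tolerated_failures(n, w, r)} node(s) down"
        )
```

The N=5, W=5, R=1 row shows the trade-off in one line: reads are cheap and still overlap every write, but a single unavailable replica blocks all writes.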
2. Sharding & Partitioning
- Partitioning Strategies: Range (TiKV) vs Hash (Cassandra).
- Operational Pain: The “Resharding Storm” and how Virtual Nodes (vnodes) reduce data skew and spread rebalancing load.
- Interactive: Consistent Hashing Simulator.
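A minimal consistent-hashing ring with virtual nodes might look like the sketch below. It is not the module's simulator: the `HashRing` class, the `node#i` vnode naming, the default of 64 vnodes, and the use of MD5 as a placement hash are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring; each physical node owns many vnodes."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (ring position, node name)

    def add_node(self, node: str) -> None:
        # Each vnode is an extra point on the ring, which evens out ownership.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._points = [p for p in self._points if p[1] != node]

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First point at or after the key's position, wrapping around the ring.
        idx = bisect.bisect_right(self._points, (_hash(key), ""))
        return self._points[idx % len(self._points)][1]


if __name__ == "__main__":
    ring = HashRing()
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add_node("node-d")
    moved = [k for k in keys if ring.lookup(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The property worth noticing is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes, which is what keeps a resharding event from touching the whole dataset.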
3. Consistency & CAP/PACELC
- CAP is a Lie: Why you can’t “choose CA”.
- PACELC: The real trade-off (Latency vs Consistency) in healthy systems.
- Models: Linearizability vs Serializability vs Eventual.
Key Takeaway
Database scaling is a game of trade-offs. You can have strong consistency, but you pay for it in latency (PACELC). You can scale writes almost without limit, but you pay for it in complexity (Sharding).
Module Chapters
Databases 101: Excel to Postgres
Why can't we just save everything in a text file? Learn how databases keep your data organized, searchable, and safe from crashes.
Sharding: The Partitioning Nightmare
Scaling horizontally by splitting data. Horizontal vs Vertical partitioning, and avoiding the 'Hot Spot' problem.
Consistency & CAP: The Hard Truths
Understanding the trade-offs of distributed state. Strong vs Eventual consistency and the truth about CAP vs PACELC.
Replication: Leader vs. Leaderless
How to keep data in sync across nodes. Synchronous vs Asynchronous replication, and the cost of the 'Lag'.