Welcome to the Databases & Scaling module.

At the Staff level, we stop asking “Postgres or Mongo?” and start asking “How does this system handle Network Partitions?” and “What is the operational cost of Resharding?”

Module Structure

1. Replication Models

  • Leader-Follower: Async vs Sync, and the “Monotonic Reads” problem (a client sees newer data, then older data, when it reads from a lagging follower).
  • Leaderless (Dynamo): Quorums (W+R > N), Read Repair, and Anti-Entropy.
  • Interactive: Quorum Calculator.
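The quorum condition W+R > N can be checked with a few lines of code. What follows is a minimal sketch, not the module's interactive calculator: the `QuorumConfig` class and its property names are hypothetical, introduced only for illustration. It assumes N replicas per key, W acknowledgements required per write, and R responses required per read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical helper: N replicas, W write acks, R read responses."""
    n: int
    w: int
    r: int

    def __post_init__(self) -> None:
        # W and R must each be between 1 and N to be satisfiable at all.
        if not (1 <= self.w <= self.n and 1 <= self.r <= self.n):
            raise ValueError("require 1 <= W <= N and 1 <= R <= N")

    @property
    def reads_overlap_writes(self) -> bool:
        # W + R > N means every read set shares at least one replica
        # with every write set (pigeonhole argument).
        return self.w + self.r > self.n

    @property
    def write_fault_tolerance(self) -> int:
        # Replicas that can be unreachable while writes still reach W acks.
        return self.n - self.w

    @property
    def read_fault_tolerance(self) -> int:
        # Replicas that can be unreachable while reads still reach R responses.
        return self.n - self.r


if __name__ == "__main__":
    for cfg in (QuorumConfig(3, 2, 2), QuorumConfig(3, 1, 1), QuorumConfig(5, 3, 3)):
        print(cfg, "overlap:", cfg.reads_overlap_writes,
              "write tolerance:", cfg.write_fault_tolerance,
              "read tolerance:", cfg.read_fault_tolerance)
```

With N=3, W=2, R=2 the read and write sets always share at least one replica, and one replica can be down for either operation. With N=3, W=1, R=1 the overlap guarantee is lost, so a read may miss the latest write until read repair or anti-entropy catches up.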

2. Sharding & Partitioning

  • Partitioning Strategies: Range (TiKV) vs Hash (Cassandra).
  • Operational Pain: The “Resharding Storm” and how Virtual Nodes (vnodes) reduce data skew.
  • Interactive: Consistent Hashing Simulator.
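The idea behind consistent hashing with virtual nodes can be sketched as below. This is an illustrative toy, not the module's simulator and not how any particular database implements its ring: the `HashRing` class, the `node#i` token naming scheme, and the default of 64 vnodes per node are all assumptions made for the example.

```python
import bisect
import hashlib


def _token(key: str) -> int:
    # Map a string to a 64-bit position on the ring (SHA-256, truncated).
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, vnodes: int = 64) -> None:
        self.vnodes = vnodes
        self._tokens: list[int] = []      # sorted ring positions
        self._owner: dict[int, str] = {}  # ring position -> physical node

    def add_node(self, node: str) -> None:
        # Each physical node claims `vnodes` positions, which spreads its
        # share of the key space into many small ranges.
        # (64-bit token collisions are ignored in this sketch.)
        for i in range(self.vnodes):
            t = _token(f"{node}#{i}")
            bisect.insort(self._tokens, t)
            self._owner[t] = node

    def remove_node(self, node: str) -> None:
        self._tokens = [t for t in self._tokens if self._owner[t] != node]
        self._owner = {t: n for t, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        if not self._tokens:
            raise LookupError("ring is empty")
        # The owner is the first token clockwise from the key's position,
        # wrapping around to the start of the ring.
        idx = bisect.bisect_right(self._tokens, _token(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


if __name__ == "__main__":
    ring = HashRing()
    for name in ("node-a", "node-b", "node-c"):
        ring.add_node(name)
    keys = [f"user:{i}" for i in range(1000)]
    before = {k: ring.lookup(k) for k in keys}
    ring.add_node("node-d")
    moved = [k for k in keys if ring.lookup(k) != before[k]]
    print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The property worth noticing is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes. Raising the vnode count makes each node's share of the ring more even, at the cost of a larger token table.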

3. Consistency & CAP/PACELC

  • CAP is a Lie: Why you can’t “choose CA”.
  • PACELC: The real trade-off (Latency vs Consistency) in healthy systems.
  • Models: Linearizability vs Serializability vs Eventual.

Key Takeaway

Database scaling is a game of trade-offs. You can have strong consistency, but you pay for it in latency (PACELC). You can scale writes close to linearly by adding shards, but you pay for it in operational complexity (Sharding).

Module Chapters