Review & Cheat Sheet

[!IMPORTANT] In this lesson, you will master:

  1. Interactive Flashcards: Rapid-fire testing on CAP, PACELC, and Sharding mechanics.
  2. Scaling Cheat Sheet: A consolidated hardware-first reference for choosing replication and hashing strategies.
  3. Scenario Quiz: Real-world decision-making for banking, social media, and e-commerce architectures.

1. Interactive Flashcards

Test your knowledge: try to answer each prompt below before reading the details.


2. Scaling Cheat Sheet

Scaling Decision Tree (Elite Standard)

Ask the following questions in order:

  1. Does it fit on one machine? → Yes? Vertical Scaling (Scale Up).
  2. Is it Read-Heavy? → Yes? Read Replicas (Scale Out Reads).
  3. Is it Write-Heavy (>50k RPS)? → Yes? Sharding or Sharded Counters.
  4. Is it Multi-Region? → Yes? CRDTs (G-Counter/PN-Counter) or TrueTime.
  5. Is Partition Tolerance Mandatory? → Yes? PACELC (Choose Latency vs Consistency).
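The decision tree above can be sketched as a simple function. The argument names and the 50k-RPS threshold are illustrative, mirroring the list, not a library API:

```python
def choose_scaling_strategy(fits_on_one_machine: bool,
                            read_heavy: bool,
                            write_rps: int,
                            multi_region: bool) -> str:
    """Walk the scaling decision tree in order; first match wins."""
    if fits_on_one_machine:
        return "Vertical Scaling (Scale Up)"
    if read_heavy:
        return "Read Replicas (Scale Out Reads)"
    if write_rps > 50_000:                      # write-heavy threshold from the tree
        return "Sharding / Sharded Counters"
    if multi_region:
        return "CRDTs (G-Counter/PN-Counter) or TrueTime"
    return "PACELC: choose Latency vs Consistency"
```

Note the ordering matters: a workload that is both read-heavy and multi-region resolves to read replicas first, because cheaper options are tried before more complex ones.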

Scaling Patterns

| Pattern | Concept | Use Case | Trade-off |
| --- | --- | --- | --- |
| Vertical (Scale Up) | Bigger Hardware | Startups, Monoliths | NUMA Bottleneck, High Cost |
| Horizontal (Scale Out) | More Nodes | Big Tech, Stateless Apps | Network Overhead, Complexity |
| Sharding | Partition Data | Massive DBs (>5 TB) | Hot Partitions, No Joins |
| Virtual Buckets | Indirection Layer | Couchbase, Cassandra | Complex Mapping Logic |

Theorems & Models

| Model | Key Idea |
| --- | --- |
| CAP | Pick at most two of Consistency, Availability, Partition Tolerance. During a partition you must tolerate P, so the real choice is C vs A. |
| PACELC | Extends CAP: Else (no partition, system healthy) → choose Latency vs Consistency. |
| Spanner TrueTime | Uses atomic clocks and GPS to bound the clock-uncertainty window (typically <7 ms). |
| Quorum | R + W > N guarantees read/write overlap (Strong Consistency). |
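The quorum rule in the table can be verified by brute force: enumerate every possible write quorum and read quorum and check that they always share at least one replica. This is a hypothetical helper for intuition, not a production check:

```python
from itertools import combinations

def quorum_overlap(n: int, w: int, r: int) -> bool:
    """True iff every write quorum of size w intersects every read quorum of size r."""
    replicas = range(n)
    return all(set(wq) & set(rq)
               for wq in combinations(replicas, w)
               for rq in combinations(replicas, r))

# N=3, W=2, R=2: R + W > N, so every read is guaranteed to see the latest write.
print(quorum_overlap(3, 2, 2))   # True
# N=3, W=1, R=1: R + W <= N, so a read can land on a replica the write missed.
print(quorum_overlap(3, 1, 1))   # False
```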

Replication & Consistency

| Type | Speed | Durability | Cons |
| --- | --- | --- | --- |
| Chain Replication | Medium | High (all nodes ack) | High tail latency |
| Async | Fast | Low (risk of data loss) | Replication lag |
| Read Repair | N/A | High (self-healing) | Extra read overhead |
| Sloppy Quorum | Fast | Medium (hinted handoff) | Possible data loss |
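Read Repair's "self-healing" behavior can be sketched with an in-memory model. This is an illustrative toy, assuming each replica is a dict mapping a key to a `(version, value)` pair, not any particular database's implementation:

```python
def read_with_repair(replicas, key):
    """Read from all replicas, return the newest value, and write it
    back to any replica holding a stale or missing version."""
    # Missing keys are treated as version 0.
    versions = [r.get(key, (0, None)) for r in replicas]
    newest_version, newest_value = max(versions, key=lambda v: v[0])
    for r in replicas:
        if r.get(key, (0, None))[0] < newest_version:
            r[key] = (newest_version, newest_value)  # the "repair" write-back
    return newest_value
```

The extra write-backs on the read path are exactly the "Extra read overhead" listed in the table: durability improves over time, but hot keys pay for it on every read until replicas converge.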

3. Scenario Quiz

4. Hardware-First Scaling Checklist

Before you shard or replicate, verify these physical limits:

  • I/O Queue Depth: Is your range-based sharding saturating the SSD controller (Hotspot)?
  • NIC Bandwidth: Will synchronous replication consume more than 50% of your 10Gbps/100Gbps physical link?
  • CPU Cache Locality: Are your Consistent Hashing VNodes small enough to fit in the L2 cache for microsecond routing?
  • NUMA Boundaries: Have you hit the memory bridge bottleneck on your vertical scaling target?
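The VNode routing table from the checklist above can be tiny. This is a minimal sketch of a consistent-hash ring with virtual nodes, assuming MD5 placement and 8 vnodes per physical node (both arbitrary choices for illustration); with 3 nodes the whole ring is 24 integers, easily L2-resident:

```python
import hashlib
from bisect import bisect_right

def h(key: str) -> int:
    """Hash a string to a point on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes_per_node=8):
    """Place vnodes_per_node virtual points per physical node, sorted by position."""
    return sorted((h(f"{node}#{i}"), node)
                  for node in nodes for i in range(vnodes_per_node))

def route(ring, key: str) -> str:
    """Find the first vnode clockwise from the key's hash (wrapping at the end)."""
    points = [p for p, _ in ring]
    idx = bisect_right(points, h(key)) % len(ring)
    return ring[idx][1]
```

Routing is a binary search over a sorted array, so adding or removing a node only remaps the keys owned by that node's vnodes, which is why vnodes spread rebalancing load evenly.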

5. Staff Engineer Challenge: The “Global Clock” Dilemma

The Scenario: You are building a high-frequency trading platform across NY and London.

  • The Target: Strong Consistency across regions.
  • The Constraint: The speed of light (RTT is ~60ms).

The Question: If you use a standard QUORUM (majority) write across regions, your p99 latency is at least ~60ms (one transatlantic round trip). How would you redesign the Hardware/Software layer to achieve “Simulated CA” behavior like Google Spanner?

Hint: Think about Atomic Clocks (TrueTime) and how “Commit Wait” allows you to trade a few milliseconds of local CPU sleep for global consistency without a 2PC (Two-Phase Commit) lock.


6. 🔗 Next Steps

🎉 Module Complete: Data Scaling

You have mastered Data Scaling. Now, let's move to Module 8: Messaging & Async Communication to learn how to decouple these systems using Queues and Event Streams.

Start Module 8