Now that you have the fundamentals down, it’s time to explore how to scale horizontally.

This module focuses on the macroscopic patterns for distributing requests and controlling traffic across a server fleet. These are the mechanisms that allow a service to grow from one server to a thousand without degrading the user experience.

Key topics include:

  • Load Balancing: Distributing requests across N servers using algorithms like Round Robin, Least Connections, and Weighted policies. Understanding L4 vs L7 balancing and modern data plane modes (DSR, Anycast).
  • Consistent Hashing: How DynamoDB, Cassandra, and CDNs distribute data across nodes without cascading rehashing when the fleet changes size. Advanced algorithms: Maglev (Google) and Rendezvous (highest-random-weight) hashing.
  • Rate Limiting & Backpressure: Protecting services from overload using Token Bucket, Leaky Bucket, and adaptive rate limiting. Propagating backpressure signals through the system.
  • Service Discovery & Mesh: How services find each other in dynamic fleets. Client-side vs server-side discovery, the xDS API, and the sidecar pattern (Envoy).
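To make one of these patterns concrete before diving into the chapters: the Token Bucket limiter mentioned above fits in a few lines. This is a minimal, single-threaded sketch (class and parameter names are illustrative, not from any particular library); tokens refill at a steady rate up to a burst capacity, and each request spends one token.

```python
import time

class TokenBucket:
    """Token Bucket rate limiter (illustrative sketch).

    Allows bursts up to `capacity` tokens, then throttles to a
    sustained `rate` of tokens per second.
    """

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start full
        self.clock = clock          # injectable for testing
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Lazily refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `rate=1, capacity=2`, two back-to-back requests succeed (the burst), a third is rejected, and after one second another token has accrued. The chapters below cover the production concerns this sketch omits: distributed counters, clock skew, and propagating rejection as backpressure.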

This module answers the question: “How do we distribute work fairly and prevent overload?”

Module Chapters