Module Review: Load Balancing

🧠 Flashcards

L4 Load Balancing

Transport Layer

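Routes on IP address and port only. Fast but content-blind: it does not decrypt TLS (TCP passthrough). High-performance implementations such as Katran use eBPF/XDP to process packets before they reach the kernel network stack.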

L7 Load Balancing

Application Layer

Routes on URL, headers, and cookies. Flexible but CPU-intensive: inspecting HTTPS traffic requires TLS termination (decryption) at the load balancer.
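
As a rough illustration of content-based routing, the sketch below picks a backend by longest matching path prefix. The `/api` and `/payment` prefixes come from the quiz scenario in this module; the backend names and the `route` helper are hypothetical, and a real L7 load balancer would also consider headers and cookies.

```python
# Hypothetical routing table: path prefix -> backend pool name.
ROUTES = {
    "/api": "service-a",
    "/payment": "service-b",
}


def route(path, routes=ROUTES, default="default-backend"):
    """Return the backend for the longest prefix that matches `path`."""
    best = None
    for prefix in routes:
        # Match the prefix exactly or as a whole path segment,
        # so "/apix" does not match "/api".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else default
```

For example, `route("/api/users")` selects the hypothetical `service-a` pool, while an unmatched path falls through to the default.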

Maglev Hashing

Google's Consistent Hashing

Precomputes a large lookup table from per-backend permutations, so choosing a backend for a packet is a single O(1) table read. Typically spreads load more evenly than ring hashing at scale.
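
The table population step can be sketched as follows. This is a simplified illustration, not Google's implementation: the hash helper `_h`, the salts, and the small table size `m=251` are assumptions chosen for readability (the table size must be prime, and production tables are much larger).

```python
import hashlib


def _h(data, salt):
    """Hypothetical hash helper: salted SHA-256 truncated to 64 bits."""
    digest = hashlib.sha256((salt + data).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def build_table(backends, m=251):
    """Fill a lookup table of prime size m with backend indices."""
    n = len(backends)
    # Each backend gets its own permutation of the m slots,
    # defined by an offset and a skip value.
    offsets = [_h(b, "offset") % m for b in backends]
    skips = [_h(b, "skip") % (m - 1) + 1 for b in backends]
    next_idx = [0] * n
    table = [-1] * m
    filled = 0
    while True:
        # Backends take turns claiming their next preferred free slot.
        for i in range(n):
            slot = (offsets[i] + next_idx[i] * skips[i]) % m
            while table[slot] >= 0:
                next_idx[i] += 1
                slot = (offsets[i] + next_idx[i] * skips[i]) % m
            table[slot] = i
            next_idx[i] += 1
            filled += 1
            if filled == m:
                return table


def lookup(table, backends, key):
    """O(1) lookup: hash the key to a slot and read the backend."""
    return backends[table[_h(key, "lookup") % len(table)]]
```

Because backends claim slots in turns, each one ends up with an almost equal share of the table.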

Active-Passive

High Availability

One LB handles all traffic while the other stands by. If the active LB stops sending heartbeats, the passive one takes over (VRRP/Keepalived).
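
The failover decision can be pictured with a toy heartbeat monitor running on the standby. This is only a sketch: the `StandbyMonitor` class and its 3-second timeout are hypothetical, and real VRRP adds priorities, advertisement intervals, and virtual IP takeover that are not modeled here.

```python
class StandbyMonitor:
    """Runs on the passive LB and promotes it when heartbeats stop."""

    def __init__(self, start_time, timeout_s=3.0):
        self.timeout_s = timeout_s
        self.last_heartbeat = start_time
        self.role = "passive"

    def on_heartbeat(self, now):
        # Called whenever a heartbeat from the active LB arrives.
        self.last_heartbeat = now

    def tick(self, now):
        # Promote once the active LB has been silent for too long.
        # A real system would claim the virtual IP at this point.
        if self.role == "passive" and now - self.last_heartbeat > self.timeout_s:
            self.role = "active"
        return self.role
```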

Connection Pooling

Latency Optimization

The LB keeps connections to the backends open (Keep-Alive) and reuses them, so each request avoids the cost of a new TCP three-way handshake.
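
A minimal pool might look like the sketch below. The `ConnectionPool` class and its `factory` parameter are hypothetical; `factory` stands in for whatever opens a real backend connection, and health checking of idle connections is left out.

```python
import queue


class ConnectionPool:
    """Keeps up to `size` idle connections around for reuse."""

    def __init__(self, factory, size):
        self._factory = factory            # callable that opens a new connection
        self._idle = queue.LifoQueue(maxsize=size)
        self.created = 0                   # how many handshakes we actually paid for

    def acquire(self):
        try:
            # Reuse an idle connection if one is available.
            return self._idle.get_nowait()
        except queue.Empty:
            self.created += 1
            return self._factory()

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)
        except queue.Full:
            # Pool is full: close the surplus connection if it supports close().
            close = getattr(conn, "close", None)
            if close is not None:
                close()
```

With this sketch, two sequential requests that each acquire and release a connection trigger only one call to `factory`.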

Consistent Hashing

Scaling Strategy

A hash-ring strategy that minimizes how many keys move when servers are added or removed. Crucial for distributed caches.
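
One way to see the "minimal movement" property is a small ring with virtual nodes. The `HashRing` class, the MD5-based `_hash` helper, and the default of 100 virtual nodes per server are illustrative assumptions, not a reference implementation.

```python
import bisect
import hashlib


def _hash(key):
    """Hypothetical hash helper: MD5 truncated to 64 bits."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each server is placed on the ring at many virtual positions.
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def get(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_left(self._points, (_hash(key), ""))
        return self._points[i % len(self._points)][1]
```

When a server is added, the only keys that change owner are the ones the new server takes over; every other key stays where it was.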

Sidecar Proxy

Service Mesh

A proxy deployed alongside every service instance (e.g., Envoy). Handles mTLS, retries, and observability.

Peak EWMA

Stability Metric

Peak Exponentially Weighted Moving Average of latency. Used by Linkerd to steer traffic away from slow servers: the estimate jumps immediately on a latency spike and decays gradually afterward.
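
A simplified sketch of one common formulation is shown below: a slow response raises the estimate to the observed peak at once, while faster responses only pull it down gradually. The `PeakEwma` class, the 10-second decay window, and the `cost * (pending + 1)` score are assumptions for illustration and may differ from what any particular proxy does.

```python
import math


class PeakEwma:
    """Per-backend latency estimate that is sensitive to peaks."""

    def __init__(self, decay_s=10.0):
        self.decay_s = decay_s   # assumed decay window, in seconds
        self.cost = 0.0          # current latency estimate, in seconds
        self.stamp = 0.0         # time of the last observation
        self.pending = 0         # requests currently in flight

    def observe(self, rtt_s, now):
        elapsed = max(now - self.stamp, 0.0)
        self.stamp = now
        weight = math.exp(-elapsed / self.decay_s)
        if rtt_s > self.cost:
            # Peak: jump straight to the slow measurement.
            self.cost = rtt_s
        else:
            # Otherwise decay toward the new measurement.
            self.cost = self.cost * weight + rtt_s * (1.0 - weight)

    def score(self):
        # Lower is better; in-flight requests make a backend less attractive.
        return self.cost * (self.pending + 1)
```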

QUIC (HTTP/3)

UDP Protocol

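Modern transport protocol running over UDP. Challenges L4 LBs because connections are identified by Connection IDs (CIDs) rather than the IP/port tuple, so the LB must track CIDs to keep a connection on the same backend when the client's address changes.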
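
The difference can be sketched by comparing the two routing keys. The helper names and the hash-modulo selection are assumptions for illustration; production QUIC load balancers use more elaborate schemes.

```python
import hashlib


def _pick(key, backends):
    """Hypothetical helper: hash a bytes key onto a backend list."""
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def route_by_tuple(src_ip, src_port, dst_ip, dst_port, backends):
    # Classic L4 routing: the result changes if the client's IP or port changes.
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return _pick(key, backends)


def route_by_cid(connection_id, backends):
    # QUIC-aware routing: the Connection ID survives client address changes.
    return _pick(connection_id, backends)
```

A client that moves from Wi-Fi to cellular gets a new source address, so `route_by_tuple` may send it elsewhere, while `route_by_cid` keeps returning the same backend for the same CID.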

TLS Fingerprinting

Security (JA3)

Identifying clients (e.g., bots vs. browsers) by analyzing the parameters of their TLS ClientHello message, such as the offered cipher suites and extensions.
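
Assuming the ClientHello fields have already been parsed into integers, a JA3 fingerprint can be sketched as below: the fields are joined into a comma-separated string (values within a field joined by dashes), GREASE placeholder values are skipped, and the string is hashed with MD5. The function name and argument names are hypothetical, and the parsing of the handshake itself is not shown.

```python
import hashlib

# GREASE placeholder values (0x0A0A, 0x1A1A, ..., 0xFAFA) are ignored by JA3.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, md5_hex) for already-parsed ClientHello fields."""

    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    ja3_string = ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])
    return ja3_string, hashlib.md5(ja3_string.encode()).hexdigest()
```

Two clients that send the same User-Agent but different cipher or extension lists produce different fingerprints, which is what makes the technique useful against spoofing.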

SNI

Server Name Indication

Lets L4 load balancers read the requested hostname from the TLS ClientHello, where it is sent unencrypted, without decrypting the session.

Thundering Herd

Concurrency Problem

When many processes wake up (or reconnect) simultaneously to handle an event, overwhelming the system. Mitigated by adding jitter (randomized delays).
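
A common way to add jitter is to randomize each client's retry delay within an exponentially growing window. The sketch below uses the "full jitter" variant; the function name and the default base and cap values are assumptions.

```python
import random


def backoff_with_jitter(attempt, base_s=0.1, cap_s=30.0, rng=random):
    """Return a random delay in [0, min(cap_s, base_s * 2**attempt)]."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    # Each client draws its own delay, so reconnects spread out over time
    # instead of arriving in one synchronized burst.
    return rng.uniform(0.0, ceiling)
```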

GSLB

Global Server Load Balancing

Distributing traffic across data centers worldwide using DNS (GeoDNS) or Anycast (BGP) to reduce latency.

Bounded Load

Consistent Hashing Optimization

A technique to prevent hot shards: when a node is over its load cap, the request is passed to the next node on the ring instead.
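
The idea can be sketched by giving each node a capacity of roughly `factor` times the average load and walking the ring past any node that is already at capacity. The `BoundedRing` class, the 1.25 factor, and the hash helper are illustrative assumptions.

```python
import bisect
import hashlib
import math


def _hash(key):
    """Hypothetical hash helper: MD5 truncated to 64 bits."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class BoundedRing:
    def __init__(self, nodes, factor=1.25, vnodes=50):
        self.factor = factor                      # allowed load relative to average
        self.load = {n: 0 for n in nodes}         # current load per node
        self._points = sorted(
            (_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )

    def assign(self, key):
        # Capacity is based on the total load including this request.
        total = sum(self.load.values()) + 1
        cap = math.ceil(self.factor * total / len(self.load))
        start = bisect.bisect_left(self._points, (_hash(key), ""))
        for step in range(len(self._points)):
            node = self._points[(start + step) % len(self._points)][1]
            if self.load[node] < cap:
                self.load[node] += 1
                return node
            # Node is at capacity: fall through to the next point on the ring.
        raise RuntimeError("no node has spare capacity")  # not expected for factor >= 1

    def release(self, node):
        self.load[node] -= 1
```

Even if every request carries the same key, no node ends up with more than about `factor` times the average load.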


📝 Scenario Quiz

1. You are designing a video streaming service (similar to Netflix). You need maximum throughput for video chunks. Which type of LB do you choose?

2. You have a Microservices architecture where `/api` goes to Service A and `/payment` goes to Service B. Which LB is required?

3. Your backend servers have varying hardware specs (some fast, some slow). Which algorithm is BEST?

4. You need to process 10M packets per second for a DDoS scrubber. The standard Linux kernel network stack is too slow. What technology do you use?

5. You want to detect if a client is a Bot or a real Chrome browser, even if they spoof the User-Agent. What technique helps?


📋 Cheat Sheet

L4 vs L7

| Feature | L4 Load Balancer | L7 Load Balancer |
|---|---|---|
| Layer | Transport (TCP/UDP) | Application (HTTP) |
| Visibility | IP & Port (Envelope) | URL, Headers, Body (Content) |
| Speed | Ultra High (eBPF) | Slower (CPU Intensive) |
| Decryption | No (Pass-through) | Yes (SSL Termination) |
| Caching | Impossible | Possible (Static Files) |

Concepts

| Concept | Definition |
|---|---|
| SPOF | Single Point of Failure. If the LB dies, the site dies. |
| Sticky Session | Ensuring a user’s requests always go to the same server (via IP Hash or Cookie). |
| Maglev | Google’s Consistent Hashing algorithm for O(1) lookups. |
| Least Conn | Routes each request to the server with the fewest active connections. |
| P2C | Power of Two Choices. Pick 2 random servers and choose the less loaded one. O(1) per decision. |
| Peak EWMA | Peak Exponentially Weighted Moving Average. Reacts quickly to latency spikes. |
| Active-Passive | High Availability setup where a backup LB takes over if the primary fails. |
| Sidecar Proxy | A helper proxy (Envoy) that runs alongside a service to handle network logic. |
| Connection Pooling | Reusing persistent TCP connections to avoid handshake overhead. |
| GSLB | Global Server Load Balancing. Using DNS or Anycast to route users to the closest datacenter. |
| Bounded Load | Consistent Hashing optimization to prevent hot shards. |
| JA3 | TLS Fingerprinting standard used to identify the client application (e.g., bot vs browser). |
| QUIC | New UDP-based protocol (HTTP/3) that improves performance but complicates L4 load balancing. |
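
The P2C entry above can be sketched in a few lines. The `pick_p2c` name and the `loads` mapping (server name to active connection count) are hypothetical; any load signal could be used in place of connection counts.

```python
import random


def pick_p2c(loads, rng=random):
    """Sample two distinct servers at random and return the less loaded one."""
    a, b = rng.sample(list(loads), 2)
    return a if loads[a] <= loads[b] else b
```

Each decision inspects only two servers, so the cost stays constant no matter how large the pool grows.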

Technology Choice

| Tool | Best For |
|---|---|
| Nginx | General purpose web server, Static files, Simple L7 LB. |
| HAProxy | High performance, pure LB. Best for massive scale TCP/HTTP. |
| Envoy | Service Mesh (Sidecar). Observability, Distributed Tracing. |
| Traefik | Kubernetes/Docker Ingress. Auto-discovery. |
| Katran | Facebook’s eBPF-based L4 Load Balancer. |

🏗️ Whiteboard Summary

1. The Problem

  • Vertical Scaling fails (Kitchen Fire).
  • DNS Round Robin fails (Caching).
  • Need a Single VIP entry point.

2. Architecture

  • L4: Fast, Encrypted, Dumb.
  • L7: Smart, Decrypted, Slow.
  • Active-Passive: For HA.

3. Algorithms

  • Round Robin: Simple.
  • Least Conn: Variable workloads.
  • P2C: Hyperscale (O(1)).
  • Maglev: Google Scale.
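
The two simplest algorithms in this list can be sketched as follows. The function names and the `active` mapping (server name to open connection count) are illustrative assumptions.

```python
import itertools


def round_robin(servers):
    """Return an endless iterator that hands out servers in order."""
    return itertools.cycle(servers)


def least_conn(active):
    """Return the server with the fewest active connections."""
    return min(active, key=active.get)
```

Round Robin ignores how busy each server is, which is why Least Conn tends to suit workloads where request durations vary widely.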

4. Optimization

  • Health Checks: Deep vs Shallow.
  • Conn Pooling: Reduce Handshakes.
  • Draining: Zero Downtime Deploy.
  • Security: WAF & JA3.