A junior engineer says “We’re using HTTP/2, so we get connection reuse.” A Staff engineer asks: “What’s your load balancer doing with those multiplexed streams?”

Modern protocols like HTTP/2, gRPC, and WebSockets fundamentally change how traffic behaves under load balancing. Understanding the difference between connections, streams, and messages is critical for building low-latency services.


1. HTTP/2 Multiplexing & Head-of-Line Blocking

The Promise: One Connection, Many Streams

HTTP/1.1 handles one request at a time per TCP connection (keep-alive reuses the connection between requests, and pipelining has its own head-of-line issues). HTTP/2 multiplexes many concurrent streams over a single TCP connection.

Benefit: No connection setup overhead (3-way handshake, TLS negotiation) for every request.

The Hidden Cost: TCP Head-of-Line Blocking

Even though HTTP/2 streams are independent at the application layer, they ALL share one TCP connection. If one packet is lost, TCP stops delivering ALL streams until that packet is retransmitted.

Example:

  • Stream 1: Downloading a 10MB video
  • Stream 2: Fetching a 1KB JSON API response
  • If a packet from the video is lost, the JSON response is blocked until the video packet is recovered.
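The scenario above can be sketched as a toy model of TCP's in-order delivery: all streams share one ordered byte stream, so the receiver cannot hand any later data to the application until a lost packet is retransmitted. This is an illustration, not a real TCP implementation.

```python
# Toy model of TCP head-of-line blocking across HTTP/2 streams.
# Packets from all streams share one ordered TCP sequence; the
# receiver may only deliver data to the application in order.

def deliver(packets, lost_seq):
    """Deliver packets in sequence order; everything after a lost
    packet is buffered (blocked) until the loss is repaired."""
    delivered, blocked = [], []
    lost = False
    for seq, stream in packets:
        if seq == lost_seq:
            lost = True  # this packet was dropped on the wire
        if lost:
            blocked.append((seq, stream))   # buffered, not delivered
        else:
            delivered.append((seq, stream))
    return delivered, blocked

# Interleaved packets: video (stream 1) and a 1 KB JSON reply (stream 2).
packets = [(1, "video"), (2, "video"), (3, "video"), (4, "json")]
delivered, blocked = deliver(packets, lost_seq=3)
print(delivered)  # [(1, 'video'), (2, 'video')]
print(blocked)    # [(3, 'video'), (4, 'json')] -- JSON stuck behind video loss
```

Note that the JSON packet arrived intact; it is blocked purely because it sits behind the lost video packet in TCP's sequence space.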

[!IMPORTANT] This is why QUIC (HTTP/3) moved to UDP—each QUIC stream has independent loss recovery, eliminating head-of-line blocking at the transport layer.


2. gRPC Streaming & Long-Lived Connections

Connection Pinning

gRPC clients typically open a few long-lived connections and reuse them for many RPCs (via HTTP/2 multiplexing).

Problem for Load Balancers:

  • L4 load balancers balance connections, not streams.
  • If Client A opens 1 connection to Server 1, ALL of Client A’s RPCs go to Server 1, even if Server 2 is idle.

Example: 10 clients, each with 1 long-lived connection.

  • A connection-level LB might end up with 7 connections on Server 1 and 3 on Server 2 (e.g., after restarts or uneven connection churn).
  • Result: roughly 70% of traffic goes to Server 1, even though you have 2 servers. And even a perfect 5/5 split balances connections, not RPC volume.
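A small simulation makes the failure mode concrete: even when connections split evenly, per-client RPC volume varies, and every RPC is pinned to its connection's server. The client volumes below are made-up numbers for illustration.

```python
# Sketch: connection-level (L4) balancing vs. actual RPC load.
SERVERS = ["server1", "server2"]

# Hypothetical per-client RPC volumes: a few chatty clients dominate.
volumes = [200, 1, 200, 1, 200, 1, 1, 1, 1, 1]

conns = {s: 0 for s in SERVERS}  # connections per server
load = {s: 0 for s in SERVERS}   # RPCs per server

for i, rpcs in enumerate(volumes):
    server = SERVERS[i % 2]  # round-robin assigns each *connection* in turn
    conns[server] += 1
    load[server] += rpcs     # every RPC from this client follows the pin

print(conns)  # {'server1': 5, 'server2': 5}   -- connections look balanced
print(load)   # {'server1': 602, 'server2': 5} -- RPC load is wildly skewed
```

The LB's own metrics (connections per backend) look perfectly healthy here, which is exactly why this imbalance is easy to miss in production.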

The Fix: Client-Side Load Balancing

gRPC clients can use a lookaside protocol such as grpclb (now legacy) or xDS to:

  1. Query a control plane for the list of backend IPs.
  2. Open connections to multiple backends.
  3. Balance RPCs (not connections) across those backends.
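The three steps above can be sketched as a minimal client-side balancer: it holds a backend list (as if returned by a control plane) and picks a backend per RPC rather than per connection. The class name and backend addresses are illustrative, not a real gRPC API.

```python
from itertools import cycle

class ClientSideBalancer:
    """Minimal sketch: the *client* balances RPCs across backends it
    learned from a control plane (grpclb/xDS return such a list)."""

    def __init__(self, backends):
        # In real gRPC, each backend would be an open subchannel.
        self._picker = cycle(backends)  # simple round-robin picker

    def call(self, method):
        backend = next(self._picker)  # chosen per-RPC, not per-connection
        return f"{method} -> {backend}"

# Hypothetical backend IPs, as if resolved via the control plane.
lb = ClientSideBalancer(["10.0.0.1:50051", "10.0.0.2:50051"])
print([lb.call("GetUser") for _ in range(4)])
# RPCs alternate between the two backends
```

Because the client holds connections to every backend, an idle server is never starved the way it can be under connection-level L4 balancing.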

3. WebSockets & Sticky Sessions

WebSockets are full-duplex, long-lived connections that stay open for minutes to hours. Unlike HTTP requests, they can’t be easily “rebalanced” mid-flight.

The Session Affinity Problem

If a WebSocket connection is terminated (server restart, connection draining), the client must:

  1. Reconnect to the load balancer.
  2. Hope to land on the same server (if session state is in memory).

Without Session Affinity: Client reconnects and lands on Server B, which has no context. The user’s shopping cart or game state is lost.

With Session Affinity (Sticky Sessions): The load balancer uses a cookie or client IP hash to route the client back to the same server.

Trade-off: Sticky sessions reduce load balancing effectiveness (you can’t evenly distribute new connections if old ones are “stuck”).
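A common implementation of IP-hash affinity can be sketched as follows; server names are placeholders, and the trade-off mentioned above shows up directly: the modulus ties routing to the server count, so adding or removing a server remaps many clients (consistent hashing is the usual mitigation).

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]

def pick_server(client_ip):
    """Sticky routing: hash the client IP so reconnects land on the
    same server, as long as the server set does not change."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]

first = pick_server("203.0.113.7")
assert pick_server("203.0.113.7") == first  # reconnect -> same server
print(first)
```

Cookie-based affinity works the same way, except the LB writes the chosen server into a cookie on the first response instead of hashing the IP.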


4. Interactive: Connection & Stream Load Balancer Visualizer

See how protocol choice affects load distribution.



5. Protocol Selection Trade-offs

| Protocol | Connection Model | Best For | Load Balancing Challenge |
|---|---|---|---|
| HTTP/1.1 | One connection per in-flight request | Simple REST APIs | Many connections (resource overhead) |
| HTTP/2 | Few long-lived connections | Low-latency APIs | L4 LBs can't balance streams |
| gRPC | Few long-lived connections | Microservices (RPC) | Requires client-side LB or L7 LB |
| WebSockets | Long-lived, bidirectional | Real-time (chat, games) | Sticky sessions required |
| QUIC/HTTP/3 | UDP-based, per-stream loss recovery | Mobile, lossy networks | Still-maturing LB support |

Staff Takeaway

Protocol choice isn’t just about “features”—it fundamentally changes your operational model:

  • HTTP/2 requires stream-aware load balancing or client-side LB.
  • gRPC needs the xDS control plane or manual connection management.
  • WebSockets demand sticky sessions and connection draining strategies.

Understanding these nuances is the difference between theoretical “high performance” and actual p99 latency in production.