Most engineers treat the network as a “black box” that just works. Staff engineers treat it as a resource to be tuned. How you handle congestion and connection overhead determines your system’s “Floor” latency.


1. TCP Tuning: The “Slow Start” Problem

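Every new TCP connection starts in Slow Start. It begins with a small “Congestion Window” (CWND), typically 10 segments on modern Linux, and grows it by one segment for every acknowledgement (ACK) received. The effect is that the window roughly doubles once per round trip, until a loss occurs or the slow-start threshold is reached.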
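To make the cost concrete, here is a minimal sketch that counts the round trips Slow Start needs to deliver a payload. It is a simplified model, not the kernel's behavior: it assumes an initial window of 10 segments, a 1460-byte MSS, no packet loss, no delayed ACKs, and a receiver window that never limits the sender. The function name and parameters are made up for illustration.

```python
def rtts_to_deliver(payload_bytes, mss=1460, initial_cwnd_segments=10):
    """Count round trips needed to deliver a payload under idealized slow start.

    Assumptions (illustrative only): no loss, no delayed ACKs, and the
    slow-start threshold is never reached.
    """
    segments_left = -(-payload_bytes // mss)  # ceiling division
    cwnd = initial_cwnd_segments
    rtts = 0
    while segments_left > 0:
        segments_left -= cwnd  # one full window is sent per round trip
        cwnd *= 2              # +1 segment per ACK doubles the window each RTT
        rtts += 1
    return rtts


for size in (1_000, 100_000, 1_000_000):
    print(f"{size:>9} bytes -> {rtts_to_deliver(size)} RTT(s)")
```

Under these assumptions a small response fits in the first window, while a 1 MB response needs several round trips before the window is large enough. Those round trips are paid again on every fresh connection.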

Staff Pitfall: Connection Overhead

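If your API returns a 1KB JSON but connection setup (the TCP handshake, plus TLS where used) and the request itself take 3 Round Trips (RTTs), then on a path with a 100ms RTT your “Network Latency” will be 300ms even if your database responds in 1ms. A 1KB response fits in the initial window, so Slow Start adds nothing here, but larger responses pay extra round trips on top.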

The Fixes:

  • Keep-Alives: Reuse existing connections to avoid the handshake and slow-start penalty.
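  • TCP Fast Open (TFO): Allows data to be sent inside the very first SYN packet on repeat connections, once the server has issued the client a TFO cookie.
  • Kernel Tuning: tcp_tw_reuse lets new outbound connections reuse sockets sitting in TIME_WAIT, which keeps a busy client or proxy from running out of ephemeral ports. Combined with a wider ip_local_port_range, higher file-descriptor limits, and tuned buffer sizes (tcp_rmem, tcp_wmem), a single machine can handle 100k+ concurrent connections.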
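The keep-alive point can be checked with a small local experiment. The sketch below starts a throwaway HTTP/1.1 server on localhost and counts how many TCP connections it accepts when a client reuses one connection versus opening a new one per request. The `CountingServer` and `count_connections` names are hypothetical helpers written for this example. It uses only the Python standard library.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections open by default

    def do_GET(self):
        body = b'{"ok": true}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


class CountingServer(ThreadingHTTPServer):
    daemon_threads = True
    accepted = 0  # number of TCP connections accepted so far

    def get_request(self):
        conn = super().get_request()
        self.accepted += 1
        return conn


def count_connections(reuse, n_requests=5):
    """Send n_requests GETs and return how many TCP connections the server saw."""
    server = CountingServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if reuse:
            conn = http.client.HTTPConnection(host, port, timeout=5)
            for _ in range(n_requests):
                conn.request("GET", "/")
                conn.getresponse().read()  # drain the body before reusing
            conn.close()
        else:
            for _ in range(n_requests):
                conn = http.client.HTTPConnection(host, port, timeout=5)
                conn.request("GET", "/")
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return server.accepted


print("with keep-alive:   ", count_connections(reuse=True))
print("without keep-alive:", count_connections(reuse=False))
```

On localhost the handshake is nearly free, so the latency difference is small. The connection count is what matters: over a real network, each extra connection costs at least one RTT, plus a fresh Slow Start.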

2. BBR: Google’s Congestion Control

Historically, TCP used “Loss-based” congestion control (like Reno and CUBIC). It assumed that if a packet was lost, the network was full, so it cut its congestion window sharply: by 50% in Reno, and by about 30% in CUBIC.
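To illustrate why this hurts when losses are not caused by congestion, here is a rough sketch of a Reno-style additive-increase, multiplicative-decrease loop. It is a toy model, not a real TCP implementation: the window grows by one segment per round trip and is multiplied by a hypothetical `beta` factor whenever a loss is scheduled. The loss rounds are supplied by hand rather than simulated.

```python
def aimd_trace(rounds, loss_rounds, start_cwnd=10.0, beta=0.5):
    """Return the congestion window (in segments) after each round trip.

    loss_rounds: set of round indexes in which a loss is detected.
    beta: multiplicative-decrease factor (0.5 models Reno-style halving).
    """
    cwnd = start_cwnd
    trace = []
    for r in range(rounds):
        if r in loss_rounds:
            cwnd = max(1.0, cwnd * beta)  # back off on loss
        else:
            cwnd += 1.0                   # additive increase per RTT
        trace.append(cwnd)
    return trace


clean = aimd_trace(20, loss_rounds=set())
lossy = aimd_trace(20, loss_rounds={5, 10, 15})
print("segments sent without loss:", sum(clean))
print("segments sent with 3 random losses:", sum(lossy))
```

Summing the trace gives a rough measure of data sent. Three isolated losses in twenty round trips cut it by more than half in this model, even though the path itself never got slower.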

In the modern world (shallow buffers, high-speed fiber), packet loss doesn’t always mean congestion.

BBR (Bottleneck Bandwidth and Round-trip propagation time) is a “model-based” algorithm released by Google in 2016.

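  • How it works: It continuously estimates the bottleneck bandwidth (the maximum recent delivery rate) and the minimum RTT, then paces its sending to match that model. The original BBR does not treat an isolated packet loss as a congestion signal, so random loss does not force it to back off.
  • Impact: Gains depend heavily on the path. Figures of up to 20% higher throughput and 80% lower tail latency are quoted for messy networks (like mobile/cellular), so measure on your own traffic before relying on them.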
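The model behind BBR can be sketched in a few lines. The class below is a toy estimator, not the real BBR state machine: it keeps a sliding window of delivery-rate and RTT samples, takes the maximum of one and the minimum of the other, and multiplies them to get the bandwidth-delay product (BDP). The class name, method names, and the single sample-count window are simplifications assumed for illustration. Real BBR uses separate time-based windows and several probing phases.

```python
from collections import deque


class BottleneckModel:
    """Toy path model: windowed max bandwidth and windowed min RTT."""

    def __init__(self, window=10):
        self.bw_samples = deque(maxlen=window)   # bytes per second
        self.rtt_samples = deque(maxlen=window)  # seconds

    def on_ack(self, delivered_bytes, interval_s, rtt_s):
        # Note that packet loss is not an input to the model at all.
        self.bw_samples.append(delivered_bytes / interval_s)
        self.rtt_samples.append(rtt_s)

    @property
    def btl_bw(self):
        return max(self.bw_samples)

    @property
    def rt_prop(self):
        return min(self.rtt_samples)

    def bdp_bytes(self):
        # Amount of data that fills the pipe without building a queue.
        return self.btl_bw * self.rt_prop


model = BottleneckModel()
model.on_ack(delivered_bytes=312_500, interval_s=0.25, rtt_s=0.050)
model.on_ack(delivered_bytes=250_000, interval_s=0.25, rtt_s=0.040)
model.on_ack(delivered_bytes=312_500, interval_s=0.25, rtt_s=0.080)
print(f"bandwidth={model.btl_bw:.0f} B/s  min RTT={model.rt_prop * 1000:.0f} ms  "
      f"BDP={model.bdp_bytes():.0f} bytes")
```

The third sample has a much higher RTT, which suggests a queue is building somewhere on the path. Because the model keeps the minimum RTT, that sample does not inflate the estimate of how much data the path can hold.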

[!TIP] Enable BBR today. Most modern Linux kernels (4.9+) support it. Adding net.core.default_qdisc=fq and net.ipv4.tcp_congestion_control=bbr to your sysctl.conf is one of the highest-ROI changes you can make.
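A small helper can confirm that a sysctl file actually contains both settings from the tip. This is a sketch: `parse_sysctl` and `bbr_configured` are hypothetical names, and the parser handles only simple `key = value` lines and `#` or `;` comments.

```python
def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict of key -> value."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def bbr_configured(settings):
    """True when both settings from the tip are present."""
    return (
        settings.get("net.ipv4.tcp_congestion_control") == "bbr"
        and settings.get("net.core.default_qdisc") == "fq"
    )


example = """
# congestion control
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control = bbr
"""
print(bbr_configured(parse_sysctl(example)))
```

The file only describes the intended configuration. On a running Linux host, the active algorithm can be read from /proc/sys/net/ipv4/tcp_congestion_control.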


3. QUIC and HTTP/3: The End of TCP?

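TCP has a fundamental limitation for multiplexed traffic called Head-of-Line (HoL) Blocking: it delivers a single ordered byte stream, so one missing segment holds back everything queued behind it.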

Imagine an HTTP/2 connection sending 3 files (A, B, C) multiplexed over a single TCP connection.

  1. Packet from File A is lost.
  2. The kernel holds back the data for File B and C that arrived after the gap, and does not deliver it to the application until A’s packet is retransmitted.
  3. The Result: One slow packet kills the performance of the entire “multiplexed” connection.

The QUIC Solution

QUIC runs over UDP. It implements its own reliability logic but treats every “stream” (file) independently.

  • HoL Fixed: If a packet for File A is lost, File B and C continue to be processed.
  • 0-RTT Handshake: QUIC can resume a previous connection and send data in the first packet. 0-RTT data can be replayed by an attacker, so it should carry only idempotent requests.
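The difference between the two delivery models can be sketched with a small simulation. It is a simplified model under stated assumptions: each packet carries data for exactly one stream, arrival times are supplied by hand, and the `completion_times` function is a hypothetical helper, not part of any QUIC or TCP library. With connection-wide ordering (TCP), a packet is usable only after every earlier packet on the connection has arrived. With per-stream ordering (QUIC), it waits only for earlier packets of its own stream.

```python
def completion_times(packets, independent_streams):
    """Return {stream: time at which the stream is fully delivered to the app}.

    packets: list of (seq, stream, arrival_ms), where seq is the send order
             on the connection.
    independent_streams: False models one ordered byte stream (TCP),
                         True models per-stream ordering (QUIC).
    """
    done = {}
    for stream in sorted({s for _, s, _ in packets}):
        last_seq = max(seq for seq, s, _ in packets if s == stream)
        if independent_streams:
            # Only this stream's own packets can delay it.
            blocking = [t for _, s, t in packets if s == stream]
        else:
            # Every earlier packet on the connection can delay it.
            blocking = [t for seq, _, t in packets if seq <= last_seq]
        done[stream] = max(blocking)
    return done


# Files A, B, C interleaved; A's first packet is lost and its
# retransmission arrives at t=110 ms.
packets = [
    (0, "A", 110), (1, "B", 11), (2, "C", 12),
    (3, "A", 13), (4, "B", 14), (5, "C", 15),
]
print("TCP-like: ", completion_times(packets, independent_streams=False))
print("QUIC-like:", completion_times(packets, independent_streams=True))
```

In the TCP-like model all three files finish at 110 ms. In the QUIC-like model only File A waits for the retransmission, while B and C finish as soon as their own packets arrive.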

Staff Takeaway

You cannot build a high-performance system on default settings.

  • TCP is reliable but “blind” to modern network physics.
  • BBR is a mandatory upgrade for any high-traffic service.
  • HTTP/3 isn’t just a version bump; it’s a fundamental move to UDP to bypass the architectural limits of the 1970s networking stack.