Real-Time Communication Strategies

[!TIP] Interview Tip: “How do you scale a Chat App?” is a trick question. The hard part isn’t storing messages (the Database); it’s routing them to the right user, who might be connected to a different server. Answer: Redis Pub/Sub (See Pub/Sub Pattern).

1. The Options

A. Short Polling (The “Are we there yet?” Kid)

Client asks every 2 seconds: “New msg?”

  • Pros: Simple. Works on everything.
  • Cons: High latency (up to one full polling interval), constant server load (full HTTP headers on every request), wastes mobile battery.
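
A minimal sketch of the pattern in TypeScript (browser `fetch`); the `/messages?since=` endpoint is a made-up placeholder:

```ts
// Short polling: ask on a fixed timer, whether or not anything changed.
let lastSeen = 0;

setInterval(async () => {
  try {
    const res = await fetch(`/messages?since=${lastSeen}`); // hypothetical endpoint
    const messages: { id: number; text: string }[] = await res.json();
    for (const m of messages) {
      console.log("new message:", m.text);
      lastSeen = Math.max(lastSeen, m.id);
    }
  } catch {
    // Network hiccup: just try again on the next tick.
  }
}, 2000); // every 2 seconds, a full HTTP request + headers each time
```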

B. Long Polling (The “Wait for it” approach)

Client asks “New msg?”. Server holds the connection open until data arrives (or timeout).

  • Pros: Near real-time delivery with far fewer wasted requests than short polling. Works over plain HTTP.
  • Cons: Still re-establishes a connection after every message (or timeout), so the per-message header overhead remains.
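
A sketch of the client loop, assuming a hypothetical `/poll` endpoint that the server holds open until data arrives or roughly 30 seconds pass:

```ts
// Long polling: the request itself "waits" on the server; we re-issue it
// as soon as it completes, so there is almost always one request in flight.
async function longPoll(): Promise<void> {
  while (true) {
    try {
      const res = await fetch("/poll");              // hangs until data or timeout
      if (res.status === 200) {
        console.log("new message:", await res.json());
      }
      // 204 (timeout, no data): fall through and immediately re-ask.
    } catch {
      await new Promise((r) => setTimeout(r, 1000)); // brief back-off on network errors
    }
  }
}

longPoll();
```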

C. WebSockets (The “Phone Call”)

A persistent, bi-directional connection over TCP, established by upgrading a normal HTTP request (the handshake).

  • Pros: Instant, low overhead (after handshake), Full Duplex (Send & Receive).
  • Cons: Stateful. If server crashes, connection dies. Hard to scale (Load Balancers need Sticky Sessions).
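
A minimal browser client; `wss://chat.example.com/ws` stands in for a real endpoint:

```ts
// One handshake up front, then the same connection carries every message
// in both directions.
const socket = new WebSocket("wss://chat.example.com/ws"); // placeholder URL

socket.addEventListener("open", () => {
  socket.send(JSON.stringify({ type: "join", room_id: "101" }));
});

socket.addEventListener("message", (event) => {
  console.log("server pushed:", JSON.parse(event.data));
});

socket.addEventListener("close", () => {
  // Stateful: if the server dies, the client must reconnect
  // (see the Thundering Herd warning below).
  console.log("connection lost");
});
```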

D. Server-Sent Events (SSE) (The “Radio”)

Server pushes data to the Client over a single long-lived HTTP connection. The Client cannot push back on that channel (it must use a regular request, e.g., POST).
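
A sketch of the browser side using the standard `EventSource` API; `/events` and `/messages` are placeholder paths:

```ts
// SSE: the browser keeps /events open and auto-reconnects if it drops.
const source = new EventSource("/events");

source.onmessage = (event) => {
  console.log("server pushed:", event.data);
};

// Going the other way means a normal HTTP request:
function sendMessage(text: string) {
  return fetch("/messages", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  });
}
```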

E. WebRTC (The “Walkie Talkie”)

Peer-to-Peer (P2P) communication directly between browsers (Audio/Video/Data).

  • Pros: Lowest latency (UDP). Offloads server bandwidth.
  • Cons: Complex setup (ICE, STUN, TURN). Hard to record/monitor.
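
A rough sketch of a data channel; the signaling exchange (where most of the complexity lives) is deliberately elided:

```ts
// WebRTC data channel between two browsers. The STUN server helps peers
// discover their public addresses; TURN relays are needed when P2P fails.
const pc = new RTCPeerConnection({
  iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
});

const channel = pc.createDataChannel("chat");
channel.onopen = () => channel.send("Hello, peer!");
channel.onmessage = (event) => console.log("peer says:", event.data);

async function startCall() {
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  // Ship `offer` to the other peer over your own signaling channel (not shown),
  // then apply their reply with pc.setRemoteDescription(answer).
}

startCall();
```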

Interactive Demo: Protocol Racer

See the difference in “Traffic Shape”.

  • Short Polling: Spammy. Lots of Red (Overhead).
  • WebSockets: One Green Line (Persistent).

2. Scaling WebSockets (The Hard Part)

WebSockets are Stateful.

  • User A connects to Server 1.
  • User B connects to Server 2.
  • User A sends “Hello”. Server 1 receives it.
  • Problem: Server 1 doesn’t know about User B. User B is on Server 2.

The Solution: Pub/Sub (Redis)

We need a “Message Bus” connecting all servers.

  1. Server 1 receives message from User A.
  2. Server 1 publishes to Redis channel room_1.
  3. Server 2 (subscribed to room_1) receives the event.
  4. Server 2 pushes message to User B.
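
A compressed sketch of that flow, assuming the `ioredis` npm client, with one file standing in for both servers:

```ts
import Redis from "ioredis";

// Two connections: a Redis connection in subscribe mode cannot publish.
const sub = new Redis(); // in reality this lives on Server 2
const pub = new Redis(); // ...and this one on Server 1

// Server 2: subscribes at startup, then pushes anything it hears
// to User B's local WebSocket (represented here by console.log).
sub.on("message", (channel, payload) => {
  console.log(`Server 2 got ${payload} on ${channel} -> push to User B`);
});

sub.subscribe("room_1").then(() => {
  // Server 1: User A's "Hello" arrives on its WebSocket; it just publishes.
  // (Only ordered like this because both halves share one demo file.)
  pub.publish("room_1", JSON.stringify({ u: "A", t: "Hello" }));
});
```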

2.5 The Load Balancer Trap: Sticky Sessions

If Client A connects to Server 1 via WebSocket, that connection is persistent. If the connection drops and Client A reconnects, the Load Balancer might send them to Server 2.

  • Problem: Server 2 doesn’t know who Client A is (Session data is on Server 1).
  • Solution: Sticky Sessions (Session Affinity). The LB hashes the Client IP (or sets an affinity cookie) and ensures that client always goes to the same server.
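
A toy sketch of what the LB is doing internally (real load balancers ship this as a built-in policy, e.g., nginx's ip_hash); the server list here is hypothetical:

```ts
import { createHash } from "node:crypto";

// Session Affinity by IP hash: the same client IP always maps to the same server.
const servers = ["server-1:8080", "server-2:8080", "server-3:8080"];

function pickServer(clientIp: string): string {
  const digest = createHash("md5").update(clientIp).digest();
  const bucket = digest.readUInt32BE(0) % servers.length; // deterministic, not random
  return servers[bucket];
}

console.log(pickServer("203.0.113.7")); // always the same server for this IP...
console.log(pickServer("203.0.113.7")); // ...no matter how often they reconnect
```

One trade-off worth knowing: a plain modulo reshuffles many clients whenever the server list changes, which is why consistent hashing is often used instead.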

Interactive Demo: Sticky vs Round Robin

  • Sticky OFF: Clients bounce between servers.
  • Sticky ON: Client Red always goes to Server 1. Client Blue always goes to Server 2.

[!WARNING] The Thundering Herd: When a server restarts, 1 Million connected WebSocket clients will instantly try to reconnect. This DDoS attack (by your own users) can take down your Auth Service and Load Balancer. Fix: Add a random “Jitter” (delay) to the client reconnection logic (e.g., reconnect in Random(0, 30) seconds).
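
A sketch of the client-side fix: random jitter on reconnect, plus a capped exponential backoff (a common extra safeguard, not required by the fix above). The URL is a placeholder:

```ts
// Spread reconnections across a window instead of hitting T=0 together.
function connect(attempt = 0): void {
  const socket = new WebSocket("wss://chat.example.com/ws"); // placeholder URL

  socket.addEventListener("open", () => {
    attempt = 0; // healthy again: reset the backoff
  });

  socket.addEventListener("close", () => {
    const base = Math.min(30_000, 1_000 * 2 ** attempt); // 1s, 2s, 4s ... capped at 30s
    const jitter = Math.random() * base;                 // the crucial part: randomize
    setTimeout(() => connect(attempt + 1), jitter);
  });
}

connect();
```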

Interactive Demo: The Thundering Herd

Simulate a server crash and reconnection strategy.

  • No Jitter: Everyone reconnects at T=0. Server Spikes to 100% CPU.
  • With Jitter: Reconnections spread out. Server stays stable.

Interactive Demo: Distributed Chat

Visualize the “Pub/Sub” flow.

  1. Alice is on Server A. Bob is on Server B.
  2. Type a message for Alice.
  3. Watch it travel: Alice -> Server A -> Redis -> Server B -> Bob.

System Walkthrough: The Life of a Chat Message

How does a message get from Alice to Bob, Charlie, and Dave? This is the Fanout pattern.

  1. Client (Alice): Sends JSON to Server A (Persistent WebSocket).
    { "type": "msg", "room_id": "101", "text": "Hello Everyone" }
    
  2. Server A: Does NOT look for Bob. It simply Publishes to Redis.
    PUBLISH room:101 '{"u":"Alice","t":"Hello Everyone"}'
    
  3. Redis: Fanout. It checks who is listening to room:101.
    • Server B is listening.
    • Server C is listening.
  4. Server B: Receives event. Checks its local WebSocket list.
    • “Ah, Bob is connected to me on Socket ID 99.” -> Pushes data to Bob.
  5. Server C: Receives event. Checks its local WebSocket list.
    • “Ah, Charlie is here.” -> Pushes data to Charlie.
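
A sketch of the receiving server's side (steps 4-5), assuming the `ws` and `ioredis` npm packages; real room membership would come from auth or a join message, so it is hard-coded here:

```ts
import Redis from "ioredis";
import { WebSocketServer, WebSocket } from "ws";

// The only state this server owns: which sockets are connected *to it*, per room.
const localRooms = new Map<string, Set<WebSocket>>();

const wss = new WebSocketServer({ port: 8080 });
wss.on("connection", (socket) => {
  const roomId = "101"; // hard-coded for the sketch
  if (!localRooms.has(roomId)) localRooms.set(roomId, new Set());
  localRooms.get(roomId)!.add(socket);
  socket.on("close", () => localRooms.get(roomId)?.delete(socket));
});

// Redis fanout: when room:101 fires, push only to the sockets this server owns.
const sub = new Redis();
sub.subscribe("room:101");
sub.on("message", (channel, payload) => {
  const roomId = channel.split(":")[1]; // "room:101" -> "101"
  for (const socket of localRooms.get(roomId) ?? []) {
    if (socket.readyState === WebSocket.OPEN) socket.send(payload);
  }
});
```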

3. Keeping it Alive: The Heartbeat

WebSockets are persistent. But if the WiFi drops silently, the Server might think the connection is open for hours (wasting resources).

  • The Solution: Application-Level Pings.
  • Client: Sends PING every 30s.
  • Server: Replies PONG.
  • If Server misses 3 PINGs -> Close Socket.
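
A server-side sketch of this policy, assuming the `ws` npm package and the JSON ping format above:

```ts
import { WebSocketServer, WebSocket } from "ws";

const PING_INTERVAL = 30_000; // the client promises a ping every 30s
const MAX_MISSED = 3;         // tolerate 3 missed pings before giving up

const lastSeen = new Map<WebSocket, number>();
const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (socket) => {
  lastSeen.set(socket, Date.now());

  socket.on("message", (raw) => {
    lastSeen.set(socket, Date.now()); // any traffic counts as proof of life
    let msg: { type?: string } = {};
    try { msg = JSON.parse(raw.toString()); } catch { /* ignore non-JSON frames */ }
    if (msg.type === "ping") socket.send(JSON.stringify({ type: "pong" }));
  });

  socket.on("close", () => lastSeen.delete(socket));
});

// Sweep once per interval: anyone silent for 3 intervals is assumed dead.
setInterval(() => {
  const cutoff = Date.now() - MAX_MISSED * PING_INTERVAL;
  for (const [socket, seen] of lastSeen) {
    if (seen < cutoff) socket.terminate(); // free the resources the dead client was holding
  }
}, PING_INTERVAL);
```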

Interactive Demo: The Heartbeat (Active Liveness)

If the client crashes silently (e.g., WiFi off), the Server thinks the connection is still open. We must send periodic PINGs.


4. The Future: WebTransport (HTTP/3)

WebSockets are built on TCP. This means they suffer from Head-of-Line Blocking (if one TCP packet is lost, every message behind it waits for the retransmission). WebTransport is the modern alternative built on HTTP/3 (QUIC).

Why WebTransport?

  1. Datagrams: You can send fire-and-forget UDP-like packets (great for gaming).
  2. Streams: You can open multiple reliable streams (like HTTP/2). If one stream stalls, others keep going.
  3. Single Handshake: It reuses the HTTP/3 connection. No separate TCP handshake.
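
A client-side sketch using the browser's WebTransport API; the URL is a placeholder and the server must speak HTTP/3:

```ts
async function main() {
  const transport = new WebTransport("https://game.example.com:4433/session"); // placeholder
  await transport.ready; // single QUIC handshake, no separate TCP setup

  // 1. Datagrams: fire-and-forget, may arrive out of order or not at all.
  const datagramWriter = transport.datagrams.writable.getWriter();
  await datagramWriter.write(new TextEncoder().encode("pos:10,42"));

  // 2. Streams: reliable and ordered, but independent of each other,
  //    so one stalled stream does not block the rest.
  const stream = await transport.createBidirectionalStream();
  const writer = stream.writable.getWriter();
  await writer.write(new TextEncoder().encode("chat: hello"));

  const reader = stream.readable.getReader();
  const { value } = await reader.read();
  if (value) console.log("server replied:", new TextDecoder().decode(value));
}

main();
```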

[!NOTE] Adoption: WebTransport is still new but rapidly gaining support for high-performance use cases (Cloud Gaming, Real-Time Trading).

Comparison Table

| Feature | Short Polling | WebSockets | SSE | WebRTC | WebTransport |
| --- | --- | --- | --- | --- | --- |
| Protocol | HTTP/1.1 | TCP | HTTP/1.1 | UDP/TCP | HTTP/3 (QUIC) |
| Direction | Client Pull | Bidirectional | Server Push | P2P (Bidirectional) | Bidirectional |
| Latency | High | Low | Low | Lowest (UDP) | Low (UDP/QUIC) |
| Complexity | Low | High (Stateful) | Medium | Very High | High |
| Use Case | Dashboards | Chat, Games | Notifications | Zoom/Video | Cloud Gaming, Trading |