Pub-Sub Pattern

[!NOTE] This module covers the core principles of the Pub-Sub pattern: how it differs from point-to-point queues, how exchanges route messages, and the trade-offs in filtering and delivery guarantees.

1. Beyond 1-to-1 Messaging

In a standard Message Queue (Point-to-Point), one message is processed by one consumer. This is great for work distribution (e.g., “Resize this image”).

But what if multiple services need to know about an event?

  • User Signs Up:
      • Email Service needs to send a welcome email.
      • Analytics Service needs to log the event.
      • Fraud Service needs to check the IP.

If we use a standard queue, each message is delivered to only one of these services. We need Pub-Sub.

2. The Pub-Sub Model

Publish-Subscribe decouples the sender (Publisher) from the receivers (Subscribers). Think of it like a Radio Station: The DJ (Publisher) broadcasts, and anyone tuned in (Subscribers) hears it.

Core Components

  1. Publisher: Sends a message to a Topic (not a specific queue).
  2. Topic (Exchange): The “Post Office” that decides where to route the message.
  3. Subscription: Services “subscribe” to the topic.
  4. Fanout: The broker copies the message to ALL subscribers.
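
To make the four components concrete, here is a minimal in-memory sketch. It is not a real broker: the `InMemoryBroker` class, the `user.signed_up` topic name, and the message fields are all hypothetical names chosen for illustration, and delivery happens synchronously inside `publish`.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy broker: every subscriber of a topic receives its own copy of each message."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Subscription: a service registers interest in a topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Fan the message out and return how many subscribers received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            # Shallow copy so one subscriber cannot mutate another's message.
            handler(dict(message))
        return len(handlers)


broker = InMemoryBroker()
received: List[str] = []

# Three independent services subscribe to the same topic.
broker.subscribe("user.signed_up", lambda m: received.append(f"email:{m['user']}"))
broker.subscribe("user.signed_up", lambda m: received.append(f"analytics:{m['user']}"))
broker.subscribe("user.signed_up", lambda m: received.append(f"fraud:{m['ip']}"))

# The publisher only knows the topic name, not who is listening.
delivered = broker.publish("user.signed_up", {"user": "alice", "ip": "203.0.113.7"})
print(delivered, received)
```

The publisher never references the email, analytics, or fraud handlers directly, which is the decoupling the radio-station analogy describes.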

3. Exchange Types Deep Dive

Not all Pub-Sub is the same. In AMQP (RabbitMQ), the “Topic” is actually called an Exchange. There are 4 main types:

| Exchange Type | Routing Logic | Use Case | Complexity |
|---|---|---|---|
| Direct | Exact match (key == binding) | Unicast routing (e.g., error.log → ErrorQueue) | O(1) (fast hash) |
| Fanout | Broadcasts to ALL queues. Ignores key. | “Mass Notification” (e.g., New User Signup) | O(1) (blind copy) |
| Topic | Pattern match (*, #) | Multicast routing (e.g., payment.us.*) | O(N) (string parsing) |
| Headers | Matches metadata headers | Complex routing logic beyond string keys | O(N) (header check) |

[!TIP] Performance: Fanout is the fastest because it copies messages to every bound queue without inspecting the routing key. Topic exchanges are slower because they must match the routing key against the binding patterns in the routing table (stored as a Trie).
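
The routing decisions for three of the exchange types can be sketched as plain functions. This is an illustration of the logic only, not RabbitMQ's implementation: the function names and data shapes are assumptions, and the `match_all` flag loosely mirrors the all/any modes of the `x-match` binding argument on headers exchanges. Topic matching is left out because it needs a pattern matcher.

```python
from typing import Dict, List, Tuple


def route_direct(index: Dict[str, List[str]], routing_key: str) -> List[str]:
    """Direct: one hash lookup on the exact routing key."""
    return list(index.get(routing_key, []))


def route_fanout(queues: List[str], routing_key: str) -> List[str]:
    """Fanout: every bound queue gets the message; the key is never inspected."""
    return list(queues)


def route_headers(
    bindings: List[Tuple[str, Dict[str, str]]],
    headers: Dict[str, str],
    match_all: bool = True,
) -> List[str]:
    """Headers: compare message headers with each binding's expected headers."""
    check = all if match_all else any
    return [
        queue
        for queue, wanted in bindings
        if check(headers.get(name) == value for name, value in wanted.items())
    ]


direct_index = {"error.log": ["ErrorQueue"]}
header_bindings = [("pdf_reports", {"format": "pdf", "type": "report"})]

print(route_direct(direct_index, "error.log"))
print(route_fanout(["email", "analytics", "fraud"], "ignored.key"))
print(route_headers(header_bindings, {"format": "pdf", "type": "log"}, match_all=False))
```

The direct lookup costs the same no matter how many bindings exist, while the headers check has to visit every binding, which matches the complexity column in the table.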


4. Interactive Demo: Topic Exchange & Wildcards

[Interactive demo: a Publisher (Payment Gateway) emits events to a TOPIC exchange, which routes them by Routing Key to three subscribers.]

  • Audit Log, subscribed to # (all events)
  • US Region, subscribed to payment.us.*
  • PagerDuty, subscribed to *.error

5. Why is this powerful?

1. Loose Coupling (Plug & Play)

If we need to add a new Recommendation Service next month, we don’t change the Payment Gateway code. We just add a new subscriber. The Publisher doesn’t know (or care) who is listening.

2. Parallel Processing

All subscribers receive the message simultaneously (or near simultaneously).

  • Email sent: 200ms
  • Analytics logged: 50ms
  • Fraud check: 100ms

Total Time: Max(200, 50, 100) = 200ms (Parallel), instead of 200 + 50 + 100 = 350ms (Sequential).
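
The arithmetic can be checked with a short sketch. The durations come from the example above; everything else is an assumption for illustration. In particular, `asyncio` tasks stand in for independent services, and the sleeps are scaled down by a hypothetical `SCALE` factor so the script finishes quickly.

```python
import asyncio
import time
from typing import Dict, List, Tuple

# Handler durations from the example, in milliseconds.
DURATIONS_MS: Dict[str, int] = {"email": 200, "analytics": 50, "fraud": 100}
SCALE = 0.01  # assumption: shrink the sleeps 100x to keep the demo fast


def sequential_ms(durations: Dict[str, int]) -> int:
    """One handler after another: the durations add up."""
    return sum(durations.values())


def parallel_ms(durations: Dict[str, int]) -> int:
    """All handlers at once: the slowest one sets the total."""
    return max(durations.values())


async def handle(name: str, duration_ms: int) -> str:
    await asyncio.sleep(duration_ms / 1000 * SCALE)  # simulated work
    return name


async def run_all(durations: Dict[str, int]) -> Tuple[List[str], float]:
    start = time.perf_counter()
    # gather() starts every handler concurrently and keeps the input order.
    done = await asyncio.gather(*(handle(n, d) for n, d in durations.items()))
    return list(done), time.perf_counter() - start


finished, elapsed_s = asyncio.run(run_all(DURATIONS_MS))
print(sequential_ms(DURATIONS_MS), parallel_ms(DURATIONS_MS), finished)
```

The measured `elapsed_s` depends on the machine, so the sketch reports only the computed totals and the list of handlers that finished.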

3. Fanout Cost Analysis

Be careful with Fanout. If you have 1 event and 100 subscribers, the broker must enqueue and deliver 100 copies.

  • CPU Cost: Low (it’s just a pointer copy).
  • Network Cost: High (100x bandwidth).
  • Storage Cost: High (if durable queues are used).
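
A back-of-the-envelope calculation shows how quickly the network cost grows. The function below is a rough sketch: `fanout_cost` and its parameters are hypothetical, the traffic figures are made up for illustration, and protocol overhead is ignored.

```python
from typing import Dict


def fanout_cost(events_per_sec: int, message_bytes: int, subscribers: int) -> Dict[str, int]:
    """Estimate broker traffic when every event is copied to every subscriber."""
    ingress = events_per_sec * message_bytes   # bytes/s arriving from publishers
    copies = events_per_sec * subscribers      # deliveries/s leaving the broker
    egress = copies * message_bytes            # bytes/s leaving the broker
    return {
        "copies_per_sec": copies,
        "ingress_bytes_per_sec": ingress,
        "egress_bytes_per_sec": egress,
        "amplification": egress // ingress if ingress else 0,
    }


# Illustrative numbers only: 1,000 events/s of 2 KiB each, 100 subscribers.
cost = fanout_cost(events_per_sec=1_000, message_bytes=2_048, subscribers=100)
print(cost)
```

With these inputs the broker receives about 2 MB/s but has to send about 205 MB/s, which is the 100x bandwidth multiplier from the list above.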

6. Filtering Strategies

Where should the filtering happen? The choice trades broker CPU against network bandwidth and consumer CPU.

A. Broker-Side Filtering (Efficient)

The broker (Exchange) decides which queue gets the message.

  • Mechanism: The Exchange checks the Routing Key against Binding Keys (using a Trie).
  • Example: RabbitMQ Topic Exchange.
  • Logic: If no queue has a binding that matches sys.debug, the broker discards the message immediately.
  • Pros: Saves network bandwidth and consumer CPU.
  • Cons: Broker works harder (higher CPU usage on the MQ server).

B. Consumer-Side Filtering (Wasteful)

The consumer receives everything and filters it in code.

  • Mechanism: All services listen to a “Firehose”.
  • Logic: if (msg.type != 'error') return;
  • Pros: Dumb broker (fast, high throughput).
  • Cons: Wastes bandwidth and consumer CPU. The consumer wakes up, deserializes the JSON payload, and then discards most messages.

[!TIP] Always prefer Broker-Side Filtering for high-volume systems to save bandwidth. Only send data to services that actually need it.
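
The difference between the two strategies shows up when you count how many messages cross the network. The sketch below is a simplified model with hypothetical function names and a synthetic stream in which one message in ten is an error. It counts messages rather than bytes.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]


def consumer_side(messages: List[Message], wanted: str) -> Tuple[int, List[Message]]:
    """Firehose: everything is transferred, then the consumer filters in code."""
    transferred = len(messages)
    processed = [m for m in messages if m["type"] == wanted]
    return transferred, processed


def broker_side(messages: List[Message], wanted: str) -> Tuple[int, List[Message]]:
    """The broker filters first, so only matching messages are transferred."""
    matching = [m for m in messages if m["type"] == wanted]
    return len(matching), matching


# Synthetic stream: 100 messages, every tenth one is an error.
stream = [{"id": str(i), "type": "error" if i % 10 == 0 else "info"} for i in range(100)]

firehose_sent, firehose_kept = consumer_side(stream, "error")
filtered_sent, filtered_kept = broker_side(stream, "error")
print(firehose_sent, filtered_sent, len(filtered_kept))
```

Both strategies hand the consumer the same 10 error messages, but the firehose moves 100 messages to get there while broker-side filtering moves only 10.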

7. Understanding Wildcards

Sometimes you don’t want all messages. You want to subscribe to a subset.

Wildcards (RabbitMQ Example)

  • Topic format: service.region.status
  • Wildcard * (Star): Matches exactly one word.
      • payment.us.* matches payment.us.success but NOT payment.us.db.error.
  • Wildcard # (Hash): Matches zero or more words.
      • payment.# matches payment.error, payment.us.success, and payment.debug.level.1.
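
The two wildcard rules can be expressed as a small recursive matcher over dot-separated words. This is a naive sketch for understanding the semantics, with hypothetical function names. A production broker would match against a trie of bindings rather than testing patterns one at a time.

```python
from typing import List


def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if a binding pattern with * and # matches the routing key."""
    return _match(pattern.split("."), routing_key.split("."))


def _match(pattern: List[str], words: List[str]) -> bool:
    if not pattern:
        # Pattern exhausted: match only if the key is exhausted too.
        return not words
    head, rest = pattern[0], pattern[1:]
    if head == "#":
        # '#' absorbs zero words (skip it) or one word (and stays active).
        return _match(rest, words) or (bool(words) and _match(pattern, words[1:]))
    if not words:
        return False
    if head == "*" or head == words[0]:
        # '*' consumes exactly one word; a literal must equal the word.
        return _match(rest, words[1:])
    return False


for key in ("payment.us.success", "payment.us.db.error", "payment.error"):
    print(key, topic_matches("payment.us.*", key), topic_matches("payment.#", key))
```

Because # can also match zero words, `payment.#` matches the bare key `payment` as well as every longer key that starts with it.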

8. Delivery Semantics

When doing Pub-Sub, you must decide your guarantee level:

| Semantics | Description | Pros | Cons |
|---|---|---|---|
| At-Most-Once | Fire and forget. Message might be lost. | Fastest. No state tracking. | Data loss possible. |
| At-Least-Once | Retries until ack received. | No data loss. | Duplicates (requires idempotency). |
| Exactly-Once | Hardest to achieve. Transactional. | Perfect consistency. | High latency, complex. |
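
At-Least-Once delivery means a subscriber can see the same message twice, so the handler has to be idempotent. One common approach is to deduplicate on a message ID, sketched below. It assumes every message carries a unique `id` field, and the `IdempotentConsumer` class is hypothetical. The seen-set lives in memory here, while a real service would need durable storage that survives restarts.

```python
from typing import Callable, Dict, List, Set

Message = Dict[str, str]


class IdempotentConsumer:
    """Wraps a handler so redelivered messages are processed only once."""

    def __init__(self, handler: Callable[[Message], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()  # assumption: in-memory only

    def on_message(self, message: Message) -> bool:
        """Return True if the handler ran, False if a duplicate was skipped."""
        message_id = message["id"]
        if message_id in self._seen:
            return False
        self._handler(message)
        # Record the ID only after success, so a failed attempt is retried.
        self._seen.add(message_id)
        return True


charges: List[str] = []
consumer = IdempotentConsumer(lambda m: charges.append(m["order"]))

payment = {"id": "msg-1", "order": "order-42"}
# The broker redelivers the same message, e.g. because an ack was lost.
outcomes = [consumer.on_message(payment), consumer.on_message(payment)]
print(outcomes, charges)
```

The second delivery is acknowledged but skipped, so the order is charged once even though the message arrived twice.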

9. Summary

  • Queues for 1-to-1 work distribution.
  • Pub-Sub for 1-to-Many notifications.
  • Topic Exchanges allow for powerful routing logic using Wildcards.
  • Prefer Broker-Side Filtering to avoid flooding consumers with irrelevant data.