Module Review: Producers

[!NOTE] This module reviews the core principles of Kafka producers: asynchronous sending, partitioning, reliability settings, and tuning.

1. Key Takeaways

  • Asynchronous Nature: The producer’s send() method is asynchronous. It appends each record to a batch in the RecordAccumulator and returns immediately; a background Sender thread transmits the batches.
  • Partitioning:
    • Key-Based: Hashes the key (Murmur2) so that records with the same key go to the same partition, which preserves ordering per key.
    • Sticky: For records without a key, sticks to one partition until the batch fills, improving throughput.
  • Reliability:
    • acks=all: The leader waits for all in-sync replicas (ISRs) to receive the record before acknowledging it.
    • min.insync.replicas: Sets the minimum number of in-sync replicas required for an acks=all write to succeed.
    • enable.idempotence=true: Prevents duplicates caused by retries and preserves ordering per partition.
  • Tuning:
    • linger.ms > 0: Wait up to that long for batches to fill (High Throughput).
    • linger.ms = 0: Send as soon as possible (Low Latency).
    • compression.type: Use lz4 or zstd to save bandwidth.
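
The key-based rule in the partitioning takeaway can be illustrated with a minimal sketch. It assumes a stand-in hash (`Arrays.hashCode`) rather than the Murmur2 function the real partitioner uses, so the partition numbers it prints will not match a real cluster; the class and method names are hypothetical. What it shows is the property that matters: the same key always maps to the same partition while the partition count stays fixed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of key-based partition selection.
// Assumption: Arrays.hashCode stands in for Murmur2, so results differ from a real producer.
final class KeyPartitionSketch {

    // Maps a record key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        // Clear the sign bit so the modulo result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int partitions = 6;
        // "order-1" appears twice and lands on the same partition both times.
        for (String key : new String[] {"order-1", "order-2", "order-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

Because the result depends on the partition count, adding partitions to a topic can move existing keys to a different partition.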

2. Flashcards

Test your knowledge.

What is the benefit of the Sticky Partitioner?

For records without a key, it sticks to one partition until the current batch is full and only then switches to another. Fuller batches mean fewer requests to the brokers, which raises throughput and lowers latency.

What does `min.insync.replicas=2` protect against?

With acks=all, a write succeeds only if at least 2 in-sync replicas (the leader included) have it. This protects against data loss if one of those replicas fails afterwards.
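
The check behind this answer can be modeled in a few lines. This is a hypothetical sketch, not broker code: it assumes a topic with replication factor 3, a producer using acks=all, and illustrative class and method names.

```java
// Hypothetical model of the broker-side check applied to acks=all writes.
final class MinIsrSketch {

    // Returns true if an acks=all write would be accepted for the current ISR size.
    static boolean acceptsWrite(int inSyncReplicas, int minInsyncReplicas) {
        return inSyncReplicas >= minInsyncReplicas;
    }

    public static void main(String[] args) {
        int minIsr = 2; // min.insync.replicas=2
        // Assumed replication factor of 3: walk the ISR down as replicas fall behind.
        for (int isr = 3; isr >= 1; isr--) {
            System.out.println("ISR size " + isr + " -> accepted: " + acceptsWrite(isr, minIsr));
        }
    }
}
```

With these assumed values the topic keeps accepting writes while one replica is out of sync, and starts rejecting them once only the leader remains in the ISR.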

Why should you enable Idempotence?

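It lets the broker discard duplicates created by retries. The producer attaches a producer ID and sequence numbers to what it sends, so each message is written exactly once per partition, in order, within a single producer session.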
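
The sequence-number idea can be sketched as a toy model of one partition. It is deliberately simplified and hypothetical: it tracks one sequence number per record, whereas real brokers track sequences per batch, and the class, enum, and method names are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of broker-side de-duplication for one partition.
// Assumption: one sequence number per record; real brokers track sequences per batch.
final class IdempotenceSketch {

    enum Result { APPENDED, DUPLICATE, OUT_OF_ORDER }

    // Last sequence number appended for each producer ID.
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    Result append(long producerId, int sequence) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return Result.DUPLICATE;     // a retry of something already written
        }
        if (sequence != last + 1) {
            return Result.OUT_OF_ORDER;  // a gap: an earlier record is missing
        }
        lastSequence.put(producerId, sequence);
        return Result.APPENDED;
    }

    public static void main(String[] args) {
        IdempotenceSketch partition = new IdempotenceSketch();
        System.out.println(partition.append(7L, 0)); // APPENDED
        System.out.println(partition.append(7L, 1)); // APPENDED
        System.out.println(partition.append(7L, 1)); // DUPLICATE (retry after a lost ack)
        System.out.println(partition.append(7L, 3)); // OUT_OF_ORDER (sequence 2 is missing)
    }
}
```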

What happens if `buffer.memory` fills up?

The `send()` method blocks for up to `max.block.ms`. If no space has been freed by then, it throws a TimeoutException.
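
A bounded buffer with a timed wait behaves the same way, and a small model may make the blocking easier to picture. This is a hypothetical sketch, not the client's actual buffer pool: it uses a `Semaphore` to stand in for `buffer.memory`, and `java.util.concurrent.TimeoutException` in place of the client's own exception type.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model of a bounded producer buffer; not the client's real buffer pool.
final class BufferSketch {

    private final Semaphore freeBytes; // remaining capacity, standing in for buffer.memory
    private final long maxBlockMs;     // standing in for max.block.ms

    BufferSketch(int bufferMemoryBytes, long maxBlockMs) {
        this.freeBytes = new Semaphore(bufferMemoryBytes);
        this.maxBlockMs = maxBlockMs;
    }

    // Reserves space for a record, waiting up to maxBlockMs for space to be released.
    void reserve(int recordBytes) throws InterruptedException, TimeoutException {
        if (!freeBytes.tryAcquire(recordBytes, maxBlockMs, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("buffer full for " + maxBlockMs + " ms");
        }
    }

    // Called once a batch has been sent and its memory can be reused.
    void release(int recordBytes) {
        freeBytes.release(recordBytes);
    }

    public static void main(String[] args) throws Exception {
        BufferSketch buffer = new BufferSketch(1024, 50);
        buffer.reserve(1024);      // fills the buffer
        try {
            buffer.reserve(1);     // nothing is released, so this wait times out
        } catch (TimeoutException e) {
            System.out.println("send would fail: " + e.getMessage());
        }
    }
}
```

In this model, as in the answer above, the caller is slowed down first and only sees an error when the buffer stays full for the whole wait.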


3. Cheat Sheet

| Parameter | Recommended Value (General) | Why? |
| --- | --- | --- |
| acks | all | Durability. |
| enable.idempotence | true | Prevent duplicates. |
| retries | Integer.MAX_VALUE | Don’t give up on transient errors. |
| linger.ms | 20-100 | Improve throughput via batching. |
| batch.size | 32768 (32KB) | Larger batches = better compression. |
| compression.type | lz4 or zstd | Save network bandwidth. |
| buffer.memory | 33554432 (32MB) | Buffer backpressure. Increase if needed. |
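
As a starting point, the cheat-sheet values can be collected into a `java.util.Properties` object, which is the form the Java producer client accepts for configuration. This is a minimal sketch: the class and method names are hypothetical, the low end of the `linger.ms` range and `lz4` are picked arbitrarily, and the connection and serializer settings a real producer also needs are left out.

```java
import java.util.Properties;

// Collects the cheat-sheet values as producer configuration properties.
// Assumption: connection and serializer settings are omitted and must be added separately.
final class ProducerConfigSketch {

    static Properties recommendedDefaults() {
        Properties props = new Properties();
        props.setProperty("acks", "all");
        props.setProperty("enable.idempotence", "true");
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        props.setProperty("linger.ms", "20");           // low end of the 20-100 range
        props.setProperty("batch.size", "32768");       // 32 KB
        props.setProperty("compression.type", "lz4");   // or "zstd"
        props.setProperty("buffer.memory", "33554432"); // 32 MB
        return props;
    }

    public static void main(String[] args) {
        recommendedDefaults().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These are general-purpose values; a latency-sensitive workload would likely lower `linger.ms` instead.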

4. Next Steps

Learn how to read data at scale with Consumer Groups.
