The Collector: Architecture & Deployment

The OpenTelemetry Collector is the single most critical component in a production observability stack. While the OTel SDKs generate data, the Collector is what makes that data manageable at scale.

[!IMPORTANT] Why do you need this? Without a Collector, every microservice in your fleet is tightly coupled to your backend vendor (Datadog, Honeycomb, New Relic). If you want to switch vendors, scrub PII, or reduce data volume, you have to redeploy every single service.

With a Collector, your applications send data to a single nearby endpoint (typically “localhost” when the Collector runs as an agent on the same host), and the Collector handles the rest. It is your vendor-agnostic control plane.

In this module, we will deconstruct the Collector’s architecture, build a robust production configuration from scratch, and simulate how data flows through its internal pipeline.

1. Why use a Collector?

You can configure the OTel SDK in your application to send data directly to a backend. This is called the “Direct-to-Vendor” approach, and it is almost always a mistake for production systems.

| Feature | Direct Export (Bad) | With Collector (Good) |
| --- | --- | --- |
| Coupling | Apps know backend credentials | Apps only know “localhost” |
| Data Control | Hard to filter/redact centrally | Centralized PII scrubbing |
| Network | Many connections to vendor | Batched, compressed, persistent connections |
| Sampling | Head-based only (limited) | Tail-based sampling (powerful) |
| Migration | Redeploy all apps to switch vendors | Update Collector config only |
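To make the Sampling row concrete, here is a minimal Python sketch of the two decision points. The function names, the span dictionary shape (`error`, `duration_ms`), and the 500 ms threshold are assumptions made for illustration, not the Collector's actual implementation. The point is only that a head-based decision sees nothing but the trace ID, while a tail-based decision can inspect the finished trace.

```python
import hashlib


def head_sample(trace_id: str, ratio: float) -> bool:
    """Head-based: decide when the trace starts, using only the trace ID.

    The decision cannot depend on errors or latency, because neither is
    known yet.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Map the first 4 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return bucket < ratio


def tail_sample(spans: list, latency_threshold_ms: float = 500.0) -> bool:
    """Tail-based: decide after every span of the trace has been buffered.

    Hypothetical policy: keep the trace if any span failed or was slow.
    """
    return any(
        bool(span.get("error")) or span["duration_ms"] > latency_threshold_ms
        for span in spans
    )


# Hypothetical traces: one healthy, one containing a failed span.
healthy = [{"duration_ms": 12.0}, {"duration_ms": 30.0}]
failed = [{"duration_ms": 12.0}, {"duration_ms": 8.0, "error": True}]

keep_healthy = tail_sample(healthy)  # False: nothing interesting happened
keep_failed = tail_sample(failed)    # True: the error is visible at the tail
```

A head-based sampler at a 1% ratio would discard most failed traces along with the healthy ones, since it decides before the failure occurs. Tail-based sampling needs a component that buffers whole traces, which is why it lives in the Collector rather than in each SDK.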

[!TIP] Pro Tip: Even in development, run a local Collector. It allows you to tee traffic to a local console exporter for debugging without changing your application code.

2. Interactive Pipeline Simulator

Visualize how data flows through the Collector. Toggle “Batching” to see how spans are grouped, and “Filtering” to simulate dropping noise.

[Interactive simulator: Receiver (In, spans/s) → Processor (Buffer) → Exporter (Out), with toggles for “Batch Processing (Reduce IO)” and “Filtering (Drop Noise)”, and counters for Network Calls, Processed, and Dropped (Filtered).]
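The effect of the two toggles can also be reproduced offline. The following is a rough Python sketch, not Collector code: `run_pipeline`, `PipelineStats`, and the span names are hypothetical, and the real batch processor also flushes on a timeout, which is ignored here. It counts the same three numbers the simulator displays.

```python
from dataclasses import dataclass


@dataclass
class PipelineStats:
    network_calls: int = 0  # one call per exported batch
    processed: int = 0      # spans that reached the exporter
    dropped: int = 0        # spans removed by the filter


def run_pipeline(spans, batch_size=1, drop_names=frozenset()):
    """Push spans through a filter and a size-based batcher.

    batch_size=1 models "Batching off": every span is its own network call.
    """
    stats = PipelineStats()
    buffer = []

    def flush():
        # Export whatever is buffered as a single network call.
        if buffer:
            stats.network_calls += 1
            stats.processed += len(buffer)
            buffer.clear()

    for span in spans:
        if span["name"] in drop_names:
            stats.dropped += 1
            continue
        buffer.append(span)
        if len(buffer) >= batch_size:
            flush()

    flush()  # export the final partial batch, if any
    return stats


# 100 spans, every fifth one a health check (20 in total).
spans = [{"name": "/healthz" if i % 5 == 0 else "/checkout"} for i in range(100)]

unbatched = run_pipeline(spans)
# PipelineStats(network_calls=100, processed=100, dropped=0)

tuned = run_pipeline(spans, batch_size=10, drop_names={"/healthz"})
# PipelineStats(network_calls=8, processed=80, dropped=20)
```

With both toggles on, the same input produces 8 network calls instead of 100, and the health-check noise never leaves the Collector.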

3. Architecture Deep Dive

The Collector is essentially an ETL pipeline built on three primary components: Receivers, Processors, and Exporters.

Receivers: OTLP, Jaeger, Prometheus
Processors: Memory Limiter, Batch Processor, Attributes Filter
Exporters: OTLP (Vendor), Prometheus, Logging
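As a mental model for how the three component types fit together, here is a small Python sketch, assuming spans are plain dictionaries with `name` and `attributes` keys. The `Pipeline` class and both processor factories are hypothetical names chosen for this example. The Collector itself is configured declaratively rather than programmed like this, and the memory limiter and batch processor are not modeled.

```python
from typing import Callable, List

Batch = List[dict]
Processor = Callable[[Batch], Batch]
Exporter = Callable[[Batch], None]


def drop_spans_named(*names: str) -> Processor:
    """Filter processor: remove spans whose name is in `names`."""
    def process(batch: Batch) -> Batch:
        return [span for span in batch if span["name"] not in names]
    return process


def delete_attributes(*keys: str) -> Processor:
    """Attributes processor: strip the given keys (for example, PII)."""
    def process(batch: Batch) -> Batch:
        return [
            {**span, "attributes": {k: v for k, v in span["attributes"].items()
                                    if k not in keys}}
            for span in batch
        ]
    return process


class Pipeline:
    """Receivers call `receive`; processors run in order; exporters fan out."""

    def __init__(self, processors: List[Processor], exporters: List[Exporter]):
        self.processors = processors
        self.exporters = exporters

    def receive(self, batch: Batch) -> None:
        for processor in self.processors:
            batch = processor(batch)
        for exporter in self.exporters:
            exporter(batch)  # every exporter sees the same processed data


# Two in-memory lists stand in for a vendor backend and a local console.
vendor_backend: Batch = []
console: Batch = []

pipeline = Pipeline(
    processors=[drop_spans_named("/healthz"), delete_attributes("user.email")],
    exporters=[vendor_backend.extend, console.extend],
)

pipeline.receive([
    {"name": "/healthz", "attributes": {}},
    {"name": "/checkout", "attributes": {"user.email": "a@example.com", "cart.items": 3}},
])
```

After the call, both destinations hold one `/checkout` span with `user.email` removed. Processor order matters, and adding a second exporter is how traffic gets teed to a console for debugging without touching application code.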