The Collector: Architecture & Deployment
The OpenTelemetry Collector is the single most critical component in a production observability stack. While the OTel SDKs generate data, the Collector is what makes that data manageable at scale.
[!IMPORTANT] Why do you need this? Without a Collector, every microservice in your fleet is tightly coupled to your backend vendor (Datadog, Honeycomb, New Relic). If you want to switch vendors, scrub PII, or reduce data volume, you have to redeploy every single service.
With a Collector, your applications just send data to “localhost”, and the Collector handles the rest. It is your vendor-agnostic control plane.
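In practice, pointing an application at a local Collector is usually just an environment-variable change, not a code change. A minimal sketch using the standard OTel SDK environment variables (the service name here is a hypothetical placeholder; 4317 is the Collector's default OTLP/gRPC port):

```shell
# Point any OTel SDK at the local Collector instead of a vendor endpoint.
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
# Hypothetical service name for illustration:
export OTEL_SERVICE_NAME="checkout-service"
```

Because the vendor endpoint and credentials now live only in the Collector's config, swapping backends never touches application deployments.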
In this module, we will deconstruct the Collector’s architecture, build a robust production configuration from scratch, and simulate how data flows through its internal pipeline.
1. Why use a Collector?
You can configure the OTel SDK in your application to send data directly to a backend. This is called the “Direct-to-Vendor” approach, and it is almost always a mistake for production systems.
| Feature | Direct Export (Bad) | With Collector (Good) |
|---|---|---|
| Coupling | Apps know backend credentials | Apps only know “localhost” |
| Data Control | Hard to filter/redact centrally | Centralized PII scrubbing |
| Network | Many connections to vendor | Batched, compressed, persistent connections |
| Sampling | Head-based only (limited) | Tail-based sampling (powerful) |
| Migration | Redeploy all apps to switch vendors | Update Collector config only |
[!TIP] Pro Tip: Even in development, run a local Collector. It allows you to tee traffic to a local console exporter for debugging without changing your application code.
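A minimal development config along those lines might look like the following sketch. It assumes the Collector's built-in `debug` exporter (the console exporter, which replaced the deprecated `logging` exporter in recent Collector versions):

```yaml
# Dev sketch: receive OTLP from local apps, print spans to the console.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  debug:
    verbosity: detailed   # print full span contents for debugging
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
```

To tee traffic, you would add your vendor's exporter alongside `debug` in the same `exporters` list; the application config never changes.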
2. Interactive Pipeline Simulator
[Interactive element: a pipeline simulator visualizing how data flows through the Collector. Toggling “Batching” shows how spans are grouped; toggling “Filtering” simulates dropping noise.]
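The two behaviors the simulator demonstrates map directly onto real Collector processors. A hedged sketch (the batch sizes and the OTTL condition are illustrative assumptions, and `/healthz` is a hypothetical route):

```yaml
processors:
  # Batching: group spans before export to cut connection overhead.
  batch:
    send_batch_size: 512
    timeout: 5s
  # Filtering: drop noisy spans, e.g. health-check traffic.
  filter/drop-health-checks:
    traces:
      span:
        - 'attributes["http.route"] == "/healthz"'
```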
3. Architecture Deep Dive
The Collector is essentially an ETL pipeline built from three primary components: Receivers (which ingest data into the Collector), Processors (which transform, filter, and batch it in flight), and Exporters (which send it on to one or more backends).
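The three component types are wired together in the `service.pipelines` section of the config. A skeleton sketch (the backend endpoint is a hypothetical placeholder; `memory_limiter` and `batch` are the two processors the Collector docs recommend for production pipelines):

```yaml
receivers:
  otlp:
    protocols:
      grpc:   # default port 4317
      http:   # default port 4318
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
  batch:
exporters:
  otlphttp:
    endpoint: https://backend.example.com   # hypothetical vendor endpoint
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlphttp]
```

Note that a component is inert until it is referenced in a pipeline: defining a receiver or processor at the top level does nothing on its own.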