Module Review: The Collector

Key Takeaways

  • Decoupling: The Collector decouples your applications from backend vendors, allowing you to switch vendors without redeploying apps.
  • ETL Pipeline: The Collector is an ETL pipeline: Receivers (Input) → Processors (Transform) → Exporters (Output).
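  • Order Matters: Processors run sequentially, in the order listed in the pipeline. Put the Memory Limiter first to guard against out-of-memory crashes, and place Batching after it (and after any sampling or filtering processors) for performance.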
  • Deployment: Use DaemonSets (Agents) for logs/host metrics and Deployments (Gateways) for traces/sampling.
  • Sampling: Use Head-based sampling for simple volume reduction and Tail-based sampling to keep critical traces (errors/latency) while dropping noise.
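
To make the Receivers → Processors → Exporters flow and the processor ordering concrete, here is a minimal configuration sketch. The exporter endpoint `backend.example.com:4317` is a placeholder, not a real vendor address, and the tuning values are illustrative rather than recommended.

```yaml
# Minimal Collector configuration sketch (illustrative values).
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 1024
    spike_limit_mib: 256
  batch:
    send_batch_size: 1024
    timeout: 10s

exporters:
  otlp:
    # Placeholder endpoint: switching vendors means editing this block,
    # not redeploying the instrumented applications.
    endpoint: "backend.example.com:4317"

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Processors run in the order listed: memory_limiter first, batch last.
      processors: [memory_limiter, batch]
      exporters: [otlp]
```

A component that is defined but not referenced under `service.pipelines` is not part of any pipeline, so the pipeline list is what actually determines the data path and the processor order.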

Interactive Flashcards

What are the 3 main components of a Collector?

Receivers, Processors, Exporters

Receivers ingest data, Processors transform it, and Exporters send it to backends.

Why must the Memory Limiter be the first processor?

To prevent OOM crashes

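It periodically checks memory usage and, when usage exceeds the configured limits, refuses incoming data (and forces garbage collection if usage keeps climbing), protecting the process. Placing it first lets that backpressure reach the receivers before later processors buffer more data.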
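
As a sketch of how the limits relate, the fragment below reuses the cheat-sheet values. The processor derives a soft limit as `limit_mib - spike_limit_mib`, so with these numbers it starts refusing data at about 768 MiB and treats 1024 MiB as the hard limit. The percentage-based variant in the comments is an alternative for containers whose memory size varies; its values are illustrative assumptions.

```yaml
processors:
  memory_limiter:
    check_interval: 1s     # how often memory usage is measured
    limit_mib: 1024        # hard limit
    spike_limit_mib: 256   # soft limit = 1024 - 256 = 768 MiB

    # Alternative (use instead of the *_mib settings): express the limits
    # as a percentage of the total memory available to the process.
    # limit_percentage: 80
    # spike_limit_percentage: 25
```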

What is the difference between Agent and Gateway deployment?

Scope & Scale

An Agent runs on every node (DaemonSet) for local collection. A Gateway runs as a centralized, scalable pool of Collectors (Deployment) for processing and sampling.
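
A rough sketch of the Agent side of this split: each node-local Collector gathers host metrics and application telemetry, then forwards everything over OTLP to the Gateway. The Service name `otel-gateway.observability.svc.cluster.local` is hypothetical, the `hostmetrics` receiver assumes a Collector distribution that bundles it (such as contrib), and `insecure: true` assumes plaintext traffic inside the cluster.

```yaml
# Agent (DaemonSet) configuration sketch; names and values are illustrative.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu:
      memory:

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  batch:

exporters:
  otlp:
    # Hypothetical in-cluster address of the Gateway Deployment's Service.
    endpoint: "otel-gateway.observability.svc.cluster.local:4317"
    tls:
      insecure: true  # assumption: no TLS between Agent and Gateway

service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [memory_limiter, batch]
      exporters: [otlp]
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```

The Gateway would run its own configuration with an OTLP receiver on port 4317, the heavier processing (such as sampling), and the exporter that points at the vendor backend.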

What is Tail-Based Sampling?

Post-Trace Decision

The Collector buffers a trace's spans until the trace is complete (in practice, until a configured wait window elapses) and then makes a sampling decision based on the whole trace (e.g., "Keep all errors").
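
A "keep all errors, keep slow traces, sample the rest" policy could be sketched as follows. This assumes a Collector distribution that includes the `tail_sampling` processor (for example, contrib); the policy names, the 2-second latency threshold, and the 10% rate are illustrative choices.

```yaml
processors:
  tail_sampling:
    decision_wait: 10s   # how long to buffer spans before deciding
    num_traces: 50000    # maximum number of traces held in memory
    policies:
      - name: keep-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: keep-slow
        type: latency
        latency:
          threshold_ms: 2000
      - name: sample-the-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
```

With policies like these, a trace is kept if any one of them matches. Because the decision needs the whole trace, all spans of a trace have to reach the same Collector instance, which is one reason this processing usually lives in the Gateway tier.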

Collector Cheat Sheet

| Component | Purpose | Key Configuration |
|---|---|---|
| OTLP Receiver | Ingests data via gRPC/HTTP | `protocols: {grpc: {endpoint: "0.0.0.0:4317"}}` |
| Memory Limiter | Prevents Out-Of-Memory crashes | `limit_mib: 1024, spike_limit_mib: 256` |
| Batch Processor | Batches data to reduce network IO | `send_batch_size: 1024, timeout: 10s` |
| Resource Processor | Adds metadata (env, cluster) | `attributes: [{key: env, value: prod, action: insert}]` |
| Attributes Processor | Filters or modifies attributes | `actions: [{key: user.id, action: delete}]` |
| OTLP Exporter | Sends to vendor/collector | `endpoint: "api.vendor.com:4317", headers: {api-key: ...}` |
| Logging Exporter | Debugging (prints to stdout) | `verbosity: detailed` |
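
As a sketch of how the enrichment and scrubbing rows fit into a pipeline, the configuration below wires the Resource and Attributes processors between the Memory Limiter and the Batch processor and prints the result for inspection. One assumption to flag: it uses the `debug` exporter, which replaces the `logging` exporter in recent Collector releases and accepts the same `verbosity` setting; on an older Collector, substitute `logging`.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 1024
    spike_limit_mib: 256
  resource:
    attributes:
      - key: env
        value: prod
        action: insert   # adds the attribute only if it is not already set
  attributes:
    actions:
      - key: user.id
        action: delete   # removes the attribute from spans
  batch:
    send_batch_size: 1024
    timeout: 10s

exporters:
  debug:
    verbosity: detailed  # writes full telemetry to the Collector's console output

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, resource, attributes, batch]
      exporters: [debug]
```

Running a pipeline like this against test traffic is a quick way to confirm that `env` is being added and `user.id` is being removed before pointing the exporter at a real backend.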

Next Steps

Now that you’ve mastered the Collector, it’s time to learn how to intelligently reduce your data volume without losing visibility.

Start Module 08: Sampling Strategies

Note: Need a refresher on terms? Check the OpenTelemetry Glossary.