Scaling and Performance Tuning

The OpenTelemetry Collector is a high-throughput, low-latency Go application. Default settings are rarely enough for production loads exceeding 10,000 spans/second.

This chapter covers Go Runtime Tuning, Pipeline Optimization, and Advanced Sampling.

1. The Processing Pipeline (Visualized)

The Collector pipeline is a series of synchronous or asynchronous steps. Understanding the flow is key to preventing data loss.

[Interactive element: Pipeline Dynamics Simulator — adjust the batch size and timeout to see how the memory buffer fills and at what point the memory limiter starts refusing data.]
2. Go Runtime Tuning (Advanced)

The Collector is a Go binary. At high throughput, Go’s Garbage Collector (GC) becomes the bottleneck.

The GC Problem

By default, Go triggers a GC cycle when the heap has grown by GOGC percent (default 100) since the previous collection — roughly, whenever the live heap doubles.

  • Scenario: the Collector's live heap is 2GB and the container limit is 3GB.
  • Result: the next GC would not trigger until the heap reaches ~4GB, so the OS OOM-kills the container before the GC ever runs.

Solution 1: Memory Ballast

We can allocate a large, unused byte array (ballast) at startup. This tricks the GC into thinking the heap is large, so it triggers less often.

# In your Helm chart or deployment values.
# NOTE: memory_ballast is deprecated in recent Collector releases
# in favor of GOMEMLIMIT (see Solution 2).
extensions:
  memory_ballast:
    size_mib: 1000 # Allocates ~1GiB of "fake" memory

service:
  extensions: [memory_ballast] # the extension must also be enabled here

Solution 2: GOMEMLIMIT (Go 1.19+)

Modern Go versions allow setting a soft memory limit.

env:
  - name: GOMEMLIMIT
    value: "2500MiB" # ~80-90% of the container limit (here: 3GiB)

This forces the GC to run more aggressively as it approaches the limit, preventing OOM kills without manual ballast hacks.

3. Implementing a Custom Processor (Golang)

Sometimes standard processors aren’t enough. You might need to redact PII using a complex regex or enrich spans with data from a local cache.

Here is a minimal implementation of a Custom Processor in Go.

package customprocessor

import (
	"context"
	"go.opentelemetry.io/collector/component"
	"go.opentelemetry.io/collector/consumer"
	"go.opentelemetry.io/collector/pdata/ptrace"
	"go.uber.org/zap"
)

type customProcessor struct {
	logger *zap.Logger
	next   consumer.Traces
}

// Capabilities reports whether this processor mutates the data it receives (it does)
func (p *customProcessor) Capabilities() consumer.Capabilities {
	return consumer.Capabilities{MutatesData: true}
}

// ConsumeTraces is the hot path! Called for every batch.
func (p *customProcessor) ConsumeTraces(ctx context.Context, td ptrace.Traces) error {
	rss := td.ResourceSpans()
	for i := 0; i < rss.Len(); i++ {
		ilss := rss.At(i).ScopeSpans()
		for j := 0; j < ilss.Len(); j++ {
			spans := ilss.At(j).Spans()
			for k := 0; k < spans.Len(); k++ {
				span := spans.At(k)

				// EXAMPLE: Redact "credit_card" attribute
				if _, ok := span.Attributes().Get("credit_card"); ok {
					span.Attributes().PutStr("credit_card", "****")
					p.logger.Debug("Redacted credit card info")
				}
			}
		}
	}
	// Pass to next component in pipeline
	return p.next.ConsumeTraces(ctx, td)
}

// NewFactory creates the processor factory.
// NOTE: this API has changed across Collector releases; the call below matches
// the modern processor package (requires importing
// "go.opentelemetry.io/collector/processor"). createDefaultConfig and
// createTracesProcessor must be defined alongside this factory.
func NewFactory() processor.Factory {
	return processor.NewFactory(
		component.MustNewType("redactor"),
		createDefaultConfig,
		processor.WithTraces(createTracesProcessor, component.StabilityLevelAlpha),
	)
}

4. Tail Sampling: The “Keep Only What Matters” Strategy

Tail sampling buffers each trace until it is complete, then applies policies to the whole trace: for example, keeping 100% of error traces but only 1% of successful ones.

[Interactive element: Tail Sampling Visualizer — configure Error, Latency, and Random policies to see which traces are kept or dropped.]
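The three policies in the visualizer correspond to policy types of the contrib tail_sampling processor. A configuration sketch (the thresholds and names are illustrative):

```yaml
processors:
  tail_sampling:
    decision_wait: 10s # how long to buffer a trace before deciding
    num_traces: 50000  # memory planning: max in-flight traces held
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow
        type: latency
        latency: {threshold_ms: 500}
      - name: baseline
        type: probabilistic
        probabilistic: {sampling_percentage: 1}
```

Policies are OR-ed: a trace is kept if any policy matches. The decision_wait and num_traces settings determine the processor's memory footprint, which is why tail sampling requires careful memory planning.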

5. Summary

  • Tuning GOGC or using GOMEMLIMIT is mandatory for high-memory production environments to avoid OOM kills.
  • Custom Processors in Go allow you to extend the Collector’s capabilities beyond standard YAML configuration.
  • Tail Sampling is the most powerful tool for reducing costs while maintaining observability fidelity, but it requires careful memory planning.