Logging: The EFK Stack

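In a traditional VM environment, logs are written to a file such as /var/log/app.log and rotated daily. In Kubernetes, Pods are ephemeral. kubectl logs --previous can still show the output of the last crashed container, but once a Pod is deleted or evicted, or its node fails, its logs are gone unless you have shipped them somewhere else.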

Scenario: The Black Friday Outage

Imagine your e-commerce platform crashes during Black Friday. You run kubectl logs against the checkout Pod, but the original Pod has already been evicted and replaced. The logs showing the error went with it. To prevent this, you must ship logs off the node as they are written.

The EFK Stack is a widely used open-source solution for log centralization. It has three components:

  • Elasticsearch: Stores and indexes logs.
  • Fluentd (or Fluent Bit): Collects and ships logs.
  • Kibana: Visualizes logs.

1. The Architecture: Node-Level Logging

The most common pattern is DaemonSet Logging.

  1. Application: Writes logs to stdout / stderr.
  2. Container runtime (containerd or Docker): Captures these streams and writes them to log files on the Node. The files live under /var/log/pods and are symlinked from /var/log/containers/*.log.
  3. Fluentd: Runs as a DaemonSet (one per Node), tails these files, parses them, and sends them to Elasticsearch.
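To make step 3 more concrete, here is a minimal Go sketch of the first thing a node-level agent has to do with each line it tails: separate the runtime's metadata from the application's message. It assumes the CRI logging format used by containerd and CRI-O, in which each line is "<timestamp> <stream> <tag> <message>". Docker's json-file driver writes a different format. The type and function names (criLine, parseCRILine) and the sample line are hypothetical and are not part of Fluentd.

```go
package main

import (
	"fmt"
	"strings"
)

// criLine holds the parts of one line in a container log file.
// Assumed layout (CRI logging format): "<RFC3339Nano time> <stream> <tag> <message>".
type criLine struct {
	Time    string // timestamp added by the runtime
	Stream  string // "stdout" or "stderr"
	Partial bool   // true when the tag is "P" (the runtime split a long line)
	Message string // what the application actually printed
}

// parseCRILine splits a raw line into its four space-separated parts.
func parseCRILine(raw string) (criLine, error) {
	parts := strings.SplitN(raw, " ", 4)
	if len(parts) != 4 {
		return criLine{}, fmt.Errorf("unexpected log line: %q", raw)
	}
	return criLine{
		Time:    parts[0],
		Stream:  parts[1],
		Partial: parts[2] == "P",
		Message: parts[3],
	}, nil
}

func main() {
	// Hypothetical line as it might appear in /var/log/containers/*.log.
	raw := `2023-10-27T10:00:00.000000000Z stdout F {"level":"INFO","message":"User logged in"}`
	line, err := parseCRILine(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(line.Stream, line.Message)
}
```

Fluentd and Fluent Bit ship their own parsers for this format, so you would not normally write this yourself. The sketch only shows why the agent needs a parsing step before the message reaches Elasticsearch.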

2. Structured Logging (JSON)

If your app logs plain text:

2023-10-27 10:00:00 INFO User logged in id=123

Elasticsearch stores the whole line as a single string. Finding the events for user 123 requires a full-text search across every message, which is slower and gives you nothing to aggregate or chart on.

If your app logs JSON:

{"timestamp": "2023-10-27T10:00:00Z", "level": "INFO", "message": "User logged in", "user_id": 123}

Elasticsearch indexes user_id as its own field. You can now filter, aggregate, and visualize efficiently.

Pitfall: The Mapping Explosion

While JSON is great, beware of dynamic keys. If your application logs dynamic user IDs as keys (e.g., {"user_123_status": "active"} instead of {"user_id": "123", "status": "active"}), Elasticsearch will add a new field to the index mapping for every user. This leads to a Mapping Explosion: the mapping keeps growing and consumes more and more heap, until indexing is rejected at the index's field limit (index.mapping.total_fields.limit, 1000 by default) or, if that limit has been raised, the cluster becomes unstable. Always use static, predictable keys.
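The difference between static and dynamic keys is easy to see in code. The following Go sketch uses only the standard library. The type and function names (logEntry, staticKeys, dynamicKeys) and the "status changed" message are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logEntry has a fixed set of keys, so every event maps to the same fields.
type logEntry struct {
	Level   string `json:"level"`
	Message string `json:"message"`
	UserID  string `json:"user_id"`
	Status  string `json:"status"`
}

// mustJSON marshals v and panics on failure (not expected for these simple types).
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// staticKeys keeps the user ID in a value: one mapping, any number of users.
func staticKeys(userID, status string) string {
	return mustJSON(logEntry{Level: "INFO", Message: "status changed", UserID: userID, Status: status})
}

// dynamicKeys is the anti-pattern: the user ID leaks into the key name,
// so each new user produces a field name the index has never seen.
func dynamicKeys(userID, status string) string {
	return mustJSON(map[string]string{"user_" + userID + "_status": status})
}

func main() {
	fmt.Println(staticKeys("123", "active"))
	// {"level":"INFO","message":"status changed","user_id":"123","status":"active"}
	fmt.Println(dynamicKeys("123", "active"))
	// {"user_123_status":"active"}
}
```

Logging through a struct, or through a logging library's typed field helpers, makes the set of keys visible in code review, which is usually the cheapest place to catch this mistake.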

3. The Log Pipeline

The steps below show how a raw log line is transformed by a Fluentd parser into a structured document.

1. Raw Log (stdout)

[2023-10-27 14:00:00] [INFO] Request processed in 45ms

2. Fluentd Parser

/^\[(?<time>[^\]]*)\] \[(?<level>[^\]]*)\] (?<msg>.*)$/

3. Elasticsearch Doc

{"time": "2023-10-27 14:00:00", "level": "INFO", "msg": "Request processed in 45ms"}
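To check what the parser pattern captures, you can run the same expression outside Fluentd. The Go sketch below applies it to the raw line from step 1. The only change is the group syntax: (?P<name>...) is used here because it is accepted by every Go version. The names linePattern and parseLine are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the Fluentd parser above, written with Go's (?P<name>...) group syntax.
var linePattern = regexp.MustCompile(`^\[(?P<time>[^\]]*)\] \[(?P<level>[^\]]*)\] (?P<msg>.*)$`)

// parseLine returns the named captures as a map, or false if the line does not match.
func parseLine(raw string) (map[string]string, bool) {
	m := linePattern.FindStringSubmatch(raw)
	if m == nil {
		return nil, false
	}
	doc := make(map[string]string)
	for i, name := range linePattern.SubexpNames() {
		if name != "" { // index 0 (the whole match) has no name
			doc[name] = m[i]
		}
	}
	return doc, true
}

func main() {
	doc, ok := parseLine("[2023-10-27 14:00:00] [INFO] Request processed in 45ms")
	fmt.Println(ok, doc["time"], doc["level"], doc["msg"])
	// true 2023-10-27 14:00:00 INFO Request processed in 45ms
}
```

A line that does not match the pattern yields no fields at all. This is one reason regex parsing is fragile: a small change to the log format silently turns structured events back into unparsed strings.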

4. Structured Logging Code Examples

Ideally, your application should output JSON directly, skipping the complex regex parsing step.

Go

package main

import (
  "time"

  "go.uber.org/zap"
)

func main() {
  // NewProduction returns a logger that writes JSON to stderr.
  logger, err := zap.NewProduction()
  if err != nil {
    panic(err)
  }
  defer logger.Sync() // flush any buffered entries before exit

  // Key-Value pairs become JSON fields
  logger.Info("failed to fetch URL",
    zap.String("url", "http://example.com"),
    zap.Int("attempt", 3),
    zap.Duration("backoff", time.Second),
  )
}
// Output (abridged): {"level":"info","ts":159493,"msg":"failed to fetch URL","url":"http://example.com","attempt":3,"backoff":1}

Java

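logback.xml (requires the logstash-logback-encoder library on the classpath):

<configuration>
  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <!-- Attach the appender to the root logger; without this nothing is written. -->
  <root level="INFO">
    <appender-ref ref="CONSOLE"/>
  </root>
</configuration>

MyService.java:

import org.slf4j.Logger;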
import org.slf4j.LoggerFactory;
import static net.logstash.logback.argument.StructuredArguments.kv;

public class MyService {
  private static final Logger logger = LoggerFactory.getLogger(MyService.class);

  public void process(String userId) {
    logger.info("Processing user", kv("user_id", userId), kv("status", "active"));
  }
}
// Output: {"@timestamp":"...","message":"Processing user","user_id":"123","status":"active",...}

5. DaemonSet vs. Sidecar

Strategy       | DaemonSet            | Sidecar
---------------|----------------------|-------------------------------------
Concept        | One agent per Node   | One agent per Pod
Resource Usage | Low (Shared)         | High (Duplicated)
Complexity     | Simple (Standard)    | Complex (Manifest changes)
Use Case       | Standard stdout logs | Legacy apps writing to files on disk
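For the sidecar case, one common variant is a "streaming" sidecar: a small container that reads the legacy app's log file from a shared volume and re-emits each line on its own stdout, where the node-level agent can pick it up. The Go sketch below shows only the core copy loop, under simplifying assumptions: it reads its input once instead of following a growing file, and it uses an in-memory string in place of the real file so that it runs anywhere. All names are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// streamLines copies r to w one line at a time and returns the number of lines.
// In a real sidecar, r would be the legacy log file and w would be os.Stdout.
func streamLines(r io.Reader, w io.Writer) (int, error) {
	scanner := bufio.NewScanner(r)
	n := 0
	for scanner.Scan() {
		if _, err := fmt.Fprintln(w, scanner.Text()); err != nil {
			return n, err
		}
		n++
	}
	return n, scanner.Err()
}

// streamString runs streamLines over an in-memory stand-in for the log file.
func streamString(contents string) (string, int, error) {
	var out strings.Builder
	n, err := streamLines(strings.NewReader(contents), &out)
	return out.String(), n, err
}

func main() {
	// Hypothetical contents of a legacy log file such as /var/log/app.log.
	out, n, err := streamString("started worker\nprocessed batch 7\n")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
	fmt.Println("lines forwarded:", n)
}
```

A production sidecar also has to follow the file as it grows and survive log rotation, which is why the table rates this strategy as the more complex and more expensive one. Note that bufio.Scanner rejects lines longer than 64 KiB unless its buffer is enlarged.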

6. Summary

  • Centralize: Never rely on kubectl logs as your only record. Ship logs to Elasticsearch.
  • Structure: Log in JSON to make logs queryable.
  • DaemonSet: Use Fluentd/Fluent Bit as a DaemonSet to collect logs efficiently from all pods on a node.