The Container Lifecycle: From Birth to Death

A Docker container is not just a static binary; it is a living entity with a lifecycle. Understanding this lifecycle—specifically how containers are born, how they live, and how they die—is the difference between a resilient production system and one that loses data on every deployment.

1. The State Machine

At any given moment, a container exists in one of several states. This isn’t random; it follows a strict state machine enforced by the Docker Daemon.

stateDiagram-v2
  [*] --> Created: docker create
  Created --> Running: docker start
  Running --> Paused: docker pause
  Paused --> Running: docker unpause
  Running --> Stopped: docker stop / die
  Stopped --> Running: docker start
  Stopped --> [*]: docker rm
  Running --> Stopped: docker kill

  1. Created: The writable layer is allocated and metadata is written, but no process is running.
  2. Running: The kernel has allocated a PID, and the entrypoint process is executing.
  3. Paused: Every process in the container is suspended by the cgroup freezer, so it receives no CPU time. Memory is still held.
  4. Stopped: The main process has exited, either cleanly with code 0 or with a non-zero code after a crash or a kill (Docker reports this state as "exited"). The filesystem changes persist.
  5. Removed: The container’s metadata and read-write layer are deleted.
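
The transitions above can be sketched as a small lookup table. This is a simplified model for illustration only, not how the Docker daemon is implemented; the State type, the transitions map, and the Next function are hypothetical names, and the real daemon accepts a few more transitions than the ones listed here. Note that docker kill leaves the container in the Stopped state until docker rm deletes it.

```go
package main

import "fmt"

// State is a hypothetical type naming the lifecycle states listed above.
type State string

const (
	Created State = "Created"
	Running State = "Running"
	Paused  State = "Paused"
	Stopped State = "Stopped"
	Removed State = "Removed"
)

// transitions maps a current state and a CLI verb to the resulting state.
// It only covers the transitions described in this section.
var transitions = map[State]map[string]State{
	Created: {"start": Running},
	Running: {"pause": Paused, "stop": Stopped, "kill": Stopped},
	Paused:  {"unpause": Running},
	Stopped: {"start": Running, "rm": Removed},
}

// Next returns the state reached by applying cmd to s,
// or an error if the model does not allow that transition.
func Next(s State, cmd string) (State, error) {
	if next, ok := transitions[s][cmd]; ok {
		return next, nil
	}
	return s, fmt.Errorf("cannot %q a container in state %s", cmd, s)
}

func main() {
	s := Created
	for _, cmd := range []string{"start", "pause", "unpause", "stop", "rm"} {
		next, err := Next(s, cmd)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%-8s --%s--> %s\n", s, cmd, next)
		s = next
	}
}
```

Keeping the rules in a table makes illegal moves, such as pausing a container that was only created, fail with an explicit error instead of being silently accepted.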

2. Stopping vs Killing

One of the most commonly misunderstood concepts in Docker is the difference between stopping and killing a container.

The docker stop Protocol (Graceful)

When you run docker stop my-app, Docker performs a polite handshake:

  1. SIGTERM (Signal 15): Docker sends this signal to PID 1 inside the container. This translates to “Please wrap up your work and exit.”
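  2. Grace Period: Docker waits for the process to exit (10 seconds by default; docker stop -t changes the timeout).
  3. SIGKILL (Signal 9): If the process is still running after the timeout, Docker sends SIGKILL. The process cannot catch or ignore this signal, so the kernel terminates it immediately.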

The docker kill Protocol (Brutal)

When you run docker kill my-app (or when the OOM killer strikes):

  1. SIGKILL (Signal 9): Sent immediately.
  2. Result: No cleanup. No database connections closed. No state saved. In-flight writes can be lost, and data corruption is possible.

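[!WARNING] The PID 1 Problem
In Linux, PID 1 (init) has special responsibilities, such as reaping zombie processes. The kernel also treats it differently: a signal reaches PID 1 only if the process has installed a handler for it, so an unhandled SIGTERM is silently dropped instead of terminating the process.
If your container entrypoint is a shell script or a shell-form command (e.g., /bin/sh -c 'java -jar app.jar'), the shell can end up as PID 1. Shells generally do not forward signals to their children. So when Docker sends SIGTERM to the shell, the shell ignores it, and your app never hears it. Docker waits 10 seconds and then kills everything.
Fix: Use exec in your shell scripts: exec java -jar app.jar. This replaces the shell process with your app, making your app PID 1. The app must then install its own SIGTERM handler, because the default action does not apply to PID 1.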


3. Code Example: Handling Signals

To support graceful shutdown, your application code must explicitly handle SIGTERM.

In Go, we use a channel to listen for os.Interrupt (SIGINT) and syscall.SIGTERM.

package main

import (
    "context"
    "fmt"
    "net/http"
    "os"
    "os/signal"
    "syscall"
    "time"
)

func main() {
    // 1. Create a server
    srv := &http.Server{Addr: ":8080"}

    // 2. Make a channel to listen for signals
    stop := make(chan os.Signal, 1)
    signal.Notify(stop, os.Interrupt, syscall.SIGTERM)

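    // Run the server in the background and report fatal listen errors
    serverErr := make(chan error, 1)
    go func() {
        fmt.Println("Server starting on :8080")
        if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
            serverErr <- err
        }
    }()

    // 3. Block until a signal is received (or the server fails to start)
    select {
    case <-stop:
        fmt.Println("\nShutdown signal received...")
    case err := <-serverErr:
        // Exit instead of hanging with no listener, e.g. when the port is taken
        fmt.Printf("Listen error: %v\n", err)
        os.Exit(1)
    }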

    // 4. Create a deadline to wait for active requests
    //    (kept below Docker's default 10-second grace period)
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    // 5. Shutdown gracefully
    if err := srv.Shutdown(ctx); err != nil {
        fmt.Printf("Shutdown error: %v\n", err)
    }
    fmt.Println("Server stopped gracefully")
}

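In Java, we register a shutdown hook. When the JVM receives SIGTERM or SIGINT, or exits normally, it starts each registered hook thread and waits for it to finish before halting. Hooks do not run when the process is terminated with SIGKILL.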

import java.util.concurrent.TimeUnit;

public class GracefulApp {
    public static void main(String[] args) throws InterruptedException {
        System.out.println("App started (PID " + ProcessHandle.current().pid() + ")");

        // 1. Register Shutdown Hook
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            System.out.println("\n[JVM] Shutdown initiated...");
            cleanUpResources();
            System.out.println("[JVM] Bye!");
        }));

        // Simulate work
        while (true) {
            Thread.sleep(1000);
            System.out.print(".");
        }
    }

    private static void cleanUpResources() {
        try {
            System.out.println("Closing DB connections...");
            // Simulate delay
            TimeUnit.SECONDS.sleep(2);
            System.out.println("Flushing buffers...");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}