Control Groups (Cgroups)

[!NOTE] This module explores the core principles of Control Groups (Cgroups), deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. The Illusion of Resource Ownership

Namespaces control what a process can see, not how much it can consume. If one process uses 100% of the CPU, every other process on the host starves, even though the processes are hidden from each other.

Control Groups (Cgroups) solve this by limiting, accounting for, and isolating the resource usage (CPU, memory, disk I/O, and, indirectly, network) of a collection of processes.

Cgroups v1 vs v2

  • Cgroups v1: Each controller had its own hierarchy. /sys/fs/cgroup/cpu and /sys/fs/cgroup/memory were distinct trees, so a process could be in Group A for CPU but Group B for memory. This made it hard to coordinate controllers that interact (for example, memory and I/O during writeback) and was a frequent source of confusion.
  • Cgroups v2: A single unified hierarchy. A process belongs to exactly one cgroup, and controllers (cpu, memory, io) are enabled per cgroup.
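The difference between the two layouts is visible in /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:path. The sketch below parses that format; the sample strings and the group names in them are hypothetical, chosen to mirror the Group A / Group B example above.

```go
package main

import (
	"fmt"
	"strings"
)

// membership describes one line of /proc/<pid>/cgroup.
type membership struct {
	HierarchyID string
	Controllers []string // empty for the cgroup v2 unified hierarchy
	Path        string
}

// parseCgroupFile parses the text of /proc/<pid>/cgroup.
// Each line has the form "hierarchy-ID:controller-list:path".
func parseCgroupFile(content string) []membership {
	var out []membership
	for _, line := range strings.Split(strings.TrimSpace(content), "\n") {
		parts := strings.SplitN(line, ":", 3)
		if len(parts) != 3 {
			continue // skip malformed or empty lines
		}
		m := membership{HierarchyID: parts[0], Path: parts[2]}
		if parts[1] != "" {
			m.Controllers = strings.Split(parts[1], ",")
		}
		out = append(out, m)
	}
	return out
}

func main() {
	// Hypothetical v1 sample: CPU and memory live in different trees.
	v1 := "7:cpu,cpuacct:/groupA\n4:memory:/groupB\n"
	// Hypothetical v2 sample: one entry, hierarchy ID 0, no controller list.
	v2 := "0::/my-container\n"

	for _, m := range parseCgroupFile(v1) {
		fmt.Printf("v1: controllers=%v path=%s\n", m.Controllers, m.Path)
	}
	for _, m := range parseCgroupFile(v2) {
		fmt.Printf("v2: unified path=%s\n", m.Path)
	}
}
```

On a real v2 host, passing the contents of /proc/self/cgroup to parseCgroupFile should yield a single entry with an empty controller list.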

[!TIP] Check your version: Run stat -fc %T /sys/fs/cgroup/. If it says cgroup2fs, you are on v2 (modern Linux). If tmpfs, check inside for cgroup directories (v1).
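The same check can be done programmatically. The sketch below calls statfs on /sys/fs/cgroup and compares the filesystem magic number, which is roughly what stat -fc %T reports by name. It assumes Linux and the standard mount point; the classify helper is a name introduced here for illustration.

```go
package main

import (
	"fmt"
	"syscall"
)

// Filesystem magic numbers from the Linux header linux/magic.h.
const (
	cgroup2SuperMagic = 0x63677270
	tmpfsMagic        = 0x01021994
)

// classify maps a filesystem magic number to the cgroup layout it implies.
func classify(magic int64) string {
	switch magic {
	case cgroup2SuperMagic:
		return "v2 (unified hierarchy)"
	case tmpfsMagic:
		return "v1 or hybrid (tmpfs holding per-controller mounts)"
	default:
		return "unknown"
	}
}

func main() {
	var st syscall.Statfs_t
	// Assumption: cgroups are mounted at the conventional location.
	if err := syscall.Statfs("/sys/fs/cgroup", &st); err != nil {
		fmt.Println("statfs failed:", err)
		return
	}
	fmt.Println("cgroup layout:", classify(int64(st.Type)))
}
```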

2. CPU Limiting: Quota and Period

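The most common limit is CPU. It uses the CFS (Completely Fair Scheduler) bandwidth control mechanism, which has two parameters:

  1. Period (cpu.cfs_period_us in v1; the second field of cpu.max in v2): The length of the accounting window (100ms, or 100,000µs, by default).
  2. Quota (cpu.cfs_quota_us in v1; the first field of cpu.max in v2): How much CPU time the cgroup may use within each window.

If Quota = 50,000µs and Period = 100,000µs, the cgroup gets 0.5 CPU: its processes may run for a combined 50ms in every 100ms window and are throttled for the rest of it.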
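The arithmetic is a plain ratio, and it works in both directions. A minimal sketch, with helper names (cpuLimit, quotaFor) invented here for illustration:

```go
package main

import "fmt"

// cpuLimit returns how many CPUs' worth of time a cgroup may use,
// given a quota and a period in microseconds.
func cpuLimit(quotaUs, periodUs int64) float64 {
	return float64(quotaUs) / float64(periodUs)
}

// quotaFor returns the quota (in microseconds) that grants `cpus` CPUs per period.
func quotaFor(cpus float64, periodUs int64) int64 {
	return int64(cpus * float64(periodUs))
}

func main() {
	const periodUs = 100_000 // 100ms

	fmt.Println(cpuLimit(50_000, periodUs))  // 0.5
	fmt.Println(cpuLimit(200_000, periodUs)) // 2
	fmt.Println(quotaFor(1.5, periodUs))     // 150000
}
```

A quota larger than the period is valid: it allows the cgroup's threads to use more than one CPU in parallel.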


3. Interactive: Cgroup Quota Simulator

[Interactive simulator omitted. It lets you set the quota to 10ms (0.1 CPU), 50ms (0.5 CPU), or 100ms (1.0 CPU) and displays a one-second execution timeline in which the process alternates between Running and Throttled.]

4. Code Examples

How does Docker actually limit resources? It writes to files in /sys/fs/cgroup.

1. Go Implementation (Controller)

This code sketches what a container runtime (such as runc) does to limit a container's CPU. It assumes cgroup v2 is mounted at /sys/fs/cgroup and must be run as root.

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

func main() {
	// 1. Define Cgroup Path (cgroup v2)
	cgroupPath := "/sys/fs/cgroup/my-container"

	// 2. Create the Cgroup directory
	if err := os.MkdirAll(cgroupPath, 0755); err != nil {
		panic(err)
	}

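	// 3. Set CPU Limit (Quota 50ms / Period 100ms)
	// Format: "$MAX $PERIOD", both in microseconds.
	// cpu.max only exists if the parent cgroup lists "cpu" in cgroup.subtree_control.
	limit := "50000 100000"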
	if err := os.WriteFile(filepath.Join(cgroupPath, "cpu.max"), []byte(limit), 0644); err != nil {
		panic(err)
	}

	// 4. Add Current Process to Cgroup
	pid := strconv.Itoa(os.Getpid())
	if err := os.WriteFile(filepath.Join(cgroupPath, "cgroup.procs"), []byte(pid), 0644); err != nil {
		panic(err)
	}

	fmt.Println("I am now limited to 0.5 CPU!")
	// Run CPU intensive work...
}
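Once a limit is in place, the cgroup's cpu.stat file reports whether it is actually biting. The sketch below parses its "key value" lines and prints the throttling counters. The path reuses the my-container cgroup from the example above and is an assumption, and parseFlatKeyed is a helper name introduced here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseFlatKeyed parses "key value" lines, the format cgroup v2 uses for cpu.stat.
// Lines that do not match are ignored.
func parseFlatKeyed(content string) map[string]uint64 {
	stats := make(map[string]uint64)
	for _, line := range strings.Split(content, "\n") {
		fields := strings.Fields(line)
		if len(fields) != 2 {
			continue
		}
		if v, err := strconv.ParseUint(fields[1], 10, 64); err == nil {
			stats[fields[0]] = v
		}
	}
	return stats
}

func main() {
	// Assumption: the cgroup created in the previous example.
	data, err := os.ReadFile("/sys/fs/cgroup/my-container/cpu.stat")
	if err != nil {
		fmt.Println("could not read cpu.stat:", err)
		return
	}
	stats := parseFlatKeyed(string(data))
	fmt.Printf("periods=%d throttled=%d throttled_usec=%d\n",
		stats["nr_periods"], stats["nr_throttled"], stats["throttled_usec"])
}
```

A steadily growing nr_throttled relative to nr_periods suggests the quota is too small for the workload.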

2. Java Implementation (Observability)

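Java applications need to know whether they are running inside a container to size thread pools correctly. Before Java 10, Runtime.getRuntime().availableProcessors() returned the host CPU count, so a JVM confined to a fraction of a CPU still sized its thread pools for the whole machine and created far too many threads.

Modern Java (10+, or 8u191+) is “container aware”. Support for cgroup v2 arrived later (JDK 15, with backports to subsequent 11 and 8 update releases), so check your exact version on v2 hosts.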

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CgroupObserver {

    public static void main(String[] args) {
        // 1. Check CPU Count (Should respect Quota)
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("JVM sees CPU count: " + processors);

        // 2. Manually Check Memory Limit (Linux Cgroup v2)
        try {
            // Read memory.max from cgroup filesystem
            // Note: In a real app, path might vary based on mount point
            String memMax = new String(Files.readAllBytes(Paths.get("/sys/fs/cgroup/memory.max"))).trim();

            if (memMax.equals("max")) {
                System.out.println("Memory Limit: Unbounded");
            } else {
                long bytes = Long.parseLong(memMax);
                System.out.printf("Memory Limit: %d MiB%n", bytes / 1024 / 1024);
            }
        } catch (IOException e) {
            System.out.println("Could not read cgroup limits (Are we in a container?)");
        }
    }
}
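Runtimes that are not container aware can derive a sensible worker count themselves. The Go sketch below reads cpu.max and rounds quota/period up to a whole number of CPUs, capped at the host count. The rounding rule illustrates the idea and is not necessarily the JVM's exact algorithm; the path assumes cgroup v2 as seen from inside the container, and effectiveCPUs is a name introduced here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// effectiveCPUs parses the contents of cpu.max ("<quota> <period>" or
// "max <period>") and returns ceil(quota/period), capped at hostCPUs.
// An unbounded quota ("max") yields hostCPUs.
func effectiveCPUs(cpuMax string, hostCPUs int) (int, error) {
	fields := strings.Fields(cpuMax)
	if len(fields) != 2 {
		return 0, errors.New(`expected "<quota> <period>"`)
	}
	if fields[0] == "max" {
		return hostCPUs, nil
	}
	quota, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil {
		return 0, err
	}
	period, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return 0, err
	}
	if quota <= 0 || period <= 0 {
		return 0, errors.New("quota and period must be positive")
	}
	n := int((quota + period - 1) / period) // integer ceiling
	if n > hostCPUs {
		n = hostCPUs
	}
	return n, nil
}

func main() {
	// Assumption: cgroup v2, read from inside the container.
	data, err := os.ReadFile("/sys/fs/cgroup/cpu.max")
	if err != nil {
		fmt.Println("could not read cpu.max:", err)
		return
	}
	n, err := effectiveCPUs(string(data), runtime.NumCPU())
	if err != nil {
		fmt.Println("could not parse cpu.max:", err)
		return
	}
	fmt.Println("suggested worker count:", n)
}
```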

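[!WARNING] The OOM Killer: If a cgroup reaches its memory limit and the kernel cannot reclaim enough memory from it, the kernel invokes the Out Of Memory (OOM) killer for that cgroup. It kills the process in the cgroup with the highest OOM score (usually the largest memory consumer, such as your database or app). This is not a graceful shutdown (SIGTERM); the process receives SIGKILL and gets no chance to clean up.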
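Because the victim gets no chance to log anything, the kill is easiest to confirm from the cgroup itself. In cgroup v2, the memory.events file includes an oom_kill counter. The sketch below extracts it; the file path is an assumption about where the container's cgroup appears, and oomKills is a helper name introduced here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// oomKills extracts the oom_kill counter from the contents of memory.events.
// The boolean is false if the counter is missing or unparsable.
func oomKills(memoryEvents string) (uint64, bool) {
	for _, line := range strings.Split(memoryEvents, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 2 && fields[0] == "oom_kill" {
			n, err := strconv.ParseUint(fields[1], 10, 64)
			return n, err == nil
		}
	}
	return 0, false
}

func main() {
	// Assumption: cgroup v2, read from inside the container.
	data, err := os.ReadFile("/sys/fs/cgroup/memory.events")
	if err != nil {
		fmt.Println("could not read memory.events:", err)
		return
	}
	if n, ok := oomKills(string(data)); ok {
		fmt.Println("processes killed by the OOM killer in this cgroup:", n)
	} else {
		fmt.Println("no oom_kill counter found")
	}
}
```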

5. First Principles: Why Cgroups?

Without Cgroups, a single runaway process (“fork bomb” or memory leak) could crash the entire physical server. Namespaces provide privacy, but Cgroups provide protection.

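In Kubernetes, Cgroups are the enforcement mechanism behind resources.limits: a CPU limit becomes a CFS quota, and a memory limit becomes the cgroup's memory ceiling. resources.requests mainly drive scheduling decisions; a CPU request is additionally translated into a relative CPU weight for the container's cgroup.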
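To make the mapping concrete, the sketch below converts a CPU limit in millicores and a memory limit in MiB into the strings a runtime would write to cpu.max and memory.max. It shows only the arithmetic, assuming the default 100ms period. The function names are invented for this example, and a real kubelet or runtime applies further rules (such as a minimum quota) that are not modelled here.

```go
package main

import (
	"fmt"
	"strconv"
)

// cpuMaxFromMilli converts a CPU limit in millicores (1000 = one CPU)
// into a cpu.max value for the given period. A non-positive limit means unbounded.
func cpuMaxFromMilli(milliCPU, periodUs int64) string {
	if milliCPU <= 0 {
		return fmt.Sprintf("max %d", periodUs)
	}
	quotaUs := milliCPU * periodUs / 1000
	return fmt.Sprintf("%d %d", quotaUs, periodUs)
}

// memoryMaxFromMi converts a memory limit in MiB into a memory.max value in bytes.
func memoryMaxFromMi(mebibytes int64) string {
	return strconv.FormatInt(mebibytes*1024*1024, 10)
}

func main() {
	const periodUs = 100_000

	// Equivalent to limits: { cpu: "500m", memory: "256Mi" }
	fmt.Println("cpu.max    =", cpuMaxFromMilli(500, periodUs))
	fmt.Println("memory.max =", memoryMaxFromMi(256))
}
```

A limit of 500m yields the same "50000 100000" value used in the Go controller example earlier in this module.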