Linux Namespaces
[!NOTE] This module explores the core principles of Linux namespaces from first principles: what the kernel tracks for each process, and how that bookkeeping produces container isolation.
1. The Illusion of Isolation
To the Linux Kernel, a container is not a real object. It is an illusion created by two kernel features: Namespaces (what you see) and Cgroups (what you use).
Namespaces partition kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources.
The Kernel View: nsproxy
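Every process in Linux is represented by a task_struct. This struct contains a pointer to an nsproxy, which in turn holds one pointer per namespace type for most of the namespaces the process belongs to (the user namespace is reached through the process's credentials instead). Processes that share every namespace can share a single nsproxy.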
graph LR
T1["task_struct (PID 123)"] --> N1[nsproxy]
T2["task_struct (PID 456)"] --> N1
T3["task_struct (PID 789)"] --> N2["nsproxy (Container)"]
N1 --> M1["mnt_ns (Host)"]
N1 --> P1["pid_ns (Host)"]
N1 --> U1["uts_ns (Host)"]
N2 --> M2["mnt_ns (Container)"]
N2 --> P2["pid_ns (Container)"]
N2 --> U2["uts_ns (Container)"]
style T1 fill:var(--bg-card),stroke:var(--accent-main)
style T2 fill:var(--bg-card),stroke:var(--accent-main)
style T3 fill:var(--bg-card),stroke:var(--green-500)
style N1 fill:var(--bg-soft),stroke:var(--text-muted)
style N2 fill:var(--bg-soft),stroke:var(--text-muted)
When you run docker run, Docker (through its runtime, runc) asks the kernel to create a process whose nsproxy points at a fresh set of namespaces instead of the host's.
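The per-process namespace pointers described above are visible from user space as symbolic links under /proc/<pid>/ns. The following is a minimal sketch, assuming a Linux host with /proc mounted; the helper name namespaceIDs is made up for this example. Two processes are in the same namespace of a given type when their links resolve to the same identifier.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// namespaceIDs returns a map from namespace type (e.g. "uts") to the
// identifier the kernel reports for it (e.g. "uts:[4026531838]").
// pid may be a numeric PID or "self".
func namespaceIDs(pid string) (map[string]string, error) {
	dir := filepath.Join("/proc", pid, "ns")
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	ids := make(map[string]string, len(entries))
	for _, e := range entries {
		// Each entry is a symlink whose target names the namespace.
		target, err := os.Readlink(filepath.Join(dir, e.Name()))
		if err != nil {
			return nil, err
		}
		ids[e.Name()] = target
	}
	return ids, nil
}

func main() {
	ids, err := namespaceIDs("self")
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading namespaces:", err)
		return
	}
	names := make([]string, 0, len(ids))
	for name := range ids {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%-18s %s\n", name, ids[name])
	}
}
```

Running this on the host and again inside a container should show different identifiers for every namespace type the container does not share with the host.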
2. The 7 Namespaces
| Namespace | Constant | Isolates |
|---|---|---|
| PID | CLONE_NEWPID | Process IDs (PID 1 inside container) |
| NET | CLONE_NEWNET | Network devices, stacks, ports (own localhost) |
| MNT | CLONE_NEWNS | Mount points (own / filesystem) |
| UTS | CLONE_NEWUTS | Hostname and NIS domain name |
| IPC | CLONE_NEWIPC | System V IPC, POSIX message queues |
| USER | CLONE_NEWUSER | User and Group IDs (Root inside, Nobody outside) |
| CGROUP | CLONE_NEWCGROUP | Cgroup root directory view |
4. Code Examples
1. Go Implementation (System Programming)
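In Go, we use syscall.SysProcAttr to request new namespaces when cloning a process. This is the same kernel mechanism that runc relies on, although a real runtime does considerably more setup (root filesystem, cgroups, capabilities). The program must run as root (or with CAP_SYS_ADMIN).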
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	// Re-run this binary with the "child" argument to act as the container process.
	if len(os.Args) > 1 && os.Args[1] == "child" {
		if err := runChild(); err != nil {
			fmt.Fprintf(os.Stderr, "child: %v\n", err)
			os.Exit(1)
		}
		return
	}

	cmd := exec.Command("/proc/self/exe", "child")
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	// Request new namespaces for the child:
	//   CLONE_NEWUTS: new UTS (hostname) namespace
	//   CLONE_NEWPID: new PID namespace
	//   CLONE_NEWNS:  new mount namespace
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}

	if err := cmd.Run(); err != nil {
		fmt.Fprintf(os.Stderr, "error running child: %v\n", err)
		os.Exit(1)
	}
}

func runChild() error {
	// Set a new hostname, visible only inside this UTS namespace.
	if err := syscall.Sethostname([]byte("container-demo")); err != nil {
		return fmt.Errorf("sethostname: %w", err)
	}

	// The first process in a new PID namespace sees itself as PID 1.
	fmt.Printf("Running inside container as PID %d\n", os.Getpid())

	// Note: /proc is still the host's, so tools like ps will list host
	// processes until a fresh procfs is mounted in this mount namespace.
	cmd := exec.Command("/bin/bash")
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}
2. Java Implementation (JVM Wrapper)
Java runs on the JVM, which in turn runs on the OS, and the Java standard library exposes no API for clone(2) or its namespace flags. To create a namespace from plain Java, we must invoke an external tool like unshare via ProcessBuilder.
This demonstrates the “Systems Programming Gap” in Java compared to Go or C.
import java.util.ArrayList;
import java.util.List;

public class NamespaceDemo {
    public static void main(String[] args) throws Exception {
        // We cannot call clone() directly from the standard library.
        // Instead, we wrap our command with 'unshare':
        //   sudo unshare --uts --pid --fork --mount-proc bash
        List<String> command = new ArrayList<>();
        command.add("sudo");         // Creating these namespaces requires CAP_SYS_ADMIN
        command.add("unshare");
        command.add("--uts");        // -u: new UTS namespace
        command.add("--pid");        // -p: new PID namespace
        command.add("--fork");       // -f: fork so the child becomes PID 1
        command.add("--mount-proc"); // Mount a fresh /proc (implies a new mount namespace)
        command.add("bash");         // The command to run inside

        ProcessBuilder pb = new ProcessBuilder(command);
        pb.inheritIO(); // Connect the child to our terminal

        System.out.println("Starting shell in new Namespace...");
        Process p = pb.start();
        int exitCode = p.waitFor();
        System.out.println("Container exited with code: " + exitCode);
    }
}
[!NOTE] Why Go? One commonly cited reason Docker was written in Go is that Go gives easy access to low-level Linux syscalls (the syscall package) while maintaining high-level productivity. Java requires JNI (Java Native Interface) or external commands to achieve the same isolation.
5. First Principles: Why do we need nsproxy?
Why couldn’t Linux just add a “Container ID” to the process struct?
- Flexibility: Some processes need to share Network but hide PIDs (e.g., Kubernetes Pods). By having separate pointers for each namespace type in nsproxy, the kernel allows mixing and matching (the “sidecar” pattern).
- Legacy Compatibility: Applications don’t need to be rewritten. A web server just binds to port 80. It doesn’t know (or care) that “port 80” is virtualized inside a NET namespace.
This design decision enabled the entire container ecosystem to support existing software without modification.