The Universal Adapter
[!NOTE] This module explores the core principles of CSI Drivers, deriving solutions from first principles and hardware constraints to build world-class expertise.
The Problem: Imagine a master carpenter whose toolbelt has an unchangeable set of tools physically fused to it. If a new type of screw hits the market, the carpenter has to buy an entirely new, custom-made toolbelt.
In the early days, Kubernetes operated exactly like this. The storage code for AWS EBS, GCE PD, and others was hardcoded directly into its core binary ("in-tree"). Every time a storage vendor wanted to release a bug fix or add a new feature, they had to wait for a full Kubernetes release cycle. This bloated the core binary and created a massive maintenance nightmare.
The Solution: The Container Storage Interface (CSI) is the ultimate universal adapter. It is an industry standard (also used by Docker Swarm and Mesos) that allows any storage vendor to write a plugin that Kubernetes can talk to over a standardized gRPC API. This moves storage code out of the Kubernetes repository ("out-of-tree").
1. Architecture: The Anatomy of a CSI Driver
A production CSI Driver is not just a single binary. It is architected as two distinct workloads running in your cluster, separating cluster-wide orchestration from node-local execution.
The Controller Plugin (The Brain)
Usually deployed as a Deployment (or StatefulSet) with multiple replicas for high availability. It can run on any schedulable node; it does not have to live on the control plane nodes.
- Responsibility: Communicates with the external Cloud Provider’s Control Plane API.
- Actions: It provisions the actual hardware or software disk. When you create a PersistentVolumeClaim (PVC), the Controller Plugin calls the AWS API (for example) to create an actual EBS volume. It is also responsible for attaching that volume to the specific EC2 instance (Node) where the Pod was scheduled.
- Scale: You only need one active Controller Plugin per cluster.
The Node Plugin (The Brawn)
Deployed as a DaemonSet, meaning a copy runs on every single Node in your cluster.
- Responsibility: Interacts with the underlying host OS (Linux/Windows) on the Node.
- Actions: Once the Controller has attached the raw block device to the instance (e.g., /dev/nvme1n1), the Node Plugin formats the disk with a filesystem (ext4, xfs) if it does not already have one, and bind-mounts it into the kubelet directory structure so the container can access it.
- Scale: Runs everywhere. If a node goes down, its Node Plugin goes down with it.
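The Node Plugin's work on the host can be sketched as a short plan of commands. The following is a minimal Python illustration, not the code of any real driver: the function name plan_node_mount and the /mnt/... paths are hypothetical, and the commands are only built as argument lists, never executed.

```python
def plan_node_mount(device, host_mount, pod_path, existing_fs=None, fs_type="ext4"):
    """Return the commands (as argv lists) a node plugin might run. Nothing is executed."""
    commands = []
    if existing_fs is None:
        # Blank disk: create a filesystem first.
        # This step is skipped when a filesystem exists, so data is never wiped.
        commands.append([f"mkfs.{fs_type}", device])
    # Mount the block device once on the host.
    commands.append(["mount", "-t", existing_fs or fs_type, device, host_mount])
    # Expose that mount inside the Pod's directory with a bind mount.
    commands.append(["mount", "--bind", host_mount, pod_path])
    return commands


# Hypothetical paths, chosen only for illustration.
fresh = plan_node_mount("/dev/nvme1n1", "/mnt/host/vol-1", "/mnt/pods/pod-a/vol-1")
reused = plan_node_mount(
    "/dev/nvme1n1", "/mnt/host/vol-1", "/mnt/pods/pod-a/vol-1", existing_fs="xfs"
)

for command in fresh:
    print(" ".join(command))
```

The existing_fs parameter stands in for whatever filesystem probe a real driver performs before deciding whether to format.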
2. Interactive: CSI Workflow Visualizer
Follow the journey of a Pod request triggering a CSI Driver operation.
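The same journey can be traced in code. The following is a minimal Python sketch that only records which component calls which CSI operation, and in what order. The WorkflowTrace class and the function name are hypothetical, and the sequence is simplified: it assumes a brand-new volume and ignores scheduling details.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowTrace:
    """Records (caller, plugin, rpc) tuples in the order they occur."""
    steps: list = field(default_factory=list)

    def record(self, caller: str, plugin: str, rpc: str) -> None:
        self.steps.append((caller, plugin, rpc))


def run_pod_with_new_volume(trace: WorkflowTrace) -> None:
    # 1. A PVC exists: the provisioner sidecar asks the Controller Plugin for a disk.
    trace.record("external-provisioner", "controller", "CreateVolume")
    # 2. The Pod lands on a node: the attacher sidecar asks for the disk to be attached there.
    trace.record("external-attacher", "controller", "ControllerPublishVolume")
    # 3. The kubelet asks the Node Plugin to mount the device on the host.
    trace.record("kubelet", "node", "NodeStageVolume")
    # 4. The kubelet asks for a bind mount into the Pod's volume directory.
    trace.record("kubelet", "node", "NodePublishVolume")


trace = WorkflowTrace()
run_pod_with_new_volume(trace)
for number, (caller, plugin, rpc) in enumerate(trace.steps, start=1):
    print(f"{number}. {caller} -> {plugin} plugin: {rpc}")
```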
3. The gRPC Protocol Deep Dive
CSI is essentially a collection of gRPC endpoints that the storage vendor must implement. Kubernetes components (the kubelet on each node, plus sidecars such as external-provisioner and external-attacher) act as the gRPC clients, calling the CSI driver over a Unix Domain Socket.
The specification defines three main services:
- Identity Service: Allows K8s to identify the driver name and version.
- Controller Service: Functions like CreateVolume, DeleteVolume, ControllerPublishVolume (Attach), and ControllerUnpublishVolume (Detach).
- Node Service: Handled by the Node Plugin DaemonSet.
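To make the three services concrete, here is a minimal Python sketch that models them as abstract interfaces, plus a toy in-memory controller. The method names follow the CSI RPC names, but everything else is an assumption for illustration: real drivers exchange protobuf messages over gRPC rather than plain arguments, and ToyControllerPlugin and its driver name are hypothetical.

```python
from abc import ABC, abstractmethod


class IdentityService(ABC):
    @abstractmethod
    def GetPluginInfo(self) -> dict: ...


class ControllerService(ABC):
    @abstractmethod
    def CreateVolume(self, name: str, capacity_bytes: int) -> str: ...
    @abstractmethod
    def DeleteVolume(self, volume_id: str) -> None: ...
    @abstractmethod
    def ControllerPublishVolume(self, volume_id: str, node_id: str) -> None: ...
    @abstractmethod
    def ControllerUnpublishVolume(self, volume_id: str, node_id: str) -> None: ...


class NodeService(ABC):
    @abstractmethod
    def NodeStageVolume(self, volume_id: str, staging_path: str) -> None: ...
    @abstractmethod
    def NodePublishVolume(self, volume_id: str, staging_path: str, target_path: str) -> None: ...


class ToyControllerPlugin(IdentityService, ControllerService):
    """Hypothetical in-memory stand-in for a vendor's controller binary."""

    def __init__(self) -> None:
        self.volumes = {}         # volume name -> volume id
        self.attachments = set()  # (volume id, node id) pairs
        self._next_id = 1

    def GetPluginInfo(self) -> dict:
        return {"name": "toy.csi.example.com", "vendor_version": "0.0.1"}

    def CreateVolume(self, name: str, capacity_bytes: int) -> str:
        # Idempotent: repeating the call with the same name returns the same volume.
        if name not in self.volumes:
            self.volumes[name] = f"vol-{self._next_id}"
            self._next_id += 1
        return self.volumes[name]

    def DeleteVolume(self, volume_id: str) -> None:
        self.volumes = {n: v for n, v in self.volumes.items() if v != volume_id}

    def ControllerPublishVolume(self, volume_id: str, node_id: str) -> None:
        self.attachments.add((volume_id, node_id))

    def ControllerUnpublishVolume(self, volume_id: str, node_id: str) -> None:
        self.attachments.discard((volume_id, node_id))


plugin = ToyControllerPlugin()
vol = plugin.CreateVolume("pvc-demo", 10 * 1024**3)
plugin.ControllerPublishVolume(vol, "node-a")
print(plugin.GetPluginInfo()["name"], vol, sorted(plugin.attachments))
```

Idempotency matters here because the client side retries: a repeated CreateVolume for the same name should not produce a second disk.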
The Lifecycle of a Mount Request on the Node
When the kubelet needs to present the disk to a container, it triggers the Node Service in two distinct phases:
Phase 1: NodeStageVolume
Imagine you are building an apartment building. NodeStageVolume is the process of bringing the main water pipe from the city street into the basement of the building.
- What it does: It takes the raw attached block device (e.g., /dev/xvdf) and mounts it to a global directory on the host machine.
- Why a global directory? If you have multiple Pods on the same Node that need access to the same ReadWriteMany (RWX) volume (like NFS), you only want to mount the network share to the Node once.
Phase 2: NodePublishVolume
Now that the water is in the basement, NodePublishVolume runs pipes up to the individual apartments (the Pods).
- What it does: It performs a Linux bind mount from the global staging directory to the Pod-specific volume directory managed by the kubelet (/var/lib/kubelet/pods/<pod-uuid>/volumes/...).
- Why split it? This two-step process allows for efficient sharing of volumes across multiple Pods on the same physical server. When the Pod dies, NodeUnpublishVolume simply breaks the bind mount. NodeUnstageVolume only runs when the last Pod using that volume is deleted from the Node.
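The two-phase lifecycle boils down to reference counting per volume. The following is a minimal Python sketch of that bookkeeping for a single node. NodeMountTracker is a hypothetical name, the calls are only logged rather than performed, and the real decision of when to stage or unstage is made by the kubelet, which this toy class merely imitates.

```python
class NodeMountTracker:
    """Toy model of one node's stage/publish bookkeeping."""

    def __init__(self) -> None:
        self.staged = set()   # volume ids mounted at the global staging directory
        self.published = {}   # volume id -> set of pod ids holding a bind mount
        self.calls = []       # ordered log of simulated Node Service calls

    def pod_started(self, volume_id: str, pod_id: str) -> None:
        if volume_id not in self.staged:
            # First Pod on this node using the volume: bring the pipe into the basement.
            self.calls.append(("NodeStageVolume", volume_id))
            self.staged.add(volume_id)
        # Every Pod gets its own bind mount from the staging directory.
        self.calls.append(("NodePublishVolume", volume_id, pod_id))
        self.published.setdefault(volume_id, set()).add(pod_id)

    def pod_deleted(self, volume_id: str, pod_id: str) -> None:
        pods = self.published.get(volume_id, set())
        if pod_id not in pods:
            return  # nothing was published for this Pod
        self.calls.append(("NodeUnpublishVolume", volume_id, pod_id))
        pods.discard(pod_id)
        if not pods:
            # Last consumer is gone: the global mount can be torn down.
            self.calls.append(("NodeUnstageVolume", volume_id))
            self.staged.discard(volume_id)
            del self.published[volume_id]


node = NodeMountTracker()
node.pod_started("vol-1", "pod-a")
node.pod_started("vol-1", "pod-b")
node.pod_deleted("vol-1", "pod-a")
node.pod_deleted("vol-1", "pod-b")
for call in node.calls:
    print(call)
```

With two Pods sharing one volume, the log shows a single NodeStageVolume at the start and a single NodeUnstageVolume at the end, with the per-Pod publish and unpublish calls in between.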
4. Installing a CSI Driver (Real-World Example)
Kubernetes provides standard “Sidecar Containers” (like external-provisioner and external-attacher). When you install a CSI driver, the vendor bundles their custom binary alongside these standard K8s sidecars within the same Pod.
For example, installing the AWS EBS CSI Driver via Helm:
helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver \
--namespace kube-system
If we inspect the driver's Pods, we see multiple containers running together in the Controller Pod (6/6 READY):
kubectl get pods -n kube-system -l app.kubernetes.io/name=aws-ebs-csi-driver
# NAME                                  READY   STATUS    RESTARTS
# ebs-csi-controller-56f77c8756-abcde   6/6     Running   0          (Controller)
# ebs-csi-node-xyzwq                    3/3     Running   0          (Node Plugin)
The 6 containers in the Controller typically are:
- ebs-plugin: The actual vendor code that knows AWS APIs.
- csi-provisioner: (K8s Sidecar) Watches for PVCs and calls CreateVolume on the ebs-plugin.
- csi-attacher: (K8s Sidecar) Watches for VolumeAttachments and calls ControllerPublishVolume.
- csi-resizer: (K8s Sidecar) Handles volume expansion.
- csi-snapshotter: (K8s Sidecar) Handles CSI snapshots.
- livenessprobe: Monitors the gRPC socket to ensure the driver is alive.
This sidecar architecture is brilliant because the Kubernetes team maintains the generic logic (watching the K8s API, retries, backoffs), while the vendor only has to maintain the code that talks to their specific storage hardware.
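The division of labor can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how the real sidecars are written: call_with_backoff and provision_pending_claims stand in for the generic sidecar logic, while FlakyVendorDriver and TransientStorageError are hypothetical stand-ins for vendor code that fails twice before succeeding.

```python
import time


class TransientStorageError(Exception):
    """Hypothetical error a vendor driver raises for retryable failures."""


def call_with_backoff(rpc, *args, attempts=5, base_delay=0.01, sleep=time.sleep):
    """Generic sidecar-side logic: retry an RPC with exponential backoff."""
    for attempt in range(attempts):
        try:
            return rpc(*args)
        except TransientStorageError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt))


class FlakyVendorDriver:
    """Hypothetical vendor code: the backend is busy for the first two calls."""

    def __init__(self) -> None:
        self.calls = 0

    def CreateVolume(self, name: str) -> str:
        self.calls += 1
        if self.calls < 3:
            raise TransientStorageError("backend busy")
        return f"vol-for-{name}"


def provision_pending_claims(pending_claims, driver):
    """Stand-in for the provisioner's loop over unbound PVCs."""
    return {
        claim: call_with_backoff(driver.CreateVolume, claim)
        for claim in pending_claims
    }


driver = FlakyVendorDriver()
bound = provision_pending_claims(["pvc-demo"], driver)
print(bound, "after", driver.calls, "calls")
```

The vendor class contains no retry logic at all; swapping in a different driver would leave call_with_backoff and the loop untouched, which is the point of the sidecar split.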