Volume Management: Taming the Beast
The War Story: Imagine a junior engineer tearing down a misbehaving database container with docker rm -f production_db. Because the database was writing to the container's writable layer instead of an external volume, five years of customer records vanish instantly. Volumes are the external hard drives of the container world: they are persistent, meaning they survive container deletion. But with great persistence comes great responsibility: if left unmanaged, orphaned volumes holding gigabytes of stale data will gradually fill your disk.
1. The Anatomy of a Volume
Under the hood, Docker manages volumes natively. On Linux, they are typically stored under /var/lib/docker/volumes/. Because Docker manages this location, volumes are easier to back up, migrate, and secure than arbitrary bind mounts.
The “-v” vs “--mount” Debate
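Historically, -v (or --volume) was the standard. It consists of up to three fields separated by colons (source, container path, and optional comma-separated options): -v my-vol:/app/data:ro.
However, --mount spells each field out as a key=value pair. It is more verbose but more explicit, which makes it the preferred choice for production services: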
--mount source=my-vol,target=/app/data,readonly
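To make the relationship between the two syntaxes concrete, here is a minimal Go sketch that translates a simple -v spec into the matching --mount value. The function name volumeFlagToMount is hypothetical, and the sketch assumes a named volume with at most the ro option; it does not attempt to handle bind mounts, Windows paths, or other options.

```go
package main

import (
	"fmt"
	"strings"
)

// volumeFlagToMount converts a simple "-v" spec such as "my-vol:/app/data:ro"
// into an equivalent "--mount" value. Simplified illustration only: it handles
// named volumes and the "ro" option, and ignores every other option.
func volumeFlagToMount(spec string) (string, error) {
	parts := strings.Split(spec, ":")
	if len(parts) < 2 || len(parts) > 3 {
		return "", fmt.Errorf("expected source:target[:options], got %q", spec)
	}
	fields := []string{"type=volume", "source=" + parts[0], "target=" + parts[1]}
	if len(parts) == 3 {
		for _, opt := range strings.Split(parts[2], ",") {
			if opt == "ro" {
				fields = append(fields, "readonly")
			}
		}
	}
	return strings.Join(fields, ","), nil
}

func main() {
	mount, err := volumeFlagToMount("my-vol:/app/data:ro")
	if err != nil {
		panic(err)
	}
	fmt.Println("--mount", mount)
}
```

The explicit type=volume field is optional (it is the default), but writing it out shows how --mount names every part that -v encodes by position.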
Backups & Migrations
Because a volume is an isolated directory on the host, backing it up involves spinning up a temporary container that mounts both the volume and a local backup directory:
# Backing up a volume to a tarball
docker run --rm --mount source=my-db-data,target=/data -v "$(pwd)":/backup ubuntu tar cvf /backup/db_backup.tar /data
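A backup is only useful if it can be restored. The sketch below builds the arguments for the reverse operation: a temporary container that mounts the same volume and unpacks the tarball into it. The helper name restoreArgs and the /tmp/backups directory are assumptions made for illustration; the volume name, image, and archive name follow the backup command above.

```go
package main

import (
	"fmt"
	"strings"
)

// restoreArgs builds the arguments for a "docker run" command that unpacks
// archive (a tarball located in hostDir) back into the named volume.
func restoreArgs(volume, hostDir, archive string) []string {
	return []string{
		"run", "--rm",
		"--mount", "source=" + volume + ",target=/data",
		"-v", hostDir + ":/backup",
		"ubuntu",
		// tar stored the files as "data/...", so extracting relative to "/"
		// puts them back under /data, which is the mounted volume.
		"tar", "xvf", "/backup/" + archive, "-C", "/",
	}
}

func main() {
	args := restoreArgs("my-db-data", "/tmp/backups", "db_backup.tar")
	fmt.Println("docker " + strings.Join(args, " "))
	// To actually run it, pass args to exec.Command("docker", args...).
}
```

Printing the command before running it is a deliberate choice here: a restore overwrites files in the volume, so it is worth reviewing first.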
2. Advanced: Volume Plugins (NFS & Cloud)
Volumes aren’t restricted to local disk. Docker volume plugins allow containers to directly mount external storage.
- NFS (Network File System): Share data across multiple Docker hosts.
- Cloud Providers: Plugins like rexray/ebs for AWS Elastic Block Store, allowing persistent storage across EC2 instances.
Example of creating an NFS-backed volume (this uses the NFS support in the built-in local driver, so no third-party plugin is required):
docker volume create --driver local \
--opt type=nfs \
--opt o=addr=192.168.1.1,rw \
--opt device=:/path/to/dir \
my-nfs-volume
3. The Lifecycle of a Volume
- Creation: Explicit (docker volume create) or implicit (via a VOLUME instruction in a Dockerfile).
- Attachment: Mounted into a container when the container is created.
- Detachment: The container is removed. The volume remains. (A stopped container still counts as using its volumes.)
- Pruning: Deleting volumes that are no longer referenced by any container.
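The lifecycle above can be sketched as a toy model in Go. This is not how Docker is implemented; the store and volume types are hypothetical and exist only to show that removing a container leaves its volume behind until a prune runs.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// volume is a toy model of a Docker volume: a name plus the set of
// containers that currently reference it.
type volume struct {
	name       string
	containers map[string]bool
}

// store is a toy stand-in for the daemon's volume bookkeeping.
type store struct {
	volumes map[string]*volume
}

func newStore() *store { return &store{volumes: map[string]*volume{}} }

// create registers a volume with no containers attached.
func (s *store) create(name string) {
	if _, ok := s.volumes[name]; !ok {
		s.volumes[name] = &volume{name: name, containers: map[string]bool{}}
	}
}

// attach records that a container mounts the volume.
func (s *store) attach(name, container string) error {
	v, ok := s.volumes[name]
	if !ok {
		return errors.New("no such volume: " + name)
	}
	v.containers[container] = true
	return nil
}

// removeContainer detaches the container from every volume; the volumes remain.
func (s *store) removeContainer(container string) {
	for _, v := range s.volumes {
		delete(v.containers, container)
	}
}

// prune deletes every unreferenced volume and returns the deleted names, sorted.
func (s *store) prune() []string {
	var deleted []string
	for name, v := range s.volumes {
		if len(v.containers) == 0 {
			deleted = append(deleted, name)
			delete(s.volumes, name)
		}
	}
	sort.Strings(deleted)
	return deleted
}

func main() {
	s := newStore()
	s.create("my-db")
	if err := s.attach("my-db", "production_db"); err != nil {
		panic(err)
	}
	fmt.Println("pruned while attached:", s.prune())
	s.removeContainer("production_db")
	fmt.Println("volumes after container removal:", len(s.volumes))
	fmt.Println("pruned after removal:", s.prune())
}
```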
Named vs Anonymous Volumes
- Named: docker run -v my-db:/data .... Easy to find, back up, and manage.
- Anonymous: docker run -v /data .... Docker generates a random hash name (e.g., 2d3f4a...). These are the primary cause of “disk leak” because they are easily forgotten.
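When auditing a host, it can help to flag volumes whose names look auto-generated. The sketch below assumes that anonymous volume names are 64-character lowercase hex strings, which matches the names Docker generates; the helper name looksAnonymous is hypothetical. Treat the result as a heuristic, since nothing stops a user from giving a named volume a name of the same shape.

```go
package main

import (
	"fmt"
	"regexp"
)

// anonymousName matches 64-character lowercase hex names, the shape Docker
// uses for auto-generated volume names. Heuristic only: a named volume could
// in principle use the same shape.
var anonymousName = regexp.MustCompile(`^[0-9a-f]{64}$`)

// looksAnonymous reports whether a volume name looks auto-generated.
func looksAnonymous(name string) bool {
	return anonymousName.MatchString(name)
}

func main() {
	for _, name := range []string{
		"my-db",
		"0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef",
	} {
		fmt.Printf("%s anonymous=%t\n", name, looksAnonymous(name))
	}
}
```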
4. Code Example: Cleaning Up
How to programmatically find and remove unused volumes.
Go
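package main

import (
	"context"
	"fmt"

	"github.com/docker/docker/api/types/filters"
	"github.com/docker/docker/client"
)

func main() {
	ctx := context.Background()

	// Build a client from the environment (DOCKER_HOST, etc.) and negotiate
	// the API version with the daemon instead of ignoring setup errors.
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	// Prune volumes.
	// Equivalent to: docker volume prune -f
	// Note: on Engine 23.0+ (API 1.42+) only anonymous volumes are pruned by
	// default; use filters.NewArgs(filters.Arg("all", "true")) to include named ones.
	report, err := cli.VolumesPrune(ctx, filters.NewArgs())
	if err != nil {
		panic(err)
	}

	fmt.Println("Deleted Volumes:")
	for _, v := range report.VolumesDeleted {
		fmt.Println(v)
	}
	fmt.Printf("Space Reclaimed: %d bytes\n", report.SpaceReclaimed)
}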
Java
import com.github.dockerjava.api.DockerClient;
import com.github.dockerjava.api.model.PruneResponse;
import com.github.dockerjava.api.model.PruneType;
import com.github.dockerjava.core.DockerClientBuilder;

public class VolumeCleaner {
    public static void main(String[] args) {
        DockerClient dockerClient = DockerClientBuilder.getInstance().build();

        // Prune unused volumes
        PruneResponse response = dockerClient.pruneCmd(PruneType.VOLUMES).exec();
        System.out.println("Space Reclaimed: " + response.getSpaceReclaimed() + " bytes");
    }
}
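Pruning is destructive, so it can be worth listing the candidates first. The following Go sketch shells out to the docker CLI (docker volume ls -q --filter dangling=true) and prints each unreferenced volume without deleting anything. The helper names parseVolumeNames and danglingVolumes are hypothetical, and the sketch assumes the docker binary is on the PATH and can reach a running daemon.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// parseVolumeNames splits the output of "docker volume ls -q" into names,
// one per non-empty line.
func parseVolumeNames(output string) []string {
	var names []string
	for _, line := range strings.Split(output, "\n") {
		if name := strings.TrimSpace(line); name != "" {
			names = append(names, name)
		}
	}
	return names
}

// danglingVolumes asks the docker CLI for volumes that no container references.
func danglingVolumes() ([]string, error) {
	out, err := exec.Command("docker", "volume", "ls", "-q", "--filter", "dangling=true").Output()
	if err != nil {
		return nil, err
	}
	return parseVolumeNames(string(out)), nil
}

func main() {
	names, err := danglingVolumes()
	if err != nil {
		fmt.Println("could not list volumes (is docker installed and running?):", err)
		return
	}
	for _, name := range names {
		fmt.Println("prune candidate:", name)
	}
}
```

Keeping the parsing separate from the CLI call makes the logic easy to check without a Docker daemon.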