Dockerfile Instructions

[!NOTE] This module covers the core Dockerfile instructions, the trade-offs between them, and the practices that make an image production-ready.

1. The Blueprint of a Container

A Dockerfile is a text document containing all the commands a user could call on the command line to assemble an image. Think of it as a recipe where order matters.

Core Instructions

| Instruction | Purpose | Example |
|---|---|---|
| FROM | Initializes a new build stage and sets the Base Image. | FROM ubuntu:22.04 |
| WORKDIR | Sets the working directory for subsequent instructions. Like cd. | WORKDIR /app |
| COPY | Copies files from your host to the image filesystem. | COPY . . |
| RUN | Executes commands in a new layer (e.g., install packages). | RUN npm install |
| CMD | Provides defaults for an executing container. | CMD ["node", "app.js"] |
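
Put together, the five instructions might look like the following minimal sketch. The base image tag node:20-slim and the files app.js and package.json are assumptions made for illustration; they are not prescribed by this module.

```dockerfile
# Base image: every later instruction builds on top of this filesystem
# (node:20-slim is an assumed choice so that npm is available)
FROM node:20-slim

# All following relative paths resolve against /app
WORKDIR /app

# Copy the build context (the directory passed to `docker build`) into /app
COPY . .

# Runs at build time; the result is stored as a new image layer
RUN npm install

# Runs at container start, unless overridden on the `docker run` command line
CMD ["node", "app.js"]
```

Reordering these lines changes the result: RUN npm install placed before COPY would find no package.json to install from.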

2. The Great Debates

1. CMD vs ENTRYPOINT

This is the #1 source of confusion. Both define what runs when the container starts.

  • CMD: “The Default Command”. Can be easily overridden by arguments passed to docker run.
  • ENTRYPOINT: “The Main Executable”. Arguments passed to docker run are appended to it instead of replacing it. Overriding it requires the explicit --entrypoint flag.

Best Practice: Use ENTRYPOINT for the binary/script and CMD for default arguments.
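
As a small sketch of that split, the image below fixes the executable and leaves only the argument overridable. The image tag greeter used in the comments is hypothetical, and alpine:3.19 is an assumed base that happens to ship /bin/echo.

```dockerfile
FROM alpine:3.19

# Fixed executable: runs on every container start
ENTRYPOINT ["/bin/echo"]

# Default argument: replaced by anything given after the image name
CMD ["hello"]

# docker run greeter         -> runs: /bin/echo hello   (default CMD is used)
# docker run greeter world   -> runs: /bin/echo world   (CMD replaced by "world")
# docker run --entrypoint /bin/sh greeter -c 'echo hi'  (ENTRYPOINT replaced explicitly)
```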

Shell vs Exec Form

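Prefer the Exec Form (JSON array): your process then runs as PID 1 and directly receives signals such as the SIGTERM sent by docker stop.

  • CMD ["npm", "start"] (Exec Form - No shell, signals go straight to npm)
  • CMD npm start (Shell Form - Wrapped in /bin/sh -c, so the shell is PID 1 and typically does not forward signals to npm)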
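
Where shell features such as variable expansion are genuinely needed, one common workaround is to start the command with exec so the shell replaces itself with the target process. The fragment below is a sketch of the three variants; server.js and the PORT variable are hypothetical, and only one CMD can be active in a real Dockerfile.

```dockerfile
# Shell form: /bin/sh -c is PID 1 and npm runs as its child
# CMD npm start

# Exec form: npm itself is PID 1 and receives SIGTERM from `docker stop`
CMD ["npm", "start"]

# Shell form with exec: the shell expands $PORT, then replaces itself with node
# CMD exec node server.js --port "$PORT"
```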

2. COPY vs ADD

  • COPY: Copies local files. Simple. Explicit.
  • ADD: Can extract local tar archives automatically and download files from URLs (remote archives are downloaded but not extracted).

[!TIP] Rule of Thumb: Use COPY unless you explicitly need ADD’s tar extraction. Using ADD to download URLs is generally discouraged: fetch with curl/wget in a RUN step instead, so the download, extraction, and cleanup all happen in a single layer.
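
The three cases can be sketched side by side. All paths, the archive name vendor.tar.gz, and the URL https://example.com/tool.tar.gz are placeholders chosen for illustration.

```dockerfile
FROM ubuntu:22.04

# Plain copy from the build context: the default choice
COPY config/ /etc/myapp/

# ADD unpacks a local tar archive into the destination directory
ADD vendor.tar.gz /opt/vendor/

# Remote download done in one RUN step, so the archive never persists in a layer
RUN apt-get update \
 && apt-get install -y --no-install-recommends ca-certificates curl \
 && curl -fsSL https://example.com/tool.tar.gz -o /tmp/tool.tar.gz \
 && mkdir -p /opt/tool \
 && tar -xzf /tmp/tool.tar.gz -C /opt/tool \
 && rm -rf /tmp/tool.tar.gz /var/lib/apt/lists/*
```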

3. ENV vs ARG

  • ARG: Variables available only during the build process. They are not set in the running container, but their values can still show up in docker history, so do not pass secrets through them.
  • ENV: Variables available during the build AND persist in the running container.

Use ARG for build versions (e.g., ARG GO_VERSION=1.21). Use ENV for application configuration (e.g., ENV PORT=8080).
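
A short sketch of both in one file, reusing the GO_VERSION and PORT names from above. The golang base image is an assumption, and the final RUN line exists only to show which variables are visible at build time.

```dockerfile
# Build-time only: override with `docker build --build-arg GO_VERSION=1.22 .`
ARG GO_VERSION=1.21
FROM golang:${GO_VERSION}

# Persisted in the image: visible to the process at run time,
# and overridable with `docker run -e PORT=9090 ...`
ENV PORT=8080

# An ARG declared before FROM is out of scope after it; redeclare it to use it
ARG GO_VERSION
RUN echo "Building with Go ${GO_VERSION}, default port ${PORT}"
```

Inside a container started from this image, PORT is set while GO_VERSION is not.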


3. Security & Cleanup

USER

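By default, containers run as root. This is a security risk: an attacker who compromises the app gains root inside the container, and if they also escape the container (for example through a kernel or runtime vulnerability), they may hold root on the host. Always switch to a non-root user.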

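# Create a system group and user (BusyBox syntax, as found on Alpine-based images)
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# On Debian/Ubuntu-based images, use instead:
# RUN groupadd --system appgroup && useradd --system --gid appgroup appuser
# Run later instructions and the container process as this user
USER appuser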

EXPOSE

This instruction does not actually publish the port. It functions as documentation for the human reading the Dockerfile. You still need -p 8080:8080 in docker run (or -P, which publishes every exposed port on a random host port).
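
As a fragment, the distinction looks like this. The port number follows the example above, and <image> stands for whatever tag the image was built with.

```dockerfile
# Documents that the process listens on 8080/tcp; publishes nothing by itself
EXPOSE 8080

# docker run <image>                -> port 8080 is not reachable from the host
# docker run -p 8080:8080 <image>   -> host port 8080 mapped to container port 8080
# docker run -P <image>             -> each exposed port mapped to a random host port
```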

VOLUME

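Declares a mount point. If nothing is mounted at that path when the container starts, Docker creates an anonymous volume for it.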

  • Pros: Explicitly declares where persistent data lives.
  • Cons: Can cause issues in CI/CD pipelines if downstream systems don’t expect it (data becomes “sticky” or anonymous volumes proliferate). Use with care.
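
The fragment below sketches how the same declaration behaves under different docker run invocations. The path /var/lib/myapp and the volume name myapp-data are hypothetical.

```dockerfile
# Declares /var/lib/myapp as a mount point for persistent data
VOLUME ["/var/lib/myapp"]

# docker run -v myapp-data:/var/lib/myapp <image>  -> named volume, reusable across containers
# docker run <image>                               -> Docker creates a new anonymous volume
# docker run --rm <image>                          -> the anonymous volume is removed with the container
```

The second case is where anonymous volumes tend to pile up in CI: every run without an explicit mount leaves another one behind.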

4. Summary Checklist

  1. Use specific Base Images (ubuntu:22.04, not ubuntu:latest).
  2. Prefer COPY over ADD.
  3. Use ENTRYPOINT for the binary + CMD for arguments.
  4. Always define a WORKDIR.
  5. Switch to a non-root USER once the steps that need root (such as package installs) are done.
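
Applied together, the checklist might produce something like the sketch below. The node:20-slim base, the app.js entry file, and the use of the node user that the official Node image provides are assumptions for illustration.

```dockerfile
# 1. Pinned base image tag instead of latest
FROM node:20-slim

# 4. Explicit working directory
WORKDIR /app

# 2. COPY rather than ADD
COPY . .
RUN npm install

# 5. Non-root user for the running process
USER node

# 3. ENTRYPOINT holds the binary, CMD the default argument
ENTRYPOINT ["node"]
CMD ["app.js"]
```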