Key Idea
In one line: each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) produces a layer, and layers are cacheable and append-only. Master layers, caching, and multi-stage builds and you can cut an image from around 1.2 GB to around 80 MB.
A good Dockerfile (Node example)
# ---- builder ----
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN corepack enable && pnpm install --frozen-lockfile
COPY . .
RUN pnpm build
# ---- runtime ----
FROM node:20-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/.next ./.next
COPY --from=builder /app/public ./public
COPY --from=builder /app/package.json ./
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
USER node
CMD ["node", "node_modules/next/dist/bin/next", "start"]
Build:
docker build -t myapp:0.1 .
docker buildx build --platform linux/amd64,linux/arm64 -t myapp:0.1 --push . # multi-arch needs buildx
Analogy
An image is a stack of transparencies: each layer overlays the ones below it, and what you see from the top is the combined result (the container's filesystem). Change a lower layer and everything above it gets repainted, so the cache is invalidated from that point up; change only an upper layer (like code) and the lower layers (deps) are reused, so the rebuild takes seconds.
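The repainting rule can be sketched as a toy model in shell. This is an illustration only, not Docker's actual cache algorithm: it assumes each step's key is the sha256 of the previous key, the instruction text, and (for COPY) a digest of the copied content. The `digest` and `chain_keys` helpers and the `lockfile-v1` / `source-v1` strings are hypothetical stand-ins, and `sha256sum` is assumed to be installed (GNU coreutils).

```shell
#!/bin/sh
# Toy model of layer cache keys. NOT Docker's real algorithm: it only shows
# why a change invalidates every step after it and nothing before it.
set -eu

# digest <string>: sha256 hex digest of the string (assumes sha256sum exists)
digest() { printf '%s' "$1" | sha256sum | cut -d' ' -f1; }

# chain_keys <lockfile-contents> <source-contents>
# Prints one key per build step; each key folds in the previous one.
chain_keys() {
  k=$(digest "FROM node:20-alpine")
  echo "$k"
  k=$(digest "$k|COPY package.json pnpm-lock.yaml ./|$(digest "$1")")
  echo "$k"
  k=$(digest "$k|RUN pnpm install --frozen-lockfile")
  echo "$k"
  k=$(digest "$k|COPY . .|$(digest "$2")")
  echo "$k"
}

before=$(chain_keys "lockfile-v1" "source-v1")
after=$(chain_keys "lockfile-v1" "source-v2")   # only the source changed

# Count the leading steps whose key is unchanged (these would be cache hits).
hits=0
i=1
while [ "$i" -le 4 ]; do
  a=$(printf '%s\n' "$before" | sed -n "${i}p")
  b=$(printf '%s\n' "$after" | sed -n "${i}p")
  [ "$a" = "$b" ] || break
  hits=$((hits + 1))
  i=$((i + 1))
done
echo "cache hits: $hits of 4"
```

In this model the base image, the lockfile copy, and the install step keep their keys, so only the final COPY would be rebuilt. Changing the lockfile string instead would change the second key and every key after it.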
Key concepts
Layer
Each RUN / COPY / ADD creates a layer. The magic is that layers are cacheable.
Cache key
Determined by the instruction itself plus the parent layer it builds on; for COPY / ADD, a checksum of the copied files is included too. That's why **COPY package.json then install** beats a single COPY.
Multi-stage
`FROM ... AS builder` + final stage only COPYs artifacts → toolchain dropped, image stays small.
.dockerignore
node_modules, .git, build outputs: keep them out of the build context.
BuildKit / buildx (modern build backend)
Parallel builds, cross-arch, cache mounts. Always on in production.
Distroless / scratch (minimal base)
Just the binary + needed libs, no shell. **Secure + small**, but harder to debug.
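Two of the concepts above, `.dockerignore` and BuildKit cache mounts, can be sketched together. The script below writes a hypothetical `demo/` build context without running Docker. The directory name is made up, `.next` stands in for "build outputs", and the `/pnpm/store` target assumes pnpm places its store under `$PNPM_HOME/store`, so adjust these for a real project.

```shell
#!/bin/sh
# Scaffold a hypothetical build context; nothing here runs Docker itself.
set -eu
mkdir -p demo

# Keep bulky or irrelevant paths out of the build context.
cat > demo/.dockerignore <<'EOF'
node_modules
.git
.next
EOF

# Builder stage using a BuildKit cache mount, so the pnpm store survives
# across builds even when the install layer itself has to be rebuilt.
cat > demo/Dockerfile <<'EOF'
# syntax=docker/dockerfile:1
FROM node:20-alpine AS builder
ENV PNPM_HOME="/pnpm"
ENV PATH="$PNPM_HOME:$PATH"
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN --mount=type=cache,id=pnpm,target=/pnpm/store \
    corepack enable && pnpm install --frozen-lockfile
COPY . .
RUN pnpm build
EOF

echo "wrote demo/.dockerignore and demo/Dockerfile"
```

The cache mount is not part of any layer, so it speeds up reinstalls without growing the image.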
How it works
Each layer is identified by its own sha256 digest, so a push uploads only the layers the registry does not already have.
Practical notes
- Order: least-changing on top (base image, deps), most-changing at bottom (source code).
- `COPY` over `ADD`: ADD has auto-extract / URL-download magic that's easy to misuse.
- Fewer `RUN`s: each RUN is a layer. Chain multiple shell commands with `&&`.
- Only deletion in the same RUN makes things small: `apt-get install -y X && apt-get clean && rm -rf /var/lib/apt/lists/*`. Without the cleanup in that same instruction, the layer keeps the package files.
- `HEALTHCHECK` in the Dockerfile lets Docker and Compose consume health status automatically; Kubernetes ignores it and relies on its own probes.
- `ENV NODE_ENV=production` + `--frozen-lockfile` for reproducible builds.
- Skip `latest`: in CI tag `:1.2.3` + `:1.2` + `:1` so rollback is easy.
- Multi-arch: `docker buildx create --use`, then `--platform linux/amd64,linux/arm64`.
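The tagging note can be sketched as a small shell snippet that derives all three tags from one version string. `VERSION` and `IMAGE` are hypothetical values that a CI system would normally supply, and the script only prints the build command instead of running it.

```shell
#!/bin/sh
# Derive :1.2.3, :1.2 and :1 from a single semantic version string.
set -eu

VERSION="1.2.3"          # hypothetical release version
IMAGE="myapp"            # hypothetical image name

MAJOR="${VERSION%%.*}"   # strip the longest ".*" suffix  -> 1
MINOR="${VERSION%.*}"    # strip the shortest ".*" suffix -> 1.2

TAGS=""
for t in "$VERSION" "$MINOR" "$MAJOR"; do
  TAGS="$TAGS -t $IMAGE:$t"
done

# Print the command a CI job would run.
echo "docker build$TAGS ."
```

Rolling back then means redeploying an older exact tag such as `:1.2.2`, while `:1.2` and `:1` keep tracking the newest matching release.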
Common anti-patterns
- Bare `node:20` ≈ 1 GB → use `node:20-alpine` or `node:20-slim`.
- Build-time gcc / make leaks into runtime → use multi-stage.
- Reinstall deps on every code change → COPY lockfile first, then install.
- Secrets baked into image → BuildKit `--mount=type=secret`, never bake them into a layer.
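The secret-mount fix from the last item can be sketched as follows. The script writes a hypothetical `demo-secret/Dockerfile` and prints a matching build command without running Docker. The secret id `npmrc`, the directory name, and the assumption that the registry token lives in `~/.npmrc` are illustrative choices, not requirements.

```shell
#!/bin/sh
# Write a hypothetical Dockerfile that reads registry credentials through a
# BuildKit secret mount instead of COPYing them into a layer.
set -eu
mkdir -p demo-secret

cat > demo-secret/Dockerfile <<'EOF'
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
# The secret file exists only while this RUN executes; it is not stored
# in the resulting layer.
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    corepack enable && pnpm install --frozen-lockfile
EOF

# The matching build command (assumes the token lives in ~/.npmrc).
echo 'docker build --secret id=npmrc,src="$HOME/.npmrc" demo-secret'
```

By contrast, a `COPY .npmrc` followed by a later `rm` would still leave the file recoverable from the earlier layer.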
Easy confusions
docker build
Classic single-arch build.
No parallelism, simple cache. This describes the legacy builder; since Docker Engine 23.0, `docker build` runs BuildKit by default.
docker buildx
BuildKit-powered, **multi-platform + remote cache**.
The current default.