Docker Build Optimization
Slow builds eat developer time. Three changes recover most of it: layer-aware caching, multi-stage builds, and BuildKit.
Layer caching
Docker builds are slow by default and fast with care. The difference is whether the Dockerfile is structured to take advantage of the layer cache. Each instruction in a Dockerfile produces a cacheable build step; subsequent builds reuse the cache up to the first changed instruction, and everything after that point is rebuilt. Ordering the instructions correctly is the single most consequential optimization.
What layer-aware Dockerfile ordering looks like:
- Stable layers first: Place instructions that change rarely (FROM, base OS packages, language runtime) at the top. Place instructions that change frequently (application source code, version numbers) at the bottom. The cache is invalidated only at the first change; everything before it reuses cache.
- Order the Dockerfile for cache hits: Copy package manifests (package.json, requirements.txt, go.mod, Cargo.toml) before installing dependencies, and install dependencies before copying the rest of the source. Application code changes constantly; manifests change weekly; dependencies change less often. The ordering matches the change frequency (see the sketch after this list).
- Don't COPY the whole repo at the top: The most common mistake is COPY . . at the start of the Dockerfile. Every code change then invalidates the cache for everything below it, including the dependency install, and the whole subsequent pipeline rebuilds. Move the broad COPY closer to the end.
- Use .dockerignore aggressively: List files that should never enter the image (node_modules, .git, test artifacts, logs) in .dockerignore. A smaller build context means a faster transfer to the daemon and a smaller cache-invalidation surface (a sample .dockerignore follows below).
- Cache mounts in BuildKit: BuildKit, the default builder in modern Docker, supports cache mounts that persist across builds for directories like /root/.cache/pip or ~/.npm. The dependency install reuses the cache from the previous build, which is much faster than a fresh install. This is a significant speedup for non-trivial builds.
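Putting those rules together, a layer-aware Dockerfile for a hypothetical Node.js service might look like this (the base image, file names, and npm usage are illustrative assumptions, not prescribed by the rules above):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-slim

WORKDIR /app

# Manifests first: this layer is invalidated only when dependencies change.
COPY package.json package-lock.json ./

# BuildKit cache mount keeps npm's download cache warm across builds
# without baking it into the image.
RUN --mount=type=cache,target=/root/.npm npm ci

# Source last: code changes invalidate only the layers from here down.
COPY . .

CMD ["node", "server.js"]
```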
Layer-aware Dockerfile structure is the cheapest optimization with the largest impact. Most teams cut their build times by 50%+ with no other changes once they restructure for cache hits.
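The matching .dockerignore for that hypothetical Node.js project keeps the build context small (the entries are illustrative; tailor them to the repository):

```
node_modules
.git
*.log
coverage
dist
```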
Multi-stage
Multi-stage builds are the second optimization that pays back massively. The pattern is: build the application in a stage with the full toolchain (build tools, dev dependencies, debugging utilities), then copy only the runtime artifacts into a final, minimal image. The image that ships to production is a fraction of the size of the build environment.
- Build in a fat image: The first stage uses an image with the full toolchain: compilers, build tools, package managers, dev dependencies. The image is large, but it is only used during the build; it ships nowhere and exists only to produce artifacts.
- Copy to a slim runtime: The final stage starts from a minimal base (Alpine, distroless, scratch) and copies in the artifacts produced by the build stage and nothing else. The result is an image containing only what is needed to run, not what was needed to build (a sketch follows this list).
- Smaller final images: Multi-stage Go builds can produce 10MB final images from 800MB build environments; Java multi-stage builds produce 100MB images instead of 600MB. The size reduction is real and pays back at every push, pull, and deploy.
- Faster deploys: Smaller images push and pull faster, containers start faster (less data to load), and auto-scaling responds faster (new pods come up sooner). Each step in the deploy pipeline saves seconds; the cumulative effect is large.
- Reduced attack surface: A runtime image without the toolchain has fewer binaries an attacker could exploit: no compilers, no shells (in distroless), no package managers. Even if the container is breached, there is little useful tooling inside it.
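A minimal sketch of the pattern for a hypothetical Go service (the module layout, binary name, and base images are illustrative assumptions):

```dockerfile
# syntax=docker/dockerfile:1

# Stage 1: fat build image with the full Go toolchain. It never ships.
FROM golang:1.22 AS build
WORKDIR /src
# Manifests first, per the layer-caching rules above.
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO disabled so the binary is static and runs on a minimal base.
RUN CGO_ENABLED=0 go build -o /out/server .

# Stage 2: slim runtime. Only the artifact is copied in.
FROM gcr.io/distroless/static-debian12
COPY --from=build /out/server /server
ENTRYPOINT ["/server"]
```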
Multi-stage builds are the standard pattern for production Docker images. Single-stage Dockerfiles are usually a sign that the team has not yet adopted modern practices.
BuildKit
BuildKit is Docker's modern build engine. It replaces the legacy build process with parallel execution, more aggressive caching, and a richer instruction set. As of recent Docker versions, BuildKit is the default; teams running older Docker should migrate.
- Parallel build stages: Multi-stage builds where stages do not depend on each other run in parallel. A build that previously took 8 minutes serially might take 4 minutes in parallel. The speedup is automatic; it requires no Dockerfile changes.
- Cache mount support: BuildKit supports cache mounts (RUN --mount=type=cache) that persist across builds without becoming part of the image: package manager caches, build artifact caches, anything that benefits from being warm. The cache speeds up builds without bloating images.
- Secret mount support: RUN --mount=type=secret lets the build read a secret without it ending up in any image layer. Build-time credentials (private package registry tokens, signing keys) are used during the build but do not leak into the final image (see the sketch after this list).
- Better build output: BuildKit produces clearer logs, separate per-stage output, and better progress reporting. Debugging build issues is materially easier than with the legacy engine.
- Default in modern Docker: Docker Desktop and recent Docker Engine releases use BuildKit by default, and most CI systems support it natively. Teams running older Docker should upgrade; the migration is mostly free and the benefits are substantial.
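A sketch combining both mount types, assuming a Python build against a private package index; the index host, secret id, and credentials scheme are illustrative assumptions:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .

# Cache mount: pip's download cache stays warm across builds but is
# never written into an image layer.
# Secret mount: the token is readable at /run/secrets/pip_token for the
# duration of this RUN only and leaves no trace in the image.
RUN --mount=type=cache,target=/root/.cache/pip \
    --mount=type=secret,id=pip_token \
    PIP_EXTRA_INDEX_URL="https://build:$(cat /run/secrets/pip_token)@pypi.internal.example/simple" \
    pip install -r requirements.txt

COPY . .
CMD ["python", "app.py"]
```

The secret is supplied at build time, e.g. `docker build --secret id=pip_token,src=./token.txt .`, so it exists only on the build host and during that single RUN step.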
Layer-aware Dockerfiles, multi-stage builds, and BuildKit together produce Docker builds that are fast, small, and secure. Nova AI Ops watches build duration as a first-class metric per service, surfaces the cases where Dockerfile structure is causing avoidable rebuilds, and tracks the image-size and build-time trajectory so the team can see the optimization investments paying off.