Container Image Hardening: The 80/20 of Production Dockerfiles

Most CVEs in your container image come from packages your application never imports. Seven changes, five in the Dockerfile and two in the Kubernetes manifest, remove roughly 80% of the attack surface; the rest is diminishing returns.

Why most container CVEs are noise

Run Trivy against a default node:20 image and you will get hundreds of CVEs. Almost none are reachable by your application. The CVE list includes every package the base image ships, regardless of whether your code calls it. Vulnerability triage that does not account for reachability is just inflating ticket queues.

The mental model. The image is your library. The CVEs that matter are the ones in packages your application actually loads at runtime. The rest, usually the large majority, are decoration. Reduce the library; reduce the surface.
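The mental model can be sketched as a filter over scanner output. This is a minimal illustration, not a real scanner integration: the Finding fields, the CVE identifiers, and the loaded-package set are all hypothetical, and matching on package name is only a rough stand-in for true reachability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    cve_id: str    # hypothetical identifier, for illustration only
    package: str   # package the scanner attributes the CVE to
    severity: str  # "CRITICAL", "HIGH", "MEDIUM", or "LOW"


def split_by_reachability(findings, loaded_packages):
    """Partition findings into (reachable, noise) by package name.

    "Reachable" is an approximation here: the package is in the set the
    application is known to load at runtime.
    """
    reachable = [f for f in findings if f.package in loaded_packages]
    noise = [f for f in findings if f.package not in loaded_packages]
    return reachable, noise


# Made-up scanner output for a default base image.
findings = [
    Finding("CVE-EXAMPLE-0001", "openssl", "HIGH"),
    Finding("CVE-EXAMPLE-0002", "imagemagick", "CRITICAL"),
    Finding("CVE-EXAMPLE-0003", "perl", "HIGH"),
    Finding("CVE-EXAMPLE-0004", "express", "MEDIUM"),
]

# Assumption: you have some way to list what the app loads (lockfile, SBOM, tracing).
loaded = {"openssl", "express"}

reachable, noise = split_by_reachability(findings, loaded)
print(f"{len(reachable)} to triage, {len(noise)} removable by shrinking the image")
```

The noise list is the part that a smaller base image removes outright, which is why the changes that follow focus on shrinking the image rather than on triaging harder.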

The seven 80/20 changes

1. Pin the base image to a digest, not a tag. FROM node:20-alpine@sha256:... instead of FROM node:20-alpine. Tags move; digests do not. Reproducible builds, predictable CVEs.

2. Use a slim or alpine variant. The default node:20 ships hundreds of system packages your app never touches. node:20-slim drops most of them. node:20-alpine drops more. Relative to the default image, each variant is roughly a 70-90% reduction in size and installed packages.

3. Multi-stage builds for compiled languages. One stage to compile (full toolchain); a second stage with only the binary and runtime. Final image has zero compiler tools sitting around.

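4. Run as a non-root user. Many post-exploitation steps, such as installing tools, modifying system files, or attempting a container escape, depend on being root inside the container. USER 1000 takes that away. It is one or two lines of Dockerfile (create the user if the base image lacks one, then switch to it) for a large blast-radius reduction.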

5. Drop unnecessary capabilities at runtime. Set securityContext.capabilities.drop: [ALL] on the container in your Kubernetes manifest, then add back only the capabilities the workload actually needs (often none).

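6. Read-only root filesystem. Set readOnlyRootFilesystem: true in the container's securityContext, and mount a writable volume for any path that genuinely needs writes, such as /tmp. Combined with non-root, this blocks most in-container persistence techniques, because an attacker cannot drop or modify files on the image filesystem.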

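7. Strip the package manager from the final image. If apt or apk is not present in the running container, an attacker who lands a shell cannot install tooling. Multi-stage builds whose final stage is a distroless or scratch base get this for free; a slim or alpine final stage still ships its package manager unless you remove it.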
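Changes 1 and 4 are easy to check mechanically. The following is a rough sketch of such a check, assuming a simple Dockerfile with no line continuations in FROM or USER. The digest is a placeholder rather than a real image digest, server.js is a hypothetical entry point, and lint_dockerfile is an illustrative helper, not part of any existing tool.

```python
import re

# Placeholder only: substitute the real digest of the image you pulled.
PLACEHOLDER_DIGEST = "sha256:" + "0" * 64

# Illustrative multi-stage Dockerfile applying changes 1, 2, 3, and 4.
HARDENED_DOCKERFILE = f"""\
FROM node:20-alpine@{PLACEHOLDER_DIGEST} AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

FROM node:20-alpine@{PLACEHOLDER_DIGEST}
WORKDIR /app
COPY --from=build /app /app
USER 1000
CMD ["node", "server.js"]
"""

DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")
ROOT_USERS = {"root", "0"}


def lint_dockerfile(text):
    """Return a list of problems found; an empty list means none were found.

    Limitations of this sketch: a FROM that names an earlier build stage is
    reported as unpinned, and a base image that is already non-root is still
    reported unless the Dockerfile sets USER explicitly.
    """
    problems = []
    last_user = None  # USER in effect for the stage currently being read
    for raw in text.splitlines():
        parts = raw.strip().split()
        if not parts or parts[0].startswith("#"):
            continue
        instruction = parts[0].upper()
        if instruction == "FROM":
            # Skip flags such as --platform=... to find the image reference.
            image = next((p for p in parts[1:] if not p.startswith("--")), None)
            last_user = None  # a new stage resets the effective user
            if image and image.lower() != "scratch" and not DIGEST_RE.search(image):
                problems.append(f"base image not pinned to a digest: {image}")
        elif instruction == "USER" and len(parts) > 1:
            last_user = parts[1]
    if last_user is None or last_user.split(":")[0] in ROOT_USERS:
        problems.append("final stage does not switch to a non-root USER")
    return problems


print(lint_dockerfile(HARDENED_DOCKERFILE))
print(lint_dockerfile("FROM node:20\nCMD [\"node\", \"server.js\"]\n"))
```

A check like this is deliberately conservative: it prefers a false alarm on an unusual Dockerfile over silently passing an unpinned or root image.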

The distroless tradeoff

Google's distroless images take the slim philosophy further: no shell, no package manager, nothing but the language runtime and your application. The final image is often 20-30 MB, and the CVE count often drops to single digits.

The cost. No shell means you cannot kubectl exec into the container to debug. You get logs, plus ephemeral debug containers (kubectl debug) if your cluster and policies allow them. Many teams find this too punitive in early production; they accept the larger surface for the operational comfort.

The middle ground. Use distroless for batch jobs and stateless services where you would not shell in anyway. Use slim/alpine for services where on-call still expects to drop into a shell. Apply judgment, not dogma.

Pre-deploy scanning that catches the rest

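Run Trivy or Grype in CI against the built image, with a policy that fails the build on critical vulnerabilities in reachable code paths. The "reachable" qualifier is what separates useful gates from noise. The scanners report by package, not by call path, so your policy has to supply that qualifier, for example by distinguishing your application's dependencies from the base layer.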

The minimum policy. Block on critical CVEs that have a known patch and sit in the runtime layer (your code's direct dependencies). Warn on critical CVEs in the base layer, which are often unpatchable within a release window. Allow high CVEs anywhere, but only with a ticket. The policy must be tunable; rigid policies generate exception culture.
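The minimum policy above can be expressed as a small decision function. This is a sketch under stated assumptions: the PolicyInput fields and the "runtime"/"base" layer labels are hypothetical and would need mapping from your scanner's report. The text does not say what to do with a critical CVE in the runtime layer that has no patch yet, so the sketch treats it as a warning; that choice is an assumption, not part of the stated policy.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK = "block"    # fail the build
    WARN = "warn"      # surface it, keep the build green
    TICKET = "ticket"  # allow, but require a tracking ticket
    PASS = "pass"


@dataclass(frozen=True)
class PolicyInput:
    severity: str        # "CRITICAL", "HIGH", "MEDIUM", or "LOW"
    layer: str           # hypothetical labels: "runtime" or "base"
    fix_available: bool  # True if a patched version exists


def decide(finding):
    """Map one finding to an action under the minimum policy."""
    severity = finding.severity.upper()
    if severity == "CRITICAL":
        if finding.layer == "runtime" and finding.fix_available:
            return Action.BLOCK
        # Base-layer criticals warn, per the policy. Unpatched runtime
        # criticals also warn here; that case is an assumption of this sketch.
        return Action.WARN
    if severity == "HIGH":
        return Action.TICKET
    return Action.PASS


def gate(findings):
    """Return (build_ok, actions); the build fails if any finding blocks."""
    actions = [decide(f) for f in findings]
    return Action.BLOCK not in actions, actions


build_ok, actions = gate([
    PolicyInput("CRITICAL", "base", fix_available=False),
    PolicyInput("HIGH", "runtime", fix_available=True),
])
print(build_ok, [a.value for a in actions])
```

Keeping the rules in one short function is what makes the policy tunable: changing a threshold is a reviewed one-line diff rather than a standing exception.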

Antipatterns

Treating vulnerability count as the metric. Reachability matters more than count. A reachable medium-CVSS bug is worse than 50 unreachable highs.

Scanning at runtime instead of at build. Build-time gates prevent the bad image from existing. Runtime-only scanning catches the issue after deploy, when the vulnerable image is already serving traffic.

Disabling scans because of false positives. Tune the allowlist instead. The next real vulnerability deserves to fire the alarm.
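One way to tune rather than disable is to keep the allowlist narrow: each entry suppresses one CVE in one package and records why. The sketch below assumes findings arrive as (cve_id, package) pairs; the Suppression type, the identifiers, and the reason text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suppression:
    cve_id: str
    package: str
    reason: str  # why this finding is a false positive for this image


def apply_allowlist(findings, allowlist):
    """Split (cve_id, package) findings into (kept, suppressed).

    Only an exact CVE-and-package match is suppressed, so a new CVE in the
    same package still reaches the gate.
    """
    suppressed_keys = {(s.cve_id, s.package) for s in allowlist}
    kept = [f for f in findings if f not in suppressed_keys]
    suppressed = [f for f in findings if f in suppressed_keys]
    return kept, suppressed


allowlist = [
    Suppression("CVE-EXAMPLE-0003", "perl", "package present but never executed"),
]
findings = [
    ("CVE-EXAMPLE-0003", "perl"),  # known false positive, suppressed
    ("CVE-EXAMPLE-0009", "perl"),  # new CVE in the same package, still fires
]

kept, suppressed = apply_allowlist(findings, allowlist)
print("still alerting:", kept)
```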

What to do this week

Three moves. (1) Pick your most-exposed image and apply changes 1, 2, and 4 from the seven above. Time it: usually under 30 minutes per image. (2) Add Trivy as a CI gate that fails on critical CVEs with a patch available. (3) Adopt readOnlyRootFilesystem for one stateless service and verify nothing breaks; expand from there.
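For move (3), a quick pre-rollout check can confirm that the manifest actually carries the settings from changes 5 and 6. This sketch assumes the pod spec has already been loaded into a Python dict (for example from YAML); the container name, image, and volume name are hypothetical, and check_pod is an illustrative helper rather than an existing tool.

```python
def check_container(container):
    """Return problems for one container entry of a pod spec."""
    ctx = container.get("securityContext") or {}
    problems = []
    if ctx.get("readOnlyRootFilesystem") is not True:
        problems.append("readOnlyRootFilesystem is not true")
    dropped = (ctx.get("capabilities") or {}).get("drop") or []
    if "ALL" not in dropped:
        problems.append("capabilities.drop does not include ALL")
    return problems


def check_pod(pod_spec):
    """Map each container name to its list of problems."""
    return {
        c.get("name", "<unnamed>"): check_container(c)
        for c in pod_spec.get("containers", [])
    }


# Hypothetical pod spec for one stateless service.
pod_spec = {
    "containers": [
        {
            "name": "api",
            "image": "registry.example.com/api:1.0",
            "securityContext": {
                "readOnlyRootFilesystem": True,
                "capabilities": {"drop": ["ALL"]},
            },
            # A writable scratch path, since the root filesystem is read-only.
            "volumeMounts": [{"name": "tmp", "mountPath": "/tmp"}],
        }
    ],
    "volumes": [{"name": "tmp", "emptyDir": {}}],
}

print(check_pod(pod_spec))
```

The check only confirms the settings are present; whether the service still works with them is what the trial rollout on one stateless service is for.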