Failed Deploy Cleanup

Failed deploys leave artifacts. Clean up.

Failed deploys leave artifacts

Failed deploys leave debris. Half-applied Terraform, partial Helm upgrades, orphaned cloud resources; each piece is a future incident waiting to be triggered by an unrelated change. Cleanup discipline keeps the environment from becoming the next outage.

Cleanup checklist

The cleanup checklist runs the same way every time. Per-tool reconciliation command, named runbook, no improvisation under post-failure pressure.

Automate detection

Detection catches what manual cleanup misses. Drift checks, naming conventions tied to deploy ID, and daily reports together surface the orphans before they become incidents.

Rollback discipline

Rollback discipline prevents improvising under pressure. Define rollback before the deploy ships, practice quarterly in staging, auto-rollback on safe SLO regressions and reserve manual for high-blast-radius changes.

How to install the discipline

Installing the discipline takes three pieces. Required runbook step gates closure, auto-ticket on failure assigns ownership, quarterly audit drives accountability with visible numbers.