Idle Resource Detection and Cleanup
Idle resources typically account for 5-15% of a mature cloud bill. Cleanup is mechanical once detection is in place.
Why idle accumulates
Idle resources accumulate because nothing in the default cloud experience cleans them up; the ratchet only turns one way.
- Resources outlive their creators. Engineers leave; environments are forgotten; volumes detached 'just in case' sit forever.
- No default cleanup. Cloud providers do not auto-delete idle resources, so the bill only grows.
- Compounds quietly. Each idle resource is small; the sum after 6 months is substantial.
- Active cleanup required. Idle spend never shrinks on its own, so the cleanup discipline must be designed in.
Four detection patterns
- Unattached volumes (not attached to any instance for more than 30 days).
- Old snapshots (older than the retention policy allows).
- Unused load balancers (no traffic for 14 days).
- Idle dev environments (no logins for 30 days).
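The four patterns reduce to a threshold check per resource type. The following is a minimal sketch, not a definitive implementation: the Resource record, the idle_since field, and the idle_reason function are hypothetical names, and the 90-day snapshot retention is a placeholder because the text does not state a retention period. It assumes an inventory step has already collected the timestamps from the cloud provider.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds for volumes, load balancers, and dev environments come from
# the four patterns above. The snapshot retention period is NOT given in
# the text; 90 days is a placeholder assumption.
IDLE_LIMITS = {
    "volume": timedelta(days=30),           # unattached for more than 30 days
    "load_balancer": timedelta(days=14),    # no traffic for 14 days
    "dev_environment": timedelta(days=30),  # no logins for 30 days
}
SNAPSHOT_RETENTION = timedelta(days=90)     # assumption, set per policy


@dataclass
class Resource:
    """Hypothetical inventory record; field names are illustrative."""
    resource_id: str
    kind: str                  # "volume", "snapshot", "load_balancer", "dev_environment"
    created_at: datetime
    # Time of last activity: detach time for a volume, last request for a
    # load balancer, last login for a dev environment. None means the
    # resource is currently in use (or activity is unknown).
    idle_since: Optional[datetime] = None


def idle_reason(resource: Resource, now: datetime) -> Optional[str]:
    """Return a human-readable reason if the resource looks idle, else None."""
    if resource.kind == "snapshot":
        if now - resource.created_at > SNAPSHOT_RETENTION:
            return "snapshot older than retention policy"
        return None

    limit = IDLE_LIMITS.get(resource.kind)
    if limit is None or resource.idle_since is None:
        return None  # unknown type or currently in use: never flag
    if now - resource.idle_since > limit:
        return f"{resource.kind} idle for more than {limit.days} days"
    return None


if __name__ == "__main__":
    now = datetime(2024, 6, 1)
    inventory = [
        Resource("vol-old", "volume", datetime(2024, 1, 1), datetime(2024, 4, 1)),
        Resource("vol-new", "volume", datetime(2024, 1, 1), datetime(2024, 5, 20)),
        Resource("snap-1", "snapshot", datetime(2024, 1, 1)),
    ]
    for res in inventory:
        print(res.resource_id, idle_reason(res, now))
```

Returning a reason string rather than a boolean keeps the explanation attached to the flag, which is useful when the owner is notified later.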
Auto-cleanup pipeline
Manual cleanup is never sustained. A pipeline that scans, tags, archives, and then deletes is the only approach that survives team changes.
- Daily scan. All accounts swept; idle candidates flagged based on the four detection patterns.
- Tag for archive. Flagged resources tagged for archive; owners notified via Slack or email.
- Weekly archive. Snapshot taken, resource paused or detached; restorable for 30 days.
- Monthly delete. Unless tagged keep-alive, archived resources are permanently deleted; the pipeline then repeats on the same schedule indefinitely.
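The four stages above can be modelled as a small state machine in which each scheduled job moves a resource at most one step. This is a sketch under stated assumptions: the Stage enum, the Tracked record, the "archive-candidate" tag name, and the three job functions are hypothetical, and the actual snapshot, detach, notify, and delete calls to a cloud provider are left out. Skipping the archive step for keep-alive resources is also an assumption; the text only states that keep-alive blocks deletion.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional, Set

RESTORE_WINDOW = timedelta(days=30)  # archived resources stay restorable for 30 days
KEEP_ALIVE = "keep-alive"


class Stage(Enum):
    ACTIVE = "active"
    FLAGGED = "flagged"      # tagged for archive, owner notified
    ARCHIVED = "archived"    # snapshotted and paused or detached
    DELETED = "deleted"


@dataclass
class Tracked:
    """Hypothetical pipeline record for one resource."""
    resource_id: str
    stage: Stage = Stage.ACTIVE
    archived_at: Optional[datetime] = None
    tags: Set[str] = field(default_factory=set)


def daily_scan(item: Tracked, is_idle: bool) -> None:
    """Flag an active resource that matched a detection pattern."""
    if item.stage is Stage.ACTIVE and is_idle:
        item.stage = Stage.FLAGGED
        item.tags.add("archive-candidate")  # illustrative tag name


def weekly_archive(item: Tracked, now: datetime) -> None:
    """Archive flagged resources. Skipping keep-alive here is an assumption."""
    if item.stage is Stage.FLAGGED and KEEP_ALIVE not in item.tags:
        item.stage = Stage.ARCHIVED
        item.archived_at = now


def monthly_delete(item: Tracked, now: datetime) -> None:
    """Delete archived resources once the restore window has passed."""
    if item.stage is not Stage.ARCHIVED or KEEP_ALIVE in item.tags:
        return
    if item.archived_at is not None and now - item.archived_at >= RESTORE_WINDOW:
        item.stage = Stage.DELETED


if __name__ == "__main__":
    item = Tracked("vol-123")
    daily_scan(item, is_idle=True)
    weekly_archive(item, datetime(2024, 6, 3))
    monthly_delete(item, datetime(2024, 7, 3))
    print(item.resource_id, item.stage.value)
```

Because each job checks the current stage before acting, running any of them twice has no additional effect, which matters for a pipeline that is meant to run unattended.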
Owner-of-record opt-out
Aggressive automation without an opt-out path nukes legitimate resources. The opt-out has to exist, with discipline so it does not become a default escape.
- Keep-alive tag. Owners tag the resource keep-alive with a written justification.
- Quarterly review. Justifications reviewed; stale ones removed; the tag is not permanent.
- Notification window. 7 days' notice before deletion; this reduces panic and leaves room for a last-minute opt-out.
- Audit trail. All deletions logged with prior tags; the trail satisfies the auditor and the engineer who needed it.
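The opt-out rules above combine into a single guard that runs before any deletion. The sketch below is illustrative only: KeepAlive, keep_alive_valid, may_delete, and audit_record are hypothetical names, "quarterly" is approximated as 90 days, and the audit record is a plain JSON line rather than any particular logging service.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

NOTICE_PERIOD = timedelta(days=7)     # 7 days' notice before delete
REVIEW_INTERVAL = timedelta(days=90)  # "quarterly", approximated as 90 days


@dataclass
class KeepAlive:
    """Hypothetical keep-alive tag with its written justification."""
    owner: str
    justification: str
    reviewed_at: datetime  # last time the justification was confirmed


def keep_alive_valid(tag: Optional[KeepAlive], now: datetime) -> bool:
    """A keep-alive counts only if justified and reviewed within the interval."""
    if tag is None or not tag.justification.strip():
        return False
    return now - tag.reviewed_at <= REVIEW_INTERVAL


def may_delete(notified_at: Optional[datetime],
               tag: Optional[KeepAlive],
               now: datetime) -> bool:
    """Allow deletion only after notice has elapsed and no valid opt-out exists."""
    if keep_alive_valid(tag, now):
        return False
    if notified_at is None:
        return False  # never delete a resource whose owner was not notified
    return now - notified_at >= NOTICE_PERIOD


def audit_record(resource_id: str, prior_tags: List[str], now: datetime) -> str:
    """Serialize one deletion, including the tags the resource carried."""
    return json.dumps(
        {
            "resource_id": resource_id,
            "deleted_at": now.isoformat(),
            "prior_tags": sorted(prior_tags),
        },
        sort_keys=True,
    )


if __name__ == "__main__":
    now = datetime(2024, 6, 10)
    if may_delete(notified_at=datetime(2024, 6, 1), tag=None, now=now):
        print(audit_record("vol-123", ["archive-candidate"], now))
```

Treating a stale or unjustified keep-alive as absent is what keeps the tag from becoming a permanent escape: the opt-out expires unless someone renews it.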
Antipatterns
- Manual cleanup. Always falls behind.
- Aggressive auto-delete with no notice. Outages.
- Cleanup without owner-of-record. No accountability.
What to do this week
Three moves. (1) Run idle detection and cleanup against your highest-spend workload. (2) Measure the dollar impact for one month. (3) Roll the practice out to the next two services if the savings hold.