Evidence Preservation
Snapshots before clean-up.
Overview
Evidence preservation captures system state before mitigation cleans it up. Logs roll over, dashboards reset, and configs change as part of the fix. Without explicit preservation, the postmortem reconstructs the incident from memory and Slack scrollback. With it, the postmortem works from the actual data.
- Snapshots before clean-up. Capture state before restart or rollback. The forensic data survives the mitigation.
- Log archival. Copy logs to long-term storage during the incident. Default retention may not cover the postmortem window.
- Dashboard screenshots at peak. Visual evidence of the incident as it looked. Dashboard state at peak is otherwise irrecoverable.
- Configuration capture plus trace samples. Save the current config before changing it, as rollback evidence, and keep representative trace samples for analysis.
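As a rough illustration of a snapshot taken before clean-up, the sketch below copies a set of files (logs, config) into a timestamped evidence directory and records a SHA-256 hash for each copy. The function name `snapshot`, the manifest layout, and the directory naming are all assumptions made for this example, not a prescribed tool; dashboards and traces would need their own exporters.

```python
import hashlib
import json
import shutil
import time
from pathlib import Path
from typing import Iterable


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large logs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(incident_id: str, sources: Iterable[Path], dest_root: Path) -> Path:
    """Copy each source file into a new evidence directory and write a manifest.

    Hypothetical helper: the directory name and manifest fields are
    illustrative choices, not a standard format.
    """
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    dest = dest_root / f"{incident_id}-{stamp}"
    # Fail rather than overwrite an earlier snapshot with the same name.
    dest.mkdir(parents=True, exist_ok=False)

    manifest = []
    for index, src in enumerate(sources):
        if not src.is_file():
            # Record the gap instead of aborting: a partial snapshot
            # is still better than none.
            manifest.append({"source": str(src), "status": "missing"})
            continue
        # Prefix with an index so files with the same name do not collide.
        copy = dest / f"{index:03d}-{src.name}"
        shutil.copy2(src, copy)  # copy2 also preserves modification times
        manifest.append({
            "source": str(src),
            "copy": copy.name,
            "sha256": sha256_of(copy),
            "status": "captured",
        })

    (dest / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return dest
```

Run before a restart or rollback, a helper like this leaves a copy of the pre-mitigation state that the fix cannot overwrite, and the hashes let a later reviewer confirm the copies were not altered.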
The approach
Four habits make evidence preservation reliable rather than wishful: automate the snapshot capture, document the checklist for on-callers, train on-callers to use it, and review during postmortems whether the right evidence was captured.
- Automated snapshots. Incident-triggered automation captures logs, dashboards, traces, and config. Removes the “we forgot to save it” failure mode.
- Documented checklist. Each team maintains its own snapshot list, so new on-callers know what to capture before they restart anything.
- On-caller training. Evidence preservation is part of incident training, which scales the practice across rotations.
- Postmortem review of evidence plus documented policy. Ask “Did we capture what we needed?” in every postmortem, and document each team’s evidence policy for compliance reviews.
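The checklist and the postmortem question can share one data structure. The sketch below is a minimal example under assumed names: `ChecklistItem`, `EXAMPLE_CHECKLIST`, and `review_capture` are hypothetical, and the items simply mirror the evidence types listed in the overview.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ChecklistItem:
    key: str            # short identifier recorded when the item is captured
    description: str    # what the on-caller should save
    required: bool = True


# Hypothetical checklist for one team; real lists will differ per service.
EXAMPLE_CHECKLIST: List[ChecklistItem] = [
    ChecklistItem("logs", "Archive service logs to long-term storage"),
    ChecklistItem("dashboards", "Screenshot key dashboards at peak"),
    ChecklistItem("config", "Save the current config before changing it"),
    ChecklistItem("traces", "Export representative trace samples", required=False),
]


def review_capture(checklist: Iterable[ChecklistItem],
                   captured: Iterable[str]) -> Dict[str, object]:
    """Answer "did we capture what we needed?" for one incident."""
    captured_keys = set(captured)
    missing_required = []
    missing_optional = []
    for item in checklist:
        if item.key in captured_keys:
            continue
        # Separate hard gaps from nice-to-have gaps so the postmortem
        # can prioritise follow-up actions.
        if item.required:
            missing_required.append(item.key)
        else:
            missing_optional.append(item.key)
    return {
        "complete": not missing_required,
        "missing_required": missing_required,
        "missing_optional": missing_optional,
    }
```

Calling `review_capture(EXAMPLE_CHECKLIST, ["logs", "config"])` during a postmortem would flag the dashboard screenshots as a required gap and the trace samples as an optional one. Keeping the list as data also gives automation and on-caller training a single definition to work from.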
Why this compounds
Each preserved snapshot makes the next postmortem better. The team’s investigation capability deepens; cross-incident analysis becomes possible because the data exists; legal and compliance reviews work from real evidence rather than reconstructed memory.
- Postmortem quality improves. Real evidence drives real analysis. Speculation drops out.
- Legal posture sharpens. Preserved evidence backs up the incident response record and helps resolve customer disputes.
- Cross-incident analysis. Preserved data supports retrospective trend reviews across the year.
- Year-one investment, year-two habit. The first checklist is a heavy lift. By the third incident, evidence capture is reflexive.