A Postmortem Action-Item Tracker That Sticks
Most action items die in a backlog within three weeks. Six fields per item, two SLAs, and a 15-minute weekly review will fix that. Here’s the version that’s survived contact with my last four teams.
Why action items die
The pattern is universal. A postmortem produces seven action items. Three get done in the first two weeks (the easy ones). Two are still open after a quarter (the hard ones). Two are silently dropped because nobody remembered they existed. Six months later, the same incident happens again because of one of the dropped items.
The fix isn’t willpower or process discipline. It’s structure. The reasons items die are predictable: vague ownership, no deadline, no visible status, no review rhythm. Address those four and the completion rate goes from ~50% to ~85% at the cost of one 15-minute weekly review.
The six fields
Every action item gets the same six fields, in your tracker of choice (Linear, Jira, GitHub Issues, a shared sheet; the tool doesn’t matter, the structure does):
- Title. One line. Verb-first. “Add per-partition CloudWatch alarm” not “DynamoDB monitoring”.
- Source incident. A link to the postmortem, not a paraphrase. Future-you in three months will want to see the original context.
- Owner. A specific person, not a team. Teams don’t close action items; people do. The owner can change as people leave; that’s fine.
- Class. One of four: prevent (stops the bug from happening again), detect (catches it faster), respond (helps responders), communicate (improves comms). Helps you see the shape of the work later.
- Due date. A real date. “Q3” is not a date. “September 15” is.
- Verification. How you’ll know it’s actually done. “Alarm fires in next game-day” not “deployed”.
Six fields, two minutes per item to fill out. The verification field is the one that gets skipped most often and matters most. Items without verification get marked “done” long before they’re actually working.
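As a minimal sketch, the six fields could be modelled as a Python dataclass. The names `ActionItem`, `ItemClass`, and `missing_fields` are illustrative assumptions, not part of any tracker's API; the check only flags blank text fields, with verification included on purpose.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class ItemClass(Enum):
    PREVENT = "prevent"          # stops the bug from happening again
    DETECT = "detect"            # catches it faster
    RESPOND = "respond"          # helps responders
    COMMUNICATE = "communicate"  # improves comms


@dataclass
class ActionItem:
    title: str            # one line, verb-first
    source_incident: str  # link to the postmortem, not a paraphrase
    owner: str            # a specific person, not a team
    item_class: ItemClass
    due_date: date        # a real date, not "Q3"
    verification: str     # how you'll know it is actually done

    def missing_fields(self) -> list[str]:
        """Return the names of text fields that were left blank."""
        text_fields = {
            "title": self.title,
            "source_incident": self.source_incident,
            "owner": self.owner,
            "verification": self.verification,
        }
        return [name for name, value in text_fields.items() if not value.strip()]
```

A tracker import script or a pre-review check could refuse any item where `missing_fields()` is non-empty, which catches the skipped verification field before the item is ever marked done.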
Two SLAs that matter
One SLA per severity tier. We use just two:
- SEV-1/SEV-2 action items: 30 days to ship. The incident was customer-impacting and material. The fixes can’t wait a quarter. If something can’t ship in 30 days, the action item needs to be split into a smaller piece that can.
- SEV-3 action items: 90 days to ship. The incident was small or near-miss. There’s some room. But 90 days is a hard ceiling; past that, the item gets explicitly killed (see below) rather than dragged forward indefinitely.
The deadlines are visible. Anyone on the team can see the action item list, sorted by days-until-deadline. Public deadlines that anyone can see are different from private deadlines that only the owner remembers; the social pressure does most of the work.
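The two SLAs and the days-until-deadline sort can be sketched in a few lines of Python. This assumes the SLA clock starts on the day the action item is opened (the text doesn't say whether it starts at the incident or the postmortem), and the function names are hypothetical.

```python
from datetime import date, timedelta

# SLA windows from the text: 30 days for SEV-1/SEV-2, 90 days for SEV-3.
SLA_DAYS = {"SEV-1": 30, "SEV-2": 30, "SEV-3": 90}


def sla_deadline(severity: str, opened: date) -> date:
    """Deadline implied by the SLA, assuming the clock starts at `opened`."""
    return opened + timedelta(days=SLA_DAYS[severity])


def days_until_deadline(deadline: date, today: date) -> int:
    """Negative values mean the item is past its deadline."""
    return (deadline - today).days


def sort_by_urgency(items, today: date):
    """Sort (title, deadline) pairs so the most urgent item comes first."""
    return sorted(items, key=lambda item: days_until_deadline(item[1], today))
```

Sorting on days remaining rather than on the raw date keeps overdue items (negative numbers) at the top of the shared list, where they are hardest to ignore.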
The 15-minute weekly review
Once a week, 15 minutes, same time slot. The team lead or designated owner walks the list. For each open item:
- Days remaining? (visible from the tracker)
- Owner: still you, or do we need to reassign?
- Status: in progress, blocked, or hasn’t started?
- If blocked, what’s the unblock?
That’s the whole meeting. 4 questions per item, ~10 items typically open at any time, 90 seconds per item. Done in 15 minutes. The point isn’t to micromanage the work; it’s to make the work visible. Items that haven’t moved in three weekly reviews are either dead or need a different owner.
Don’t skip the review when you’re busy. The weeks you skip are the weeks the action items quietly slip. We’ve missed the review three times in two years; each time the next-quarter completion rate dropped 8-12%.
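The meeting budget and the three-review stall rule are simple enough to check mechanically. A rough sketch, assuming each weekly review records a single boolean per item ("did it move since last week?"); that recording scheme and both function names are assumptions for illustration.

```python
def review_minutes(open_items: int, seconds_per_item: int = 90) -> float:
    """Expected meeting length in minutes at a fixed time per item."""
    return open_items * seconds_per_item / 60


def is_stalled(moved_per_review: list[bool], threshold: int = 3) -> bool:
    """True if the item has not moved in the last `threshold` reviews.

    `moved_per_review` holds one entry per weekly review, oldest first.
    Items with fewer than `threshold` reviews are never flagged.
    """
    if len(moved_per_review) < threshold:
        return False
    return not any(moved_per_review[-threshold:])
```

With ten open items, `review_minutes(10)` gives 15.0, matching the time slot. If the open list grows well past ten, the arithmetic says the review will overrun unless items are closed or killed.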
When to kill an action item
Some action items shouldn’t ship. The team learned something during the work, the priority changed, or the action item turned out to be a bad idea on closer inspection. Killing items is fine; pretending they’re still active is not.
The kill criteria, in order of frequency:
- Solved differently. Another piece of work made this item unnecessary. Note which work, close the item.
- Re-prioritised. The team explicitly decided not to do this. Note the trade-off, close the item with a “rejected” status.
- Bad idea. After thinking about it, this would make things worse. Note why, close.
- Stale. 90 days past deadline with no movement. Treat as a forced kill: either it’s actually being worked on (in which case give it a new deadline) or it’s dead (in which case admit it).
The killed-with-reason item is the most useful artefact in the long run. Three years later, when someone proposes the same fix, the kill notes explain why it didn’t happen the first time. Saves them three weeks of work.
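One way to make sure kill notes actually get written is to refuse to close an item without one. A small sketch under that assumption; `KillReason`, `KillRecord`, and `kill_item` are hypothetical names, and the four reasons mirror the list above.

```python
from dataclasses import dataclass
from enum import Enum


class KillReason(Enum):
    SOLVED_DIFFERENTLY = "solved differently"
    REPRIORITISED = "re-prioritised"
    BAD_IDEA = "bad idea"
    STALE = "stale"


@dataclass(frozen=True)
class KillRecord:
    item_title: str
    reason: KillReason
    note: str  # which work replaced it, what trade-off was made, or why it was a bad idea


def kill_item(item_title: str, reason: KillReason, note: str) -> KillRecord:
    """Close an item as killed; a blank note is rejected."""
    if not note.strip():
        raise ValueError("a killed item needs a note explaining why")
    return KillRecord(item_title, reason, note.strip())
```

The frozen record is deliberate: the note is meant to be read years later, so it should not be quietly edited after the fact.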
The two metrics worth tracking
Two numbers, reported quarterly, on the team’s reliability dashboard:
- Action-item completion rate by class. prevent items completed / prevent items created. Same for the other three classes. Differences across classes are interesting: most teams complete respond items at 90%+ and prevent items at 40%, which tells you the team is good at firefighting and bad at fire prevention. That’s actionable.
- Median days-to-close. Whatever your SLA is, the median should be 60-70% of it. If your 30-day SLA has a median close at 28 days, you’re running hot, items are barely making it. Aim for slack.
The metrics aren’t for shaming. They’re for spotting where the postmortem culture is degrading before the next big incident exposes it. A team whose prevent completion rate drops from 70% to 40% over two quarters is a team about to have the same outage twice.
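Both metrics fall out of a plain export of the tracker. A minimal sketch, assuming items arrive as `(class, completed)` pairs and closed items carry a days-to-close number; the input shapes and function names are assumptions, not a specific tool's export format.

```python
from collections import Counter
from statistics import median


def completion_rate_by_class(items):
    """items: iterable of (item_class, completed) pairs.

    Returns {item_class: completed / created} for every class seen.
    """
    created = Counter()
    completed = Counter()
    for item_class, done in items:
        created[item_class] += 1
        if done:
            completed[item_class] += 1
    return {cls: completed[cls] / created[cls] for cls in created}


def median_sla_usage(days_to_close, sla_days: int) -> float:
    """Median days-to-close as a fraction of the SLA window."""
    return median(days_to_close) / sla_days
```

By the rule of thumb above, a `median_sla_usage` result around 0.6 to 0.7 is the target; a 28-day median against a 30-day SLA comes out near 0.93, which is the "running hot" case.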
The wiki line, on a sticker we put on people’s laptops: “An action item without an owner, a deadline, and a verification is a wish. Track wishes separately.”