Counterfactual Replay re-runs a closed incident with one action edited out. It rebuilds the timeline against the digital twin and predicts the outcome without that action. Use it during postmortems to test "did that step actually help?" Most teams find one or two recurring runbook steps that contribute nothing measurable.
The replay starts with the captured incident state at start time (twin snapshot from the live archive). The engine plays back the timeline, skipping the action you marked for removal, and predicts the outcome at each subsequent step. The output is a paired comparison: actual outcome vs counterfactual outcome.
During the postmortem, click any timeline step to run a counterfactual on it. The results land directly in the postmortem document, so the conversation runs on data, not opinions. Most retros come down to whether one disputed step actually helped; the counterfactual gives a measurable answer.

When a runbook step shows neutral or negative impact across many incidents, the system flags it for trimming. The trim suggestion includes its evidence (the counterfactuals that motivated it) so the runbook owner can review with data. Trimming a step can shrink MTTR by roughly the step's duration without sacrificing recovery quality.
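The flagging rule amounts to aggregating per-incident counterfactual deltas per step. A sketch under stated assumptions (the function name, threshold, sample minimum, and data are all illustrative, not the product's configuration):

```python
from statistics import mean

def flag_for_trim(deltas_by_step: dict[str, list[float]],
                  threshold: float = 0.0,
                  min_samples: int = 5) -> list[str]:
    """Flag runbook steps whose counterfactual impact is neutral or
    negative across enough incidents. Each delta is (actual outcome -
    counterfactual outcome); <= threshold means removing the step
    would not have hurt the predicted recovery."""
    flagged = []
    for step, deltas in deltas_by_step.items():
        if len(deltas) >= min_samples and mean(deltas) <= threshold:
            flagged.append(step)
    return flagged

history = {
    "flush-cache": [0.0, -0.1, 0.0, 0.05, -0.02, 0.0],  # ~neutral: trim candidate
    "restart-pods": [0.4, 0.3, 0.5, 0.35, 0.45, 0.4],   # clearly helps: keep
}
print(flag_for_trim(history))  # ['flush-cache']
```

The `min_samples` guard is the important design choice: one noisy counterfactual should never trigger a trim suggestion on its own.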
Counterfactuals are predictions from a twin model, not ground truth. Treat them as evidence to weigh, not absolute answers. The page shows a confidence interval on each prediction so users can see when the model is unsure. For high-stakes runbook trims, run the counterfactual as a production canary instead, replacing prediction with measurement.
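One way to operationalize that caution is a gate on the prediction's confidence interval: a CI that straddles zero means the model cannot distinguish the step from a no-op, and high stakes route to a canary regardless. The function, labels, and thresholds below are hypothetical, not the product's API:

```python
def trim_decision(mean_delta: float, ci_low: float, ci_high: float,
                  stakes: str = "low") -> str:
    """Decide how to act on a counterfactual prediction given its
    confidence interval. Deltas are (actual - counterfactual), so a
    negative delta means the step added nothing measurable."""
    if stakes == "high":
        return "canary"        # replace prediction with measurement
    if ci_low <= 0.0 <= ci_high:
        return "inconclusive"  # model is unsure: gather more incidents
    if mean_delta <= 0.0:
        return "suggest-trim"  # step showed no measurable benefit
    return "keep"

print(trim_decision(-0.01, -0.04, 0.02))          # inconclusive
print(trim_decision(-0.01, -0.04, 0.02, "high"))  # canary
print(trim_decision(-0.03, -0.05, -0.01))         # suggest-trim
```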
Counterfactual Replay is the evidence base for "we should remove that step," anchoring the decision in data rather than preference.