Counterfactual Replay re-runs a closed incident with one action edited out. It rebuilds the timeline against the digital twin and predicts the outcome without that action. Use it during postmortems to test "did that step actually help?" Most teams find one or two recurring runbook steps that contribute nothing measurable.
The replay starts with the captured incident state at start time (twin snapshot from the live archive). The engine plays back the timeline, skipping the action you marked for removal, and predicts the outcome at each subsequent step. The output is a paired comparison: actual outcome vs counterfactual outcome.
During the postmortem, click any timeline step to run a counterfactual on it. The results land directly in the postmortem document, so the conversation runs on data, not opinions. Most retros come down to whether one disputed step actually helped; the counterfactual gives a measurable answer.

When a runbook step shows neutral or negative impact across many incidents, the system flags it for trimming. The trim suggestion includes its evidence (the counterfactuals that motivated it) so the runbook owner can review with data. Trimming a step can shrink MTTR by roughly the step's duration without sacrificing recovery quality.
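The flagging rule amounts to aggregating per-incident counterfactual deltas per step. A sketch under stated assumptions (the function name, threshold, sample minimum, and data are all illustrative, not the product's configuration):

```python
from statistics import mean

def flag_for_trim(deltas_by_step: dict[str, list[float]],
                  threshold: float = 0.0,
                  min_samples: int = 5) -> list[str]:
    """Flag runbook steps whose counterfactual impact is neutral or
    negative across enough incidents. Each delta is (actual outcome -
    counterfactual outcome); <= threshold means removing the step
    would not have hurt the predicted recovery."""
    flagged = []
    for step, deltas in deltas_by_step.items():
        if len(deltas) >= min_samples and mean(deltas) <= threshold:
            flagged.append(step)
    return flagged

history = {
    "flush-cache": [0.0, -0.1, 0.0, 0.05, -0.02, 0.0],  # ~neutral: trim candidate
    "restart-pods": [0.4, 0.3, 0.5, 0.35, 0.45, 0.4],   # clearly helps: keep
}
print(flag_for_trim(history))  # ['flush-cache']
```

The `min_samples` guard is the important design choice: one noisy counterfactual should never trigger a trim suggestion on its own.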
Counterfactuals are predictions from a twin model, not ground truth. Treat them as evidence to weigh, not absolute answers. The page shows a confidence interval on each prediction so users can see when the model is unsure. For high-stakes runbook trims, run the counterfactual as a production canary instead, replacing prediction with measurement.
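One way to operationalize that caution is a gate on the prediction's confidence interval: a CI that straddles zero means the model cannot distinguish the step from a no-op, and high stakes route to a canary regardless. The function, labels, and thresholds below are hypothetical, not the product's API:

```python
def trim_decision(mean_delta: float, ci_low: float, ci_high: float,
                  stakes: str = "low") -> str:
    """Decide how to act on a counterfactual prediction given its
    confidence interval. Deltas are (actual - counterfactual), so a
    negative delta means the step added nothing measurable."""
    if stakes == "high":
        return "canary"        # replace prediction with measurement
    if ci_low <= 0.0 <= ci_high:
        return "inconclusive"  # model is unsure: gather more incidents
    if mean_delta <= 0.0:
        return "suggest-trim"  # step showed no measurable benefit
    return "keep"

print(trim_decision(-0.01, -0.04, 0.02))          # inconclusive
print(trim_decision(-0.01, -0.04, 0.02, "high"))  # canary
print(trim_decision(-0.03, -0.05, -0.01))         # suggest-trim
```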
Counterfactual Replay is the evidence base for "we should remove that step," anchoring the decision in data rather than preference.