Reliability Engineering

Where does time go during incidents,
measured across every timeline you have

Timeline Analytics aggregates the phase data from every incident timeline. How long from page to ack? Ack to first action? First action to resolved? Across the whole quarter, where is the time going? Most teams find that one phase dominates, and fixing it cuts MTTR more than any other change.

  • 4 phases tracked
  • p50/p95 per phase
  • Per-team breakdown
  • 90d default window
Four Phases

Page, ack, fix, comm

Every incident timeline has five milestones: page (alert fired) → ack (first responder responded) → first action (something was tried) → resolved (incident closed) → comm (status page or stakeholder update). The four phases are the intervals between consecutive milestones. Each phase has its own typical duration, and each can become a bottleneck. Timeline Analytics reports p50 and p95 per phase across your incidents.

  • Page → ack: pure responsiveness; fast for most teams; slow if pager fatigue is real
  • Ack → first action: often the slowest phase; "what do we do?" is hard at 3am
  • First action → resolved: execution speed; depends on tooling, runbooks, and on-call experience
  • Resolved → comm: often forgotten; customers wait while engineering moves on
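
As a rough illustration of the per-phase numbers described above, the sketch below derives the four phase durations from one timeline and then computes p50 and p95 across many. The milestone keys (`page`, `ack`, `first_action`, `resolved`, `comm`), the function names, and the linear-interpolation percentile are assumptions made for this example, not Nova's actual schema or method.

```python
from datetime import datetime

# Milestone names are assumptions for this sketch, not a documented schema.
MILESTONES = ["page", "ack", "first_action", "resolved", "comm"]

def phase_durations(timeline):
    """Seconds spent in each of the four phases of one incident.

    `timeline` maps each milestone name to a datetime.
    """
    return {
        f"{start}->{end}": (timeline[end] - timeline[start]).total_seconds()
        for start, end in zip(MILESTONES, MILESTONES[1:])
    }

def percentile(values, pct):
    """Percentile by linear interpolation between the closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)

def phase_percentiles(timelines):
    """p50 and p95 per phase across many incident timelines."""
    by_phase = {}
    for timeline in timelines:
        for phase, seconds in phase_durations(timeline).items():
            by_phase.setdefault(phase, []).append(seconds)
    return {
        phase: {"p50": percentile(v, 50), "p95": percentile(v, 95)}
        for phase, v in by_phase.items()
    }
```

Calling `phase_percentiles` on a quarter of timelines returns one entry per phase, which makes the dominant phase easy to spot by comparing the p50 or p95 values side by side.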
Per-Team Breakdown

Each team has its own bottleneck

Different teams have different bottlenecks. The payments team might struggle with ack times (small team, lots of pages). The fulfillment team might have great ack but slow first actions (complex domain, sparse runbooks). The breakdown lets each team see its own dominant phase and target it.

  • Per-team page: every team gets its own breakdown; defaults to your team based on login
  • Compare across teams: side-by-side compare two teams to spot best-practice differences
  • Linked to team retros: the dominant-phase number anchors the retro; "let's halve our ack-to-action this quarter"
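
One way to picture the per-team breakdown is as a group-by over phase durations. The sketch below assumes records shaped as `(team, phase, seconds)` tuples and treats the phase with the largest median as the team's dominant phase; both the record shape and that definition of "dominant" are assumptions for illustration.

```python
from collections import defaultdict
from statistics import median

def dominant_phase_by_team(records):
    """Map each team to (dominant phase, median seconds in that phase).

    `records` is an iterable of (team, phase, seconds) tuples.
    """
    by_team = defaultdict(lambda: defaultdict(list))
    for team, phase, seconds in records:
        by_team[team][phase].append(seconds)

    result = {}
    for team, phases in by_team.items():
        medians = {phase: median(values) for phase, values in phases.items()}
        worst = max(medians, key=medians.get)  # phase with the largest median
        result[team] = (worst, medians[worst])
    return result
```

Using the median rather than the mean keeps one unusually long incident from deciding which phase a team targets in its retro.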
Drill-Down to Outliers

Worst incidents tell you what to fix

For each phase, the page lists the slowest incidents in the window. Click any to see the timeline replay. Patterns emerge fast: when the same phase is consistently slow on the same kind of incident, it usually points to a missing runbook or a missing tool integration.

  • Slowest 10 per phase: always sortable; see what made the long tail long
  • Replay link: every outlier links to the war-room replay so the bottleneck is visible
  • Pattern detection: when 5+ outliers share a class, Nova suggests a fix (runbook, agent coverage, escalation)
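
The outlier list and the "5+ outliers share a class" rule can be sketched in a few lines. The incident fields used here (`durations`, `incident_class`) and the function names are hypothetical; how Nova actually classifies incidents or picks a suggested fix is not described in the text.

```python
import heapq
from collections import Counter

def slowest(incidents, phase, n=10):
    """The n incidents that spent the longest in the given phase."""
    return heapq.nlargest(n, incidents, key=lambda inc: inc["durations"][phase])

def shared_classes(outliers, min_count=5):
    """Incident classes that appear at least min_count times among outliers."""
    counts = Counter(inc["incident_class"] for inc in outliers)
    return [cls for cls, count in counts.most_common() if count >= min_count]
```

A class returned by `shared_classes` is a candidate for a runbook, agent coverage, or an escalation change, since the same kind of incident keeps landing in the slow tail of the same phase.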
Tracking Improvement

Before/after when you ship the fix

Mark a date on the chart when you ship a fix (a new runbook, an agent that covers the class). The page draws a vertical line and computes the before/after delta on the affected phase. Use it to prove improvements rather than claim them.

  • Annotated dates: mark improvement dates on the chart; multiple annotations supported
  • Before/after delta: computed automatically per phase; published with a confidence interval
  • Use in retros: evidence that the runbook actually helped, not just that you wrote it
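
The text does not say how the before/after delta or its confidence interval is computed. As one plausible approach, the sketch below takes phase durations before and after an annotated date, reports the change in the median, and estimates a percentile bootstrap interval. The bootstrap, the function name, and the default parameters are all assumptions for illustration.

```python
import random
from statistics import median

def before_after_delta(before, after, n_boot=2000, alpha=0.05, seed=0):
    """Change in median phase duration (after minus before) with a bootstrap CI.

    Returns (point_estimate, (lower, upper)). A negative value means the
    phase got faster after the annotated date.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    point = median(after) - median(before)

    deltas = []
    for _ in range(n_boot):
        # Resample each side with replacement and recompute the delta.
        resampled_before = rng.choices(before, k=len(before))
        resampled_after = rng.choices(after, k=len(after))
        deltas.append(median(resampled_after) - median(resampled_before))
    deltas.sort()

    lower = deltas[int(n_boot * alpha / 2)]
    upper = deltas[min(n_boot - 1, int(n_boot * (1 - alpha / 2)))]
    return point, (lower, upper)
```

If the whole interval sits below zero, the improvement is unlikely to be noise; if it straddles zero, the retro has a claim rather than evidence.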
The phase you fix is the MTTR you save

Aggregating timelines turns "we need to be faster" into "the ack-to-first-action phase is the bottleneck, here is what to do about it."
