The Incident Correlation Engine groups related alerts into single incidents. When 80 services start erroring because the database is slow, you get one page about "database degraded," not 80 service pages. The engine uses the service graph, time proximity, and shared root signals to identify related alerts; if it gets a grouping wrong, ungrouping is one click.
The engine uses three signals to decide if two alerts belong together. (1) Service-graph proximity: alerts on services connected by the live service map are likely related. (2) Time proximity: alerts within 60 seconds of each other on connected services are very likely related. (3) Shared symptom: alerts citing the same upstream cause (database, redis, queue) consolidate.
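The three signals can be sketched as a simple decision function. This is a minimal illustration, not Nova's actual implementation: the `Alert` shape, the adjacency-map service graph, and the two-of-three voting policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

TIME_WINDOW_S = 60  # time-proximity window from the text

@dataclass
class Alert:
    service: str
    timestamp: float          # epoch seconds
    root_signal: Optional[str]  # e.g. "database", "redis", "queue", or None

def connected(graph: Dict[str, Set[str]], a: str, b: str) -> bool:
    """True if two services are directly linked in the live service map."""
    return b in graph.get(a, set()) or a in graph.get(b, set())

def should_group(a: Alert, b: Alert, graph: Dict[str, Set[str]]) -> bool:
    """Group two alerts when at least two of the three signals agree
    (the voting policy here is illustrative)."""
    graph_signal = connected(graph, a.service, b.service)
    time_signal = abs(a.timestamp - b.timestamp) <= TIME_WINDOW_S
    shared_cause = a.root_signal is not None and a.root_signal == b.root_signal
    return sum((graph_signal, time_signal, shared_cause)) >= 2
```

Two alerts on graph-connected services that fire within the window, or that cite the same upstream cause, consolidate; an isolated alert with no shared signal stays separate.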
Sometimes two genuinely separate incidents fire at the same time. The engine may group them; the operator can split. One click ungrouping splits any group into two new incidents and re-routes the on-call accordingly. The split is recorded so the engine can learn from it.
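A split can be thought of as partitioning one incident's alerts into two and recording the operator's decision. The function below is a hypothetical sketch; the feedback-log shape and field names are assumptions, not Nova's data model.

```python
import time
from typing import List, Set, Tuple

def split_incident(alert_ids: List[str], keep: Set[str],
                   feedback_log: list) -> Tuple[List[str], List[str]]:
    """Split one incident's alerts into two new incidents and record the
    operator's decision so the engine can learn from it (illustrative)."""
    kept = [a for a in alert_ids if a in keep]
    moved = [a for a in alert_ids if a not in keep]
    feedback_log.append({
        "event": "operator_split",  # hypothetical event name
        "ts": time.time(),
        "kept": kept,
        "moved": moved,
    })
    return kept, moved
```

The recorded split is the training signal: future groupings of the same alert pairs can be down-weighted.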
Default tuning is moderate: prefer grouping when the signals are strong. You can dial more aggressive (group at weaker signals, fewer pages, more risk of grouping unrelated incidents) or less aggressive (group only at strong signals, more pages, less risk of bundling). Per-tenant config; ships with sensible defaults.
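One way to picture the per-tenant dial is as a preset that controls how many of the three signals must agree before grouping. The preset names and threshold values below are illustrative assumptions, not Nova's shipped configuration.

```python
from typing import Dict

# Hypothetical per-tenant presets: min_signals is how many of the three
# correlation signals must agree before two alerts are grouped.
PRESETS: Dict[str, Dict[str, int]] = {
    "aggressive":   {"min_signals": 1},  # fewer pages, more mis-group risk
    "moderate":     {"min_signals": 2},  # sensible default
    "conservative": {"min_signals": 3},  # more pages, less bundling risk
}

def grouping_threshold(tenant_config: Dict[str, str]) -> int:
    """Resolve a tenant's preset to a signal threshold, defaulting to moderate."""
    preset = tenant_config.get("correlation_preset", "moderate")
    return PRESETS[preset]["min_signals"]
```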
Weekly report: alerts fired, incidents formed, noise-reduction percentage, mis-group rate (from operator splits), and trend. Use the report to defend the value of the platform when finance asks "what does Nova actually save us?" Pager-fatigue cost becomes a real number, not just a vibe.
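The two headline metrics follow directly from the counts in the report. A minimal sketch, with field names and the exact formulas as assumptions consistent with the definitions above:

```python
from typing import Dict

def weekly_report(alerts_fired: int, incidents_formed: int,
                  operator_splits: int) -> Dict[str, float]:
    """Noise reduction: share of pages avoided by grouping alerts into
    incidents. Mis-group rate: share of incidents the operator had to split."""
    noise_reduction = 100 * (1 - incidents_formed / alerts_fired)
    mis_group_rate = 100 * operator_splits / incidents_formed
    return {
        "alerts_fired": alerts_fired,
        "incidents_formed": incidents_formed,
        "noise_reduction_pct": round(noise_reduction, 1),
        "mis_group_rate_pct": round(mis_group_rate, 1),
    }
```

For example, 800 alerts grouped into 40 incidents with 2 operator splits gives 95% noise reduction and a 5% mis-group rate.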
Pager fatigue is a major MTTR driver. Correlation pages once per incident, not per alert.