Reliability Engineering

Where is the traffic flowing today, and which slice is regressing under load?

Traffic Distribution is the live picture of every traffic split: canary versus stable, A versus B, this region versus that region, this LB pool versus the next. The page also overlays SLI values per slice, so a canary stands out the moment it starts regressing, not after the rollout completes.

app.novaaiops.com / traffic-distribution
  • Live split per service
  • Per-slice SLI values
  • Auto-pause on regression
  • < 60s detection latency

Splits Tracked

Canary, region, pool, A/B

Four split kinds are tracked automatically. Canary: percentage of traffic on the new release. Region: traffic by AWS/GCP/Azure region. LB pool: server pool inside a region. A/B: experiment variant. The page shows current split percentages and per-slice SLI values side by side, so the regressing slice is the one that visibly diverges.

  • Canary tracking: integrated with Argo, LaunchDarkly, Optimizely, Unleash; auto-detects the rollout shape
  • Region splits: per AWS/GCP/Azure region with both server and client perspective
  • LB pool splits: per ALB/GLB target group; useful for diagnosing pool-specific bad nodes
  • A/B variant splits: experiment-aware so you see traffic per variant alongside conversion metrics
Per-Slice SLIs

Same SLO, different slice numbers

The SLI values defined in SLO Management are computed per slice. The page shows the stable slice's p95 next to the canary's p95, with explicit divergence callouts. A divergence above 3x is highlighted in red. The same logic applies to error rate, saturation, and any custom SLI.

  • Same SLI per slice: reuses the same SLI definition, computed independently per slice
  • Divergence flagging: > 1.5x divergence yellow, > 3x red, visible without reading numbers
  • Custom SLIs work too: product SLIs (cart-success, video-start) compute per slice the same way
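
The flagging rule above can be sketched in a few lines. This is a minimal illustration, not Nova's implementation: it assumes divergence is the ratio of the canary slice's SLI to the stable slice's SLI for a "higher is worse" metric such as p95 latency or error rate, and the names `divergence_ratio`, `classify`, and `Flag` are hypothetical.

```python
from enum import Enum


class Flag(Enum):
    OK = "ok"
    YELLOW = "yellow"
    RED = "red"


def divergence_ratio(canary_value: float, stable_value: float) -> float:
    """Ratio of the canary slice's SLI to the stable slice's SLI.

    Assumption: the SLI is "higher is worse" (latency, error rate).
    """
    if stable_value <= 0:
        raise ValueError("stable_value must be positive to compute a ratio")
    return canary_value / stable_value


def classify(ratio: float, yellow_at: float = 1.5, red_at: float = 3.0) -> Flag:
    """Map a divergence ratio to a flag; both thresholds are strict (>)."""
    if ratio > red_at:
        return Flag.RED
    if ratio > yellow_at:
        return Flag.YELLOW
    return Flag.OK


# Example: stable p95 is 120 ms, canary p95 is 400 ms.
ratio = divergence_ratio(canary_value=400.0, stable_value=120.0)
flag = classify(ratio)  # ratio is about 3.33, above the red threshold
```

Because the same function takes any per-slice value, it applies unchanged to a custom SLI as long as a larger value means a worse outcome; a "higher is better" SLI such as cart-success would need the ratio inverted.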
Auto-Pause

Bad slices stop receiving more traffic

When a slice diverges past your configured threshold, Nova auto-pauses the rollout. Rollouts managed by Argo, LaunchDarkly, or similar tools are paused via their APIs; raw deploys are paused at the load balancer level. The pause halts the ramp; existing traffic on the bad slice keeps flowing for ~5 minutes so you can confirm the regression before fully rolling back.

  • Threshold-driven pause: > 3x p95 divergence pauses by default; tunable per service
  • Native rollout APIs: Argo, LaunchDarkly, Optimizely, and Unleash rollouts are paused via each tool's official API
  • 5-minute confirm window: the bad slice keeps receiving its share so the pattern can be verified before full rollback
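
One way to picture the pause-then-confirm flow is as a small decision function. The sketch below is an assumption-laden illustration: `RolloutState`, `next_action`, and the action strings are hypothetical names, and what happens once the confirm window ends (here, simply signalling that a rollback decision is due) is not specified on this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RolloutState:
    # None means the ramp is still running; otherwise the time it was paused.
    paused_at: Optional[datetime] = None


def next_action(
    state: RolloutState,
    ratio: float,
    now: datetime,
    pause_threshold: float = 3.0,                      # default from the text
    confirm_window: timedelta = timedelta(minutes=5),  # ~5 minutes per the text
) -> str:
    """Decide what to do with a rollout given the current divergence ratio."""
    if state.paused_at is None:
        if ratio > pause_threshold:
            state.paused_at = now  # stop ramping; existing traffic keeps flowing
            return "pause_ramp"
        return "continue_ramp"
    if now - state.paused_at < confirm_window:
        # The bad slice keeps its current share so the pattern can be verified.
        return "hold_for_confirmation"
    return "ready_for_rollback_decision"


# Example timeline for one rollout.
t0 = datetime(2024, 1, 1, 12, 0, 0)
state = RolloutState()
actions = [
    next_action(state, 1.2, t0),                         # healthy canary
    next_action(state, 3.4, t0 + timedelta(minutes=1)),  # crosses the threshold
    next_action(state, 3.4, t0 + timedelta(minutes=3)),  # inside the window
    next_action(state, 3.4, t0 + timedelta(minutes=6)),  # window has elapsed
]
```

Keeping the threshold and window as parameters mirrors the "tunable per service" behaviour described above.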
A/B Aware

Experiment math, not just averages

For A/B variants, the page does not just show per-slice SLIs; it shows the statistical significance of the divergence. A small divergence on a small sample is not a regression; the same small divergence on a million-session sample is. The math respects the experiment design (CUPED, frequentist, Bayesian, whichever your product team uses) so the page agrees with the experiment platform.

  • Significance-aware: small divergences on small samples are flagged "not significant," not "fine"
  • CUPED / Bayesian / frequentist: matches your experiment platform's math so two screens never disagree
  • Cross-link to experiments: every A/B split links to the experiment in your platform of choice
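
To see why sample size matters, here is a minimal frequentist sketch: a two-sided two-proportion z-test on error rates, using only the standard library. It is one illustrative choice, not the method the product uses; CUPED or Bayesian analyses would look different, and the function name and the sample counts are hypothetical.

```python
import math


def two_proportion_p_value(
    errors_a: int, total_a: int, errors_b: int, total_b: int
) -> float:
    """Two-sided p-value for a difference in error rate between two variants."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both variants need at least one session")
    rate_a = errors_a / total_a
    rate_b = errors_b / total_b
    # Pooled rate under the null hypothesis that both variants share one rate.
    pooled = (errors_a + errors_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        return 1.0  # no variation at all, so nothing to distinguish
    z = (rate_b - rate_a) / std_err
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Same divergence (1.0% vs 1.2% error rate) at two very different sample sizes.
p_small = two_proportion_p_value(10, 1_000, 12, 1_000)
p_large = two_proportion_p_value(10_000, 1_000_000, 12_000, 1_000_000)
```

With a thousand sessions per variant the difference is well within noise, while with a million sessions per variant the same difference is overwhelmingly unlikely to be chance, which is the distinction the page draws between "not significant" and a real regression.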
A rollout you can read at a glance

Canary deploys exist to catch bad rollouts. Traffic Distribution catches them at 5% so you do not learn about them at 100%.
