AI Safety & Governance

A big red button you can actually press: stop one agent, one tenant, or all of them

Kill Switch is the panic stop for the AI fleet. One agent acting weird? Pause it. One tenant misbehaving? Quarantine it. Whole platform looking off? Halt everything. Sub-second propagation, three scopes, fully audited. Designed to be safe to press.


app.novaaiops.com / kill-switch

● LIVE

Kill switch · status

GLOBAL · all agents

TENANT · novabank-staging

AGENT · cost-trimmer

14:42 operator killed cost-trimmer · tenant=novabank-staging

14:18 tenant kill released · novabank-staging

Three Scopes

Match the blast radius to the problem

Different incidents need different scopes. A misbehaving agent gets paused at the agent scope (everyone else keeps working). A bad tenant deploy gets quarantined at the tenant scope. A platform-wide regression gets the global kill. The button only does what its scope is configured for, so you cannot accidentally halt the world.

✓
Agent scope: pause one agent across all tenants, used for "this agent is generating noisy false positives"
✓
Tenant scope: pause every agent for one tenant, used during a tenant-side incident or maintenance
✓
Global scope: pause every agent for everyone, used during platform incidents or model regressions
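The three scopes above can be read as a simple matching rule. The sketch below is illustrative only: `Scope`, `Kill`, and `is_paused` are hypothetical names, not part of the Kill Switch API, and it assumes a kill is identified by its scope plus one target name.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    AGENT = "agent"
    TENANT = "tenant"
    GLOBAL = "global"


@dataclass(frozen=True)
class Kill:
    """Hypothetical record of an active kill."""
    scope: Scope
    target: Optional[str] = None  # agent name or tenant name; unused for GLOBAL


def is_paused(kill: Kill, agent: str, tenant: str) -> bool:
    """Return True if this kill covers the given agent running for the given tenant."""
    if kill.scope is Scope.GLOBAL:
        return True                   # every agent, every tenant
    if kill.scope is Scope.TENANT:
        return kill.target == tenant  # every agent, one tenant
    return kill.target == agent       # one agent, all tenants
```

For example, `is_paused(Kill(Scope.TENANT, "novabank-staging"), "cost-trimmer", "novabank-staging")` is true, while the same kill leaves agents in any other tenant untouched.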

app.novaaiops.com / kill-switch · scopes

Scope reach

agent: 1 agent · all tenants

tenant: all agents · 1 tenant

global: all agents · all tenants

permission: platform-admin only (global)

permission: org-admin (tenant)

permission: team lead (agent)
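The permission rows above map each scope to a role. A minimal sketch of that check follows, under one loud assumption: the page does not say whether a higher role can also press a narrower scope, so the ranking used here (platform-admin above org-admin above team lead) is a guess. The names `ROLE_RANK`, `MIN_ROLE`, and `can_press` are hypothetical.

```python
# Assumed role hierarchy; the source only lists one role per scope.
ROLE_RANK = {"team-lead": 1, "org-admin": 2, "platform-admin": 3}

# Minimum role per scope, taken from the permission rows above.
MIN_ROLE = {"agent": "team-lead", "tenant": "org-admin", "global": "platform-admin"}


def can_press(role: str, scope: str) -> bool:
    """Return True if the role is allowed to press kill at the given scope."""
    required = ROLE_RANK[MIN_ROLE[scope]]
    # Unknown roles rank 0 and are always refused.
    return ROLE_RANK.get(role, 0) >= required
```

Under these assumptions a team lead can pause a single agent but cannot press the tenant or global button.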

Read-Only on Kill

Stopping is safe, never destructive

When you press kill, every running agent transitions to read-only mode. Agents can still observe, log, and reason; they just cannot execute any tool that mutates state. In-flight tool calls either finish or roll back to a checkpoint. Nothing that was healthy becomes unhealthy because you pressed kill.

✓
In-flight checkpointing: long-running tool calls roll back to the last checkpoint, not their inception
✓
Observability stays on: agents keep producing diagnostics so you can see why you killed them
✓
No data loss: kill never drops queued signals; they wait in the queue until the un-kill
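One way to picture read-only mode is a guard in front of every tool call plus a queue that holds signals until release. This is a sketch under stated assumptions, not how Kill Switch is implemented: `AgentRuntime`, `ReadOnlyError`, and the `mutates` flag are hypothetical, and checkpoint rollback is not modeled.

```python
from collections import deque
from typing import Callable, Deque, List


class ReadOnlyError(RuntimeError):
    """Raised when a mutating tool is called while a kill is active."""


class AgentRuntime:
    """Hypothetical agent runtime with a read-only guard and a signal queue."""

    def __init__(self) -> None:
        self.killed = False
        self.queue: Deque[str] = deque()  # signals are held here, never dropped

    def call_tool(self, tool: Callable[[], str], mutates: bool) -> str:
        # Read-only tools (observe, log, reason) keep working during a kill.
        if self.killed and mutates:
            raise ReadOnlyError("kill active: mutating tools are blocked")
        return tool()

    def enqueue(self, signal: str) -> None:
        self.queue.append(signal)

    def drain(self) -> List[str]:
        # While killed, nothing is consumed; signals wait for the un-kill.
        if self.killed:
            return []
        drained = list(self.queue)
        self.queue.clear()
        return drained
```

With `killed` set, a diagnostic call still returns normally, a mutating call raises `ReadOnlyError`, and queued signals come back in their original order once the kill is released.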

app.novaaiops.com / kill-switch · safety

Kill semantics

14:42:00 operator pressed global kill

14:42:00.4 32 in-flight tool calls checkpointed

14:42:00.7 4 long-running tasks rolled back to last checkpoint

14:42:00.9 fleet read-only · 0 errors

14:48:12 operator released kill · fleet warm-restart

When to Press It

Three patterns we see in practice

Teams reach for the kill switch in three situations: (1) a model upgrade from your provider introduced regressions and the fleet is over-acting, (2) you are running a chaos game day and want to go humans-only for an hour, (3) a downstream provider (cloud, monitoring, paging) is degraded and you want to stop acting on stale data.

✓
Pattern 1 (model regression): press global kill the moment success rate drops below your threshold, and hold until you re-pin a model
✓
Pattern 2 (game day): press tenant kill on the staging tenant, run the game day, and release when done
✓
Pattern 3 (data plane degraded): press global kill, wait for the upstream to recover, then release so the agents resume from a clean signal
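For Pattern 1, the "success rate drops below your threshold" check can be sketched as a small helper that tells an operator when to reach for the button. Everything here is an assumption for illustration: the function name, the idea of a recent window of boolean outcomes, and the `min_samples` guard are not described on this page.

```python
from typing import Sequence


def should_recommend_global_kill(
    outcomes: Sequence[bool],
    threshold: float,
    min_samples: int = 20,
) -> bool:
    """Return True when the recent success rate is below the threshold.

    outcomes holds recent agent action results (True = success).
    min_samples avoids reacting to a handful of early failures.
    """
    if len(outcomes) < min_samples:
        return False
    success_rate = sum(outcomes) / len(outcomes)
    return success_rate < threshold
```

The helper only recommends; in this sketch the press itself stays with a human operator, which matches the operator-driven log entries shown elsewhere on this page.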

app.novaaiops.com / kill-switch · runbook

Suggested runbooks

model regression: global · 30m

chaos game day: tenant · variable

upstream outage: global · until clear

noisy single agent: agent · until tuned

Audit & Recovery

Every press leaves a paper trail

Pressing kill writes a row to Agent Ledger with the operator id, the scope, the optional reason, and the duration. Releasing kill records the warm-restart sequence. Use the report view to see how often you are pressing each scope and whether you are converging on stability or hitting the same wall every week.

✓
Required reason on global: global kill requires a free-text reason so the post-incident review has the why
✓
Warm-restart on release: agents come back with a 60s ramp so you do not slam the data plane on un-kill
✓
Weekly trend report: kill count per scope, mean dwell time, top reasons, emailed to platform-admin
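The ledger row and the warm-restart ramp can be sketched as follows. The field names mirror the description above (operator id, scope, optional reason, duration), but `LedgerRow`, `make_row`, and `ramp_fraction` are hypothetical, and the linear shape of the 60s ramp is an assumption; the page only states its length.

```python
from dataclasses import dataclass
from typing import Optional

SCOPES = ("agent", "tenant", "global")


@dataclass(frozen=True)
class LedgerRow:
    """Hypothetical shape of one kill entry in Agent Ledger."""
    operator_id: str
    scope: str
    reason: Optional[str]
    duration_s: Optional[float]  # None while the kill is still active


def make_row(
    operator_id: str,
    scope: str,
    reason: Optional[str] = None,
    duration_s: Optional[float] = None,
) -> LedgerRow:
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope}")
    # Global kill requires a free-text reason; other scopes may omit it.
    if scope == "global" and not (reason and reason.strip()):
        raise ValueError("global kill requires a reason")
    return LedgerRow(operator_id, scope, reason, duration_s)


def ramp_fraction(seconds_since_release: float, ramp_seconds: float = 60.0) -> float:
    """Fraction of normal agent activity allowed after release (assumed linear)."""
    if seconds_since_release <= 0:
        return 0.0
    return min(1.0, seconds_since_release / ramp_seconds)
```

With a linear ramp, agents would run at half capacity 30 seconds after release and at full capacity from 60 seconds onward.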

app.novaaiops.com / kill-switch · log

Kill log · this month

global

tenant

agent

avg dwell: 22m

apr 22 global · model regression · 38m

apr 18 tenant · novabank-staging · 14m

apr 11 agent · cost-trimmer · 4h
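The trend report fields (kill count per scope, mean dwell time, top reasons) reduce to a small aggregation over log entries. The sketch below assumes each entry is a `(scope, dwell_minutes, reason)` tuple; that shape and the `summarize` function are hypothetical, not the product's export format.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Optional, Tuple

Entry = Tuple[str, float, Optional[str]]  # (scope, dwell in minutes, reason)


def summarize(entries: List[Entry], top_n: int = 3) -> Dict[str, object]:
    """Aggregate kill log entries into the three trend report fields."""
    counts = Counter(scope for scope, _, _ in entries)
    dwell = mean(d for _, d, _ in entries) if entries else 0.0
    # Entries without a reason are skipped when ranking reasons.
    reasons = Counter(r for _, _, r in entries if r)
    return {
        "count_per_scope": dict(counts),
        "mean_dwell_minutes": dwell,
        "top_reasons": reasons.most_common(top_n),
    }
```

A rising count at one scope with the same top reason week after week is the "hitting the same wall" signal the report view is meant to surface.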


Confidence comes from being able to stop

Adopting AI for ops feels safer when you know exactly how to halt it. Kill Switch is that lever, and pressing it never breaks anything.

