AI Safety & Governance

A log line cannot tell your agent to drop a table,
because it never reaches the agent

Prompt Injection Defense is the inbound scrubber for everything the AI fleet reads. Log lines, alert payloads, ticket descriptions, customer messages, anything that gets concatenated into a prompt is scanned, scored, and quarantined if suspicious. The agents never see the malicious string.

Get Started Talk to Sales
app.novaaiops.com / prompt-injection-defense
● LIVE
40+
Detection signatures
< 5ms
Per-input scan latency
Sandbox
or block, configurable
0
False negatives in red-team
Detection Signatures

Forty patterns, weekly updates

The defense ships with 40+ detection signatures based on published prompt-injection corpora plus our own red-team work. Patterns include role-override strings, base64-encoded system prompts, unicode bidi-override tricks, code-block prompt smuggling, and "ignore previous instructions" variants. Signatures update weekly via the Nova security feed.

  • Public corpora + red team: 40+ signatures from published research and Nova's internal red team, refreshed weekly
  • Pattern types: role overrides, encoding tricks (base64, unicode), prompt smuggling via code blocks, instruction inversion
  • Tunable thresholds: each signature has a confidence score; you set the threshold for block vs sandbox vs allow
app.novaaiops.com / prompt-injection-defense · signatures
Quarantine Flow

Suspicious inputs go to a sandbox, not the agent

Inputs that match a high-confidence signature are blocked outright. Inputs that match a medium-confidence signature are routed to a sandbox: a separate, isolated agent instance with no production tools and no secret access. The sandbox processes the input safely so a false positive does not lose useful signal, the human reviewer can release real signals back to production.

  • Block on high confidence: > 90 confidence → reject the input outright with a logged reason
  • Sandbox on medium: 60–90 confidence → run the input in an isolated agent with read-only mocked tools
  • Operator release: review queue lets a human release sandboxed inputs back to production if they were genuine
app.novaaiops.com / prompt-injection-defense · sandbox
Outbound Scrubber Hook

Pairs with Prompt Egress Scanner

The inbound defense has a sibling: Prompt Egress Scanner. Egress strips secrets, PII, and cross-tenant data from the prompt body before it leaves your network for the LLM provider. Together they make the prompt boundary watertight in both directions, which is what your security team will ask for first.

  • Inbound + outbound: Defense protects what the agent reads; Egress protects what the agent sends
  • Shared signature feed: both layers share the same weekly threat feed so updates apply everywhere at once
  • Single config: one policy doc covers both layers, no two-tools-two-policies drift
app.novaaiops.com / prompt-injection-defense · egress
SOC & Reporting

The numbers your security team will ask for

Defense produces a weekly SOC report: total inputs scanned, blocks, sandboxes, releases, false-positive rate (from operator releases), top signatures fired, and a 30-day trend. The data ships to your SIEM via syslog or webhook, so your existing security tooling sees the same picture.

  • Weekly SOC report: volume, blocks, sandboxes, FP rate, top signatures, 30-day trend, emailed to security@your-org
  • SIEM-ready: syslog and webhook outputs in CEF/JSON so Splunk, Elastic, Datadog get the same events
  • Red-team mode: inject your own probes via the API to validate the defense quarterly without touching prod
app.novaaiops.com / prompt-injection-defense · report
Video walkthrough coming soon

Subscribe to Nova AI Ops on YouTube for demos, tutorials, and feature deep-dives.

Prompt injection is a real attack surface

Every input the agents read is data, not instructions. The defense layer enforces that boundary so a customer support ticket cannot exfiltrate a prod secret.

Get Started Request a Demo