Incident Tool Rights and Read-Only Mode
Some changes to incident tools should not happen while an incident is in progress. A rights model defines which changes are allowed, and when.
Freeze
Incident tool rights is the practice of limiting who can change incident tools during active incidents. Its purpose is to prevent accidental misconfiguration under stress.
What freezing looks like:
- Changes blocked during an active sev 1: While a sev 1 incident is in progress, configuration changes to the incident management tools, the paging system, and the alerting platform are blocked.
- Prevents accidental misconfiguration under stress: Stress reduces engineers' attention, so changes made during an active incident carry higher risk. The freeze removes that source of risk.
- Per-tool freeze: Different tools can have different freeze policies, matched to how critical each tool is.
- Configurable freeze duration: The freeze applies for the duration of the incident and lifts once the incident is resolved, so normal operations can resume.
- Communicated to the team: The freeze is announced, so engineers know when it is in effect.
The freeze is the protection: it shields the team's incident response from concurrent changes.
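A per-tool freeze check can be sketched in a few lines. This is a minimal illustration, not how any particular tool implements it: the tool names, the `FreezePolicy` type, and the `is_frozen` function are all hypothetical, and it assumes severities are numbered so that a lower number is more severe.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List


class Severity(IntEnum):
    SEV1 = 1  # most severe
    SEV2 = 2
    SEV3 = 3


@dataclass(frozen=True)
class FreezePolicy:
    # Freeze the tool when an active incident is at this severity or worse.
    # Lower number = more severe, so SEV2 here freezes on SEV1 and SEV2.
    freeze_at_or_worse: Severity


# Hypothetical per-tool policies; the tool names are illustrative only.
POLICIES: Dict[str, FreezePolicy] = {
    "paging": FreezePolicy(Severity.SEV2),
    "alerting": FreezePolicy(Severity.SEV1),
    "incident-management": FreezePolicy(Severity.SEV1),
}


def is_frozen(tool: str, active_severities: List[Severity]) -> bool:
    """Return True if configuration changes to `tool` should be blocked now."""
    policy = POLICIES.get(tool)
    if policy is None or not active_severities:
        return False  # no policy for this tool, or no active incident
    worst = min(active_severities)  # most severe active incident
    return worst <= policy.freeze_at_or_worse
```

Because the check depends only on the list of active incidents, the freeze lifts on its own once the incident is resolved and the list is empty.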
Emergency
Some changes during incidents are necessary, so the rights model includes an emergency override path that is documented and logged.
- Specific, documented override: The override is explicit. The team's runbook documents how to invoke it, and the engineer must consciously bypass the freeze, which stops casual changes.
- Logged: Every override is logged. The log is an audit trail, so a later investigation can trace what was overridden.
- Used only when a fix during the incident is required: The override is for genuine need. For example, if a bug in the alerting is causing the incident to spread, the override allows the team to fix it.
- Approval may be required: Some teams require approval for the override. The high bar filters out unnecessary changes.
- Document the rationale: When the override is used, the rationale is recorded, so each override can be accounted for afterward.
The emergency path is the exception. The discipline accommodates real needs while preventing casual misuse.
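The override path can be sketched as a small gate that refuses a request without a rationale, optionally requires approval, and records every accepted override. All names here (`OverrideLog`, `OverrideRecord`, `OverrideRejected`, the logger name) are hypothetical, and the rule that the approver must be someone other than the requester is an assumption, not something every team enforces.

```python
import logging
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

logger = logging.getLogger("incident_tool_overrides")  # hypothetical logger name


class OverrideRejected(Exception):
    """Raised when an override request does not meet the bar."""


@dataclass(frozen=True)
class OverrideRecord:
    engineer: str
    tool: str
    rationale: str
    approver: Optional[str]
    at: datetime


@dataclass
class OverrideLog:
    require_approval: bool = True  # some teams require this, others do not
    records: List[OverrideRecord] = field(default_factory=list)

    def request(self, engineer: str, tool: str, rationale: str,
                approver: Optional[str] = None) -> OverrideRecord:
        """Record a conscious bypass of the freeze, or reject the request."""
        if not rationale.strip():
            raise OverrideRejected("a rationale is required")
        # Assumption: approval must come from someone other than the requester.
        if self.require_approval and approver in (None, engineer):
            raise OverrideRejected("approval from a second person is required")
        record = OverrideRecord(engineer, tool, rationale.strip(), approver,
                                datetime.now(timezone.utc))
        self.records.append(record)  # audit trail for later investigation
        logger.warning("freeze override by %s on %s (approver=%s): %s",
                       engineer, tool, approver, record.rationale)
        return record
```

A call such as `OverrideLog().request("alice", "alerting", "alert rule bug is spreading the incident", approver="bob")` would be accepted and logged, while the same call with an empty rationale or no approver would raise `OverrideRejected`.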
Post-incident
After the incident, the team reviews what changed. This post-incident audit catches drift and closes the loop.
- Review what changed during the incident: The post-incident review covers tool changes, including what was overridden and what was modified, so the team understands the cumulative changes.
- Catch drift early: Some changes made during incidents become permanent. The review catches them and ensures that any change that stays is deliberate.
- Document changes: The team documents the changes so that future incidents can reference them.
- Revert if appropriate: Some changes were emergency-only. After the incident, they are reverted to restore the prior state.
- Lessons feed back: The team's lessons feed into improvements such as new automation, better defaults, and clearer runbooks.
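One way to make the review concrete is to compare a configuration snapshot taken before the incident with one taken after it, then decide per setting whether to keep or revert. The sketch below assumes configuration can be represented as a flat dictionary with string keys; the function names, the `ABSENT` marker, and the example settings are hypothetical.

```python
from typing import Any, Dict, Set, Tuple

ABSENT = "<absent>"  # marker for a setting that does not exist in a snapshot


def config_drift(before: Dict[str, Any],
                 after: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (old, new)} for every setting that differs."""
    drift = {}
    for key in sorted(before.keys() | after.keys()):
        old = before.get(key, ABSENT)
        new = after.get(key, ABSENT)
        if old != new:
            drift[key] = (old, new)
    return drift


def revert_plan(drift: Dict[str, Tuple[Any, Any]],
                keep: Set[str]) -> Dict[str, Any]:
    """Return the prior value for every drifted setting not deliberately kept."""
    return {key: old for key, (old, _new) in drift.items() if key not in keep}


# Illustrative snapshots; the setting names and values are made up.
before = {"page_timeout_s": 300, "escalation": "primary"}
after = {"page_timeout_s": 60, "escalation": "primary", "mute_rule": "db-latency"}

drift = config_drift(before, after)
# The team decides the shorter page timeout is a deliberate improvement.
plan = revert_plan(drift, keep={"page_timeout_s"})
```

Here `drift` lists the changed timeout and the added mute rule, and `plan` contains only the mute rule, whose prior value of `ABSENT` means it should be removed. Requiring an explicit `keep` set is the design choice that makes permanent changes deliberate: anything not named is reverted.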
Incident tool rights is one of those operational disciplines that produce stability during stress. Nova AI Ops integrates with incident management tools, surfaces patterns, and supports the team's discipline.