Emergency Credential Rotation
Credentials compromised. The emergency rotation.
Playbook
Emergency credential rotation is the operational discipline you hope you never need. Rotating routine credentials on a schedule is straightforward; rotating credentials urgently because they might be compromised is high-stakes. The team that has practiced the procedure executes it in under an hour; the team that has not takes most of a day and may make mistakes that compound the original incident.
The standard playbook has four phases, backed by a quarterly test:
- 1. Identify all uses: Where is this credential used? Which services, environments, CI pipelines, and third-party integrations? The inventory must be complete. A missed consumer either breaks when the old value is invalidated or forces the compromised value to stay valid longer.
- 2. Rotate the credential at the source: Generate a new credential value at the source system (cloud KMS, IAM, or a third-party vendor's console). Store the new value securely. Capture the rotation timestamp for the audit trail.
- 3. Update consumers: Push the new value to every consumer identified in step 1. The exact mechanism depends on the credential type: secret manager update, environment variable rollout, or config file deploy. Update each consumer, then confirm the update took effect.
- 4. Verify and invalidate: Verify the consumers are using the new credential by checking their behavior or audit logs. Once verified, invalidate the old credential at the source. The window where both old and new are valid should be minimized; once consumers are confirmed working, the old value is dead.
- Tested quarterly: The playbook is exercised quarterly with a test credential. Each test produces refinements: the inventory was incomplete; the deploy mechanism was slower than expected; a consumer's update path had a bug. The refinements feed back into the playbook.
The four phases sound simple. The discipline is having them documented, the inventory current, and the team practiced enough to execute under pressure.
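The four phases can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production tool: `Source`, `Consumer`, and `rotate` are hypothetical names, and the in-memory objects stand in for a real secret store and real services.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Consumer:
    """Hypothetical stand-in for a service, pipeline, or integration."""
    name: str
    credential: str


@dataclass
class Source:
    """Hypothetical stand-in for the issuing system (IAM, KMS, vendor console)."""
    valid: set = field(default_factory=set)

    def issue(self) -> str:
        value = secrets.token_hex(16)
        self.valid.add(value)
        return value

    def revoke(self, value: str) -> None:
        self.valid.discard(value)


def rotate(source: Source, old: str, inventory: list[Consumer]) -> dict:
    # Phase 1: identify all uses. An empty result most likely means the
    # inventory is stale, so stop rather than rotate blind.
    consumers = [c for c in inventory if c.credential == old]
    if not consumers:
        raise ValueError("no consumers found for credential; check the inventory")

    # Phase 2: rotate at the source. Old and new are both valid from here.
    new = source.issue()
    rotated_at = datetime.now(timezone.utc)

    # Phase 3: update every consumer found in phase 1.
    for c in consumers:
        c.credential = new

    # Phase 4: verify, then invalidate. In this in-memory sketch the check
    # always passes; a real check would inspect behavior or audit logs.
    stale = [c.name for c in consumers if c.credential != new]
    if stale:
        raise RuntimeError(f"consumers still on old credential: {stale}")
    source.revoke(old)

    return {"rotated_at": rotated_at, "updated": [c.name for c in consumers]}


source = Source()
old = source.issue()
inventory = [Consumer("api", old), Consumer("ci", old), Consumer("batch", "unrelated")]
result = rotate(source, old, inventory)
```

After the call, `result["updated"]` lists the consumers that were moved to the new value, and the old value is no longer in `source.valid`. The ordering is the point of the sketch: the old value is revoked only after every known consumer has been verified.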
Speed
The window between detecting a credential compromise and completing rotation is the window during which an attacker can use the credential. Minimizing this window is the operational goal. Mature, well-practiced teams target under one hour; teams without practice take much longer.
- Target under 1 hour from suspicion to rotation: This is the aspirational target. Detection happens, the playbook fires, and rotation completes within 60 minutes. The compromise window is bounded.
- Practice produces speed: The team that has run the playbook quarterly executes it in 30 minutes. The team that runs it for the first time during a real incident takes hours. The difference is muscle memory; the practice is what builds the muscle.
- Automate where possible: The mechanical parts of the playbook (rotating in the secret store, deploying the new value, verifying consumers) can be automated. Human attention then focuses on the judgment calls: is this really a compromise, and what is the blast radius? Automation accelerates the response.
- Pre-staged tooling: The tooling that the rotation needs (secret manager access, deploy automation, verification scripts) is configured and ready before the incident. The team is not setting up tooling during the response.
- Cross-team coordination is the slowest step: When credential consumers span multiple teams, the rotation requires those teams to coordinate. The cross-team handoff is where time is lost. Pre-defined contact paths and escalation procedures speed this up.
Speed in credential rotation directly limits the damage of compromise. The investment in playbook practice pays back any time a real rotation is needed.
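One way to automate the mechanical part is to push the new value to all consumers concurrently and report failures in one pass. The sketch below assumes a hypothetical `push_credential` function that wraps whatever deploy mechanism is in use; `update_all`, `within_target`, and the worker count are likewise illustrative choices, not a prescribed design.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta, timezone

TARGET = timedelta(hours=1)  # the under-one-hour target discussed above


def push_credential(consumer: str) -> bool:
    """Hypothetical placeholder: push the new value to one consumer."""
    return True


def update_all(consumers: list[str], push=push_credential, max_workers: int = 8) -> list[str]:
    """Update consumers concurrently and return the names that failed."""
    def attempt(name: str) -> bool:
        # Treat any exception as a failed update so one bad consumer
        # does not abort the rest of the rollout.
        try:
            return bool(push(name))
        except Exception:
            return False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(attempt, consumers))  # map preserves input order
    return [name for name, ok in zip(consumers, results) if not ok]


def within_target(suspected_at: datetime, rotated_at: datetime) -> bool:
    """True if rotation completed within the target window."""
    return rotated_at - suspected_at <= TARGET
```

The returned list of failures is what a responder would hand to the owning teams, which keeps the slow cross-team step limited to the consumers that actually need manual attention.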
Audit
Every emergency rotation is captured in the audit trail. The capture is what makes the rotation defensible later: to compliance auditors, to legal investigators, to internal post-incident review.
- What credential was rotated: Specific identity. Specific scope. The audit record lets future investigators understand exactly what changed. Vague records ("we rotated some keys") are not useful; specific records are.
- When it happened: Detection time, rotation start, rotation completion, old credential invalidated. Each timestamp is captured. The compromise window is computable from the timestamps.
- Why it happened: The trigger that caused the rotation. A specific incident, a specific suspicious access pattern, a specific external report. The reason is documented; the rotation has clear provenance.
- Who executed: The person or team that ran the rotation. The approver who authorized it. The on-call who was notified. The accountability chain is captured.
- Compliance trail for audits: SOC 2, HIPAA, and PCI DSS all require evidence of incident response. The emergency rotation audit record is part of that evidence. Auditors verify that the team responded appropriately when compromise was suspected; the record is the verification.
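The fields above map naturally onto a structured record. The following is a minimal sketch, assuming a hypothetical `RotationAuditRecord` shape; the field names and sample values are illustrative and are not taken from any compliance framework. It computes only the exposure after detection, since the moment of the original leak is often unknown.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RotationAuditRecord:
    credential_id: str       # what was rotated
    scope: str
    reason: str              # why: the trigger for the rotation
    executed_by: str         # who ran it
    approved_by: str         # who authorized it
    detected_at: datetime    # when: the four timestamps from the list above
    rotation_started_at: datetime
    rotation_completed_at: datetime
    old_invalidated_at: datetime

    def exposure_after_detection(self) -> timedelta:
        """Time from detection until the old credential stopped working."""
        return self.old_invalidated_at - self.detected_at

    def to_json(self) -> str:
        # Serialize datetimes as ISO 8601 strings so the record is portable.
        data = {
            key: value.isoformat() if isinstance(value, datetime) else value
            for key, value in asdict(self).items()
        }
        data["exposure_after_detection_seconds"] = (
            self.exposure_after_detection().total_seconds()
        )
        return json.dumps(data, sort_keys=True)


def at(hour: int, minute: int) -> datetime:
    """Helper for the example: a UTC timestamp on an arbitrary sample date."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


record = RotationAuditRecord(
    credential_id="example-deploy-key",   # hypothetical identifier
    scope="production deploy pipeline",
    reason="suspicious access pattern reported by on-call",
    executed_by="alice",
    approved_by="bob",
    detected_at=at(14, 0),
    rotation_started_at=at(14, 5),
    rotation_completed_at=at(14, 20),
    old_invalidated_at=at(14, 40),
)
print(record.exposure_after_detection())  # 0:40:00
```

Making the record immutable (`frozen=True`) is a deliberate choice here: an audit entry that cannot be edited after creation is easier to defend in a later review.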
Emergency credential rotation is one of those operational disciplines where preparation is everything. Nova AI Ops integrates with secret stores and deploy automation, surfaces the inventory of credential consumers, and produces the structured audit record that compliance frameworks need from emergency-rotation events.