Cloud Account Lockout Procedures
Compromised accounts. The lockout.
Immediate
Cloud account compromise is one of the highest-impact security incidents an organization can experience. An attacker with control of a cloud account can delete data, exfiltrate everything, mine cryptocurrency at the company's expense, and pivot into adjacent infrastructure. The response must be fast and structured; the discipline is having a runbook ready before the incident, not inventing one during it.
The first phase: immediate containment.
- Disable IAM users.: Every IAM user that might be compromised gets their access keys deactivated and their console password reset. Accounts that had no legitimate reason to be logged in are disabled outright; accounts that were legitimately logged in also get their existing sessions revoked. The action is mechanical; the goal is to cut off attacker access fast.
- Revoke active sessions.: Temporary AWS credentials outlive the action that created them: session tokens issued to an IAM user can be valid for up to 36 hours, and role sessions for up to 12. Disabling the user does not revoke existing sessions automatically. There is no single STS command that kills a session; revocation happens at the IAM level, by attaching a deny policy conditioned on aws:TokenIssueTime (this is what the IAM console's "Revoke active sessions" action does for a role). Once the policy is in place, tokens issued before the cutoff stop working and the attacker's current authentication is killed.
- Within minutes.: The window between detecting the compromise and revoking access is critical. Each minute the attacker has access, more damage can be done. The response is automated where possible (alerts trigger automatic disabling) and manual where automation is not available.
- Lock down credentials at all layers.: Access keys, console logins, federated identities, root account credentials. Each is a separate layer; each must be locked down. An attacker who lost console access but kept access keys is still in the account.
- Document every action.: The incident timeline captures every containment action: what was disabled, when, by whom. The record is the foundation of the post-incident analysis and the forensic investigation.
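The session-revocation step above can be sketched as a policy document. This is a minimal sketch, assuming revocation is done by attaching an inline deny policy conditioned on aws:TokenIssueTime; the function name build_revoke_sessions_policy is hypothetical and not part of any AWS SDK.

```python
import json
from datetime import datetime, timezone


def build_revoke_sessions_policy(cutoff: datetime) -> str:
    """Return IAM policy JSON that denies every action to sessions
    whose temporary credentials were issued before `cutoff`.

    `cutoff` must be timezone-aware; it is converted to UTC.
    """
    cutoff_utc = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                # aws:TokenIssueTime exists only on temporary credentials,
                # so this statement targets sessions, not long-term keys.
                "Condition": {
                    "DateLessThan": {"aws:TokenIssueTime": cutoff_utc}
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

The resulting JSON would be attached to the affected user or role as an inline policy (the IAM PutUserPolicy and PutRolePolicy APIs accept it as the policy document). Because the condition key is absent on requests signed with long-term access keys, this policy does not replace deactivating those keys; both steps are needed.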
The immediate phase is about stopping the bleeding. Investigation and remediation come after; first cut the attacker off.
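Deactivating keys and recording the action can happen in the same step, so the timeline never lags behind the containment. The sketch below assumes `iam` is a boto3 IAM client (it uses that client's list_access_keys and update_access_key calls); the function name and the shape of the timeline entries are hypothetical.

```python
from datetime import datetime, timezone


def deactivate_access_keys(iam, user_name, actor, timeline):
    """Set every active access key of `user_name` to Inactive and
    append one timeline entry per key that was changed.

    `iam` is passed in (for example boto3.client("iam")) so the
    function can be exercised with a stub and no AWS access.
    """
    response = iam.list_access_keys(UserName=user_name)
    for key in response["AccessKeyMetadata"]:
        if key["Status"] != "Active":
            continue  # already inactive, nothing to record
        iam.update_access_key(
            UserName=user_name,
            AccessKeyId=key["AccessKeyId"],
            Status="Inactive",
        )
        timeline.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": "deactivate_access_key",
            "target": f"{user_name}/{key['AccessKeyId']}",
        })
    return timeline
```

The sketch deactivates keys rather than deleting them: deactivation is reversible and leaves the key's metadata in place for the investigation that follows.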
Forensic
Once the attacker is locked out, the investigation begins. The forensic phase is about understanding what happened: how the attacker got in, what they did while inside, what data may have been exposed. The investigation must preserve evidence, not destroy it.
- Snapshot for forensics.: Before any remediation that might destroy evidence, snapshot the state. EBS volumes, RDS databases, container filesystems. The snapshots are stored separately from production; they preserve what the attacker left behind.
- CloudTrail review.: The cloud provider's audit log (AWS CloudTrail, GCP Cloud Audit Logs, Azure Activity Log) records the API calls made in the account. Coverage depends on configuration: CloudTrail, for example, records management events by default but records data events such as S3 object reads only when they have been enabled. The investigation reviews the log for the suspect time window: what credentials were used, what APIs were called, what resources were touched. The log is the primary forensic source.
- VPC flow logs.: Network flow logs show traffic to and from compromised resources. Data exfiltration shows up as outbound traffic to external IPs; lateral movement shows up as unexpected traffic between subnets, VPCs, or regions. The flow log is the secondary forensic source.
- Don't destroy evidence.: The instinct during an incident is to fix what is broken. The discipline is to preserve evidence first and fix later. Restarting compromised hosts, redeploying pods, and rotating credentials can all destroy evidence. The fix waits until the snapshots have at least been started.
- Engage external forensic specialists if needed.: For serious incidents, internal forensic capability may not be enough. External specialists (Mandiant, CrowdStrike, smaller IR firms) bring specialized tools and experience. The contract for IR services is signed before the incident; engaging them during is a phone call.
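The audit-log review described above amounts to filtering records by credential and time window. This is a minimal sketch over records that have already been loaded from CloudTrail log files (the "Records" array of each file); the function names are hypothetical, and the field names used (eventTime, eventSource, eventName, sourceIPAddress, userIdentity.accessKeyId) are the standard CloudTrail record fields.

```python
from datetime import datetime, timezone


def parse_event_time(value: str) -> datetime:
    # CloudTrail writes eventTime as ISO 8601 UTC, e.g. 2024-05-01T12:03:10Z
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def events_for_key(records, access_key_id, start, end):
    """Return (time, source, name, ip) tuples for one access key
    inside the window [start, end], ordered by event time."""
    hits = []
    for record in records:
        identity = record.get("userIdentity", {})
        if identity.get("accessKeyId") != access_key_id:
            continue
        when = parse_event_time(record["eventTime"])
        if start <= when <= end:
            hits.append((
                when,
                record.get("eventSource"),
                record.get("eventName"),
                record.get("sourceIPAddress"),
            ))
    # Sort on the timestamp only, so missing fields never get compared.
    return sorted(hits, key=lambda hit: hit[0])
```

Reading the result in time order tends to show the shape of the intrusion: enumeration calls first, then the calls that create persistence or touch data.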
The forensic phase is what makes the incident a learning event rather than a mystery. The investigation produces evidence; the evidence produces structural improvements; the improvements prevent the next compromise.
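For the flow-log review, one useful first pass is to total outbound bytes per external destination. The sketch below assumes flow log lines in the default (version 2) space-separated format and a caller-supplied list of internal CIDR ranges; the function name and the choice to count only accepted traffic are assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict

# Field order of the default VPC flow log format (version 2).
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def outbound_bytes_by_destination(lines, internal_cidrs):
    """Sum bytes of accepted flows from internal sources to external
    destinations; return (destination, bytes) pairs, largest first."""
    networks = [ipaddress.ip_network(cidr) for cidr in internal_cidrs]

    def is_internal(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in networks)

    totals = defaultdict(int)
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # custom format or malformed line
        rec = dict(zip(FIELDS, parts))
        # NODATA/SKIPDATA records and header rows carry no usable addresses.
        if rec["log_status"] != "OK" or rec["action"] != "ACCEPT":
            continue
        if is_internal(rec["srcaddr"]) and not is_internal(rec["dstaddr"]):
            totals[rec["dstaddr"]] += int(rec["bytes"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

A destination that dominates this list and does not belong to a known service is a candidate exfiltration endpoint and worth cross-checking against the audit log for the same window.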
Recover
The recovery phase is where the team brings the account back to normal operation. The discipline is doing this carefully rather than quickly. Rushing recovery before the investigation is complete is how compromises persist past the initial event.
- Rotate everything.: Every credential in the account is assumed compromised: access keys, service account keys, application secrets, database passwords, OAuth tokens. Each is rotated to new values. The old values are revoked. The blast radius of any leaked credential is bounded to the time before rotation.
- Restore access only after sign-off.: Disabled users are not re-enabled until the security team has reviewed the incident, confirmed the user is not compromised, and signed off. Restoring access too quickly can re-introduce the attacker if their persistence mechanism was through that user.
- Don't rush back.: The pressure to restore normal operation is real but the cost of restoring before the incident is fully understood is larger. A compromise that recurs because the team rushed recovery is a compromise that was never actually contained.
- Apply the structural fix.: The incident retrospective produces structural changes (better MFA enforcement, IAM condition tightening, audit-log alerting that would have detected the compromise faster). The recovery includes shipping these fixes, not just restoring access.
- Update runbooks.: The incident produces lessons. Runbooks get updated. Procedures get tightened. The next incident benefits from the learnings of this one. The discipline is making sure the lessons land before they fade.
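The "rotate everything" rule is easy to state and easy to leave incomplete, so it helps to check it mechanically. This is a minimal sketch, assuming the team keeps an inventory of credentials with a creation time and an active flag (for access keys, IAM reports a CreateDate that could populate it); the inventory format and function name are hypothetical.

```python
from datetime import datetime, timezone


def unrotated_credentials(inventory, rotation_cutoff):
    """Return the names of credentials that are still active and were
    created before `rotation_cutoff`, i.e. not yet rotated."""
    return sorted(
        item["name"]
        for item in inventory
        if item["active"] and item["created"] < rotation_cutoff
    )


# Hypothetical inventory: one entry per credential in the account.
example_inventory = [
    {"name": "ci-deploy-key", "active": True,
     "created": datetime(2023, 11, 2, tzinfo=timezone.utc)},
    {"name": "db-password", "active": True,
     "created": datetime(2024, 5, 3, tzinfo=timezone.utc)},
    {"name": "old-api-token", "active": False,
     "created": datetime(2022, 1, 9, tzinfo=timezone.utc)},
]
```

With a cutoff set to the moment containment finished, an empty result is a reasonable precondition for the sign-off step; anything the function returns is a credential the attacker may still hold.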
Cloud account compromise response is one of those operational disciplines that pay back in the cases where it matters most. Nova AI Ops integrates with cloud audit streams, surfaces the anomalous patterns that indicate a compromise in progress, and produces the structured runbook that the team can follow when the response has to happen at speed.