Agentic SRE Advanced By Samson Tanimawo, PhD Published May 1, 2026 5 min read

AWS Cost Anomaly Triage: Agent Patterns

Cost Explorer flags an anomaly. The agent pulls the lineage, decides whether to ticket, and writes the cost summary to Slack.

Trigger from Cost Explorer

AWS Cost Anomaly Detection flags an anomaly. The agent receives the service, the account, the date range, and the dollar amount.

The agent's first action: pull the resource-level cost breakdown for that service in the date range.

Identify the resources contributing most to the anomaly. Usually one or two account for >80% of the spike.
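
A minimal sketch of that ranking step, assuming the agent has already fetched per-resource costs for a baseline window and for the anomaly window (for example from Cost Explorer's resource-level data). The function name `top_contributors`, the dictionary shapes, and the 0.8 default are illustrative assumptions, not part of any AWS API.

```python
from typing import Dict, List, Tuple


def top_contributors(
    baseline: Dict[str, float],
    observed: Dict[str, float],
    threshold: float = 0.8,
) -> List[Tuple[str, float]]:
    """Return the smallest set of resources whose cost increase covers
    `threshold` of the total increase, largest increase first."""
    # Per-resource increase over baseline; resources that got cheaper are ignored.
    deltas = {}
    for resource_id, cost in observed.items():
        delta = cost - baseline.get(resource_id, 0.0)
        if delta > 0:
            deltas[resource_id] = delta

    total = sum(deltas.values())
    if total <= 0:
        return []  # nothing increased, so there is no spike to attribute

    picked: List[Tuple[str, float]] = []
    covered = 0.0
    for resource_id, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
        picked.append((resource_id, delta))
        covered += delta
        if covered / total >= threshold:
            break
    return picked
```

If a single bucket jumps from $10 to $110 while everything else moves by a few dollars, the function returns just that bucket, which is the resource the lineage walk should start from.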

Cost lineage

Walk back from the resource to the service. "This S3 bucket is owned by service X." Tags, inventory, or naming conventions provide the link.

Walk back from the service to the team. "Service X is owned by team Y."

The lineage is the routing. The team that owns the service should see the ticket.
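
The two-hop walk can be sketched as a lookup that tries tags first, then an inventory, then a naming convention. Everything here is an assumption for illustration: the `service` tag key, the inventory and team mappings, and the `<service>-<env>-...` naming pattern would all be replaced by whatever your organization actually uses.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Owner:
    service: str
    team: str
    source: str  # which link produced the service: "tag", "inventory", or "name"


# Hypothetical naming convention: "<service>-<env>-<suffix>", e.g. "checkout-prod-logs".
NAME_PATTERN = re.compile(r"^(?P<service>[a-z0-9]+)-(?:prod|staging|dev)-")


def resolve_owner(
    resource_id: str,
    tags: Dict[str, str],
    inventory: Dict[str, str],
    service_to_team: Dict[str, str],
) -> Optional[Owner]:
    """Walk resource -> service -> team, trying the most explicit link first."""
    service: Optional[str] = None
    source = ""
    if tags.get("service"):
        service, source = tags["service"], "tag"
    elif resource_id in inventory:
        service, source = inventory[resource_id], "inventory"
    else:
        match = NAME_PATTERN.match(resource_id)
        if match:
            service, source = match.group("service"), "name"

    if service is None or service not in service_to_team:
        return None  # unroutable: the caller should send this to a default queue
    return Owner(service=service, team=service_to_team[service], source=source)
```

Recording the `source` of each match is a design choice: it lets a reviewer see when routing rested on a name guess rather than an explicit tag.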

Ticket vs no-ticket

Some anomalies are explained: a planned data ingestion, a scheduled batch, a new feature launch. Match the anomaly's timing against known activities; if it matches, no ticket.
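
One way to express the timing match is a date-range overlap check against a calendar of known activities. The `Activity` record and the idea of a calendar keyed by service are assumptions for this sketch; the source of that calendar (change tickets, a launch tracker, a scheduler) is left open.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Activity:
    name: str      # e.g. "quarterly backfill"
    service: str   # service the activity is expected to affect
    start: date
    end: date


def explaining_activity(
    service: str,
    anomaly_start: date,
    anomaly_end: date,
    activities: List[Activity],
) -> Optional[Activity]:
    """Return the first known activity on the same service whose date range
    overlaps the anomaly's date range, or None if nothing explains it."""
    for activity in activities:
        # Two closed ranges overlap when each starts before the other ends.
        overlaps = activity.start <= anomaly_end and anomaly_start <= activity.end
        if activity.service == service and overlaps:
            return activity
    return None
```

A `None` result means the anomaly is unexplained and proceeds to ticketing; a match means the agent can skip the ticket and cite the activity by name.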

Unexplained anomalies get a ticket. The ticket includes the anomaly summary, the resource breakdown, the team owner, and the recommended next step.

False positive rate matters. Tickets that the team closes as "expected" make the agent less trusted over time. Tune the matching against known activities.
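
To make that rate trackable, the agent can compute it from how its closed tickets were resolved. This is a small sketch; the resolution label "expected" is a hypothetical value that stands in for whatever your ticketing system records.

```python
from typing import Iterable


def false_positive_rate(resolutions: Iterable[str]) -> float:
    """Fraction of closed tickets that the owning team resolved as "expected"."""
    closed = list(resolutions)
    if not closed:
        return 0.0  # no closed tickets yet, so nothing to measure
    expected = sum(1 for resolution in closed if resolution == "expected")
    return expected / len(closed)
```

A rising value over successive weeks is the signal to revisit the known-activity matching.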

Cost summary to Slack

The agent posts a daily summary in the cost-monitoring channel: total spend, week-over-week change, anomalies detected, tickets filed.

The summary is a 3-line message, not a wall of text. Engineers scan; they do not read.

The message links to a fuller dashboard for those who want detail.
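
A sketch of the message formatter, kept to three lines with the dashboard link last. The function name, the field layout, and the example URL are assumptions; the resulting string could be sent as the `text` field of a Slack incoming-webhook payload.

```python
def format_summary(
    total_usd: float,
    prev_week_usd: float,
    anomalies: int,
    tickets: int,
    dashboard_url: str,
) -> str:
    """Build the three-line daily cost summary."""
    if prev_week_usd > 0:
        change = (total_usd - prev_week_usd) / prev_week_usd * 100
        wow = f"{change:+.1f}% week-over-week"
    else:
        wow = "no prior week to compare"  # avoid dividing by zero
    return "\n".join(
        [
            f"Spend: ${total_usd:,.2f} ({wow})",
            f"Anomalies: {anomalies} detected, {tickets} ticketed",
            f"Details: {dashboard_url}",
        ]
    )


# Example with placeholder numbers and a placeholder URL.
print(format_summary(12500.0, 10000.0, 3, 1, "https://example.com/cost"))
```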

Learning from explanations

When a team explains an anomaly ("this was a planned data load"), the agent records the explanation.

Future anomalies that match the same pattern (same resource, similar timing) auto-suppress. The team explains once; the agent remembers.

Suppression has a TTL: 90 days. After that, the agent re-asks. Patterns shift; explanations expire.
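
The record-then-expire behavior can be sketched as a small in-memory store. The 90-day TTL comes from the text above; the rest is assumed for illustration, in particular the reading of "similar timing" as "same day of the week" and the class and method names. A real agent would persist this somewhere durable.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple

SUPPRESSION_TTL = timedelta(days=90)


@dataclass
class Explanation:
    text: str
    recorded_at: datetime


class SuppressionStore:
    """Remembers team explanations, keyed by (resource, weekday of the anomaly)."""

    def __init__(self) -> None:
        self._by_pattern: Dict[Tuple[str, int], Explanation] = {}

    def record(self, resource_id: str, anomaly_time: datetime, text: str, now: datetime) -> None:
        key = (resource_id, anomaly_time.weekday())
        self._by_pattern[key] = Explanation(text=text, recorded_at=now)

    def lookup(self, resource_id: str, anomaly_time: datetime, now: datetime) -> Optional[Explanation]:
        """Return a live explanation for this pattern, or None if the agent must ask."""
        key = (resource_id, anomaly_time.weekday())
        explanation = self._by_pattern.get(key)
        if explanation is None:
            return None
        if now - explanation.recorded_at > SUPPRESSION_TTL:
            del self._by_pattern[key]  # expired: drop it so the agent re-asks
            return None
        return explanation
```

Passing `now` explicitly rather than reading the clock inside the store keeps the expiry logic easy to test.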