The Memory-Pressure Investigation Agent: A Case Study

From a single OOM page to root cause in 9 minutes: the exact prompts, tool calls, and decisions the agent made, including where it nearly went wrong.

Starting from the OOM page

The memory-pressure agent starts from a structured OOMKilled alert. The first two actions are mechanical and bound the search space before reasoning begins.
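The text does not say what the two mechanical actions are. As a minimal sketch, assume they are (1) parsing the alert into typed fields and (2) deriving a bounded scope (one container, one lookback window) for every later query. The payload field names, the `OomAlert` and `SearchScope` types, and the 30-minute lookback are all hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class OomAlert:
    """Typed view of the alert. Field names are assumed, not from the source."""
    namespace: str
    pod: str
    container: str
    killed_at: datetime
    memory_limit_bytes: int


@dataclass(frozen=True)
class SearchScope:
    """The only workload and time window later tool calls may touch."""
    namespace: str
    pod: str
    container: str
    start: datetime
    end: datetime


def parse_alert(raw: str) -> OomAlert:
    """Assumed action 1: reject anything that is not an OOMKilled alert."""
    data = json.loads(raw)
    if data.get("reason") != "OOMKilled":
        raise ValueError(f"not an OOMKilled alert: {data.get('reason')!r}")
    return OomAlert(
        namespace=data["namespace"],
        pod=data["pod"],
        container=data["container"],
        killed_at=datetime.fromisoformat(data["killed_at"]),
        memory_limit_bytes=int(data["memory_limit_bytes"]),
    )


def bound_search(alert: OomAlert,
                 lookback: timedelta = timedelta(minutes=30)) -> SearchScope:
    """Assumed action 2: restrict the search to one container and one window."""
    return SearchScope(
        namespace=alert.namespace,
        pod=alert.pod,
        container=alert.container,
        start=alert.killed_at - lookback,
        end=alert.killed_at,
    )


# Illustrative payload; every value here is made up.
EXAMPLE_ALERT = json.dumps({
    "reason": "OOMKilled",
    "namespace": "payments",
    "pod": "api-7d9f",
    "container": "api",
    "killed_at": "2024-05-01T12:00:00+00:00",
    "memory_limit_bytes": 536870912,
})
```

Doing both steps without any model reasoning means a malformed or mislabeled alert fails fast, and every later query inherits the same narrow scope.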

The reasoning path

The shape of the memory curve before the kill selects the reasoning branch. Three shapes account for almost every OOM cause.
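The three shapes are not named here, so the sketch below assumes one plausible set: a steady ramp (leak-like growth), a sudden spike (one large allocation), and a plateau near the limit (a limit set too low for normal load). The shape names, thresholds, and the `classify_shape` function are illustrative assumptions, not the agent's actual rules.

```python
from enum import Enum
from typing import Sequence


class MemoryShape(Enum):
    RAMP = "ramp"        # assumed: steady growth, leak-like
    SPIKE = "spike"      # assumed: one large jump
    PLATEAU = "plateau"  # assumed: flat and close to the limit
    UNKNOWN = "unknown"  # fall through to a human


def classify_shape(samples: Sequence[float], limit: float) -> MemoryShape:
    """Classify a memory series (same unit as `limit`) into an assumed shape.

    All thresholds are illustrative and would need tuning on real data.
    """
    if len(samples) < 4 or limit <= 0:
        return MemoryShape.UNKNOWN

    usage = [s / limit for s in samples]            # fraction of the limit
    deltas = [b - a for a, b in zip(usage, usage[1:])]
    total_growth = usage[-1] - usage[0]
    largest_step = max(deltas)

    # Spike: a single step is large and explains most of the growth.
    if largest_step >= 0.3 and largest_step >= 0.6 * max(total_growth, 1e-9):
        return MemoryShape.SPIKE

    # Ramp: substantial growth spread across mostly rising steps.
    rising = sum(1 for d in deltas if d > 0)
    if total_growth >= 0.3 and rising >= 0.8 * len(deltas):
        return MemoryShape.RAMP

    # Plateau: consistently high usage with little variation.
    if min(usage) >= 0.85 and max(usage) - min(usage) <= 0.1:
        return MemoryShape.PLATEAU

    return MemoryShape.UNKNOWN


# Hypothetical mapping from shape to the reasoning branch that runs next.
BRANCH_FOR_SHAPE = {
    MemoryShape.RAMP: "investigate_leak",
    MemoryShape.SPIKE: "investigate_large_allocation",
    MemoryShape.PLATEAU: "investigate_limit_sizing",
    MemoryShape.UNKNOWN: "escalate_to_human",
}
```

Keeping an explicit `UNKNOWN` outcome matters more than the exact thresholds: a series that fits none of the shapes should be escalated rather than forced into the nearest branch.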

9-minute resolution

The agent budgets nine minutes end to end. The pacing below keeps the run within the on-call’s patience window while leaving time for human approval.
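One way to enforce such a budget is a small tracker that reserves the approval window up front and tells the agent when to stop investigating. Only the nine-minute total comes from the text; the phase names and the per-phase split below are hypothetical, as is the `RunBudget` class.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

TOTAL_BUDGET_S = 9 * 60  # the nine-minute end-to-end budget

# Hypothetical split of the budget; the per-phase numbers are assumptions.
PHASE_BUDGET_S = {
    "bound_search": 60,
    "collect_evidence": 180,
    "reason": 180,
    "human_approval": 120,
}


@dataclass
class RunBudget:
    """Tracks elapsed time against the total and the reserved approval window."""
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    started_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.started_at = self.clock()

    def elapsed(self) -> float:
        return self.clock() - self.started_at

    def remaining(self) -> float:
        return max(0.0, TOTAL_BUDGET_S - self.elapsed())

    def phase_deadline(self, phase: str) -> int:
        """Seconds from start by which `phase` should be finished."""
        total = 0
        for name, seconds in PHASE_BUDGET_S.items():
            total += seconds
            if name == phase:
                return total
        raise KeyError(phase)

    def should_hand_off(self) -> bool:
        """True once only the reserved approval window is left."""
        return self.remaining() <= PHASE_BUDGET_S["human_approval"]
```

The agent would check `should_hand_off()` between tool calls and, once it returns true, present its best current hypothesis rather than start another query.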

Where the agent nearly went wrong

Three near-misses shaped the prompt. Each surfaced a real failure mode the eval set now covers.

Trust earned

Trust is the long-tail value of an operating agent. The first 90 days build it; the next 180 days harvest it.