Connecting an Agent to GitHub for Runbook Updates
After every incident, the agent proposes a runbook update as a PR. This article covers the PR template, the review flow, the metrics that show the improvement is actually compounding, and the guardrails that keep the agent from flooding reviewers.
Runbook updates as PRs
After every incident it handles, the agent proposes a runbook update as a PR. The PR captures what was learned: a new symptom, a new diagnostic step, or a new remediation. PRs are small: a few lines added, often one or two new sections, so a reviewer can approve them in minutes. What matters is the compounding effect: as the runbook gets better, future agent runs handle more cases correctly.
- Per-incident PR. After every agent-handled incident; the learning becomes a tracked artifact.
- Small PRs. Few lines, one or two sections; reviewer approves in minutes.
- Compounding effect. Runbook gets better; future agent runs handle more cases.
- Per-PR provenance link. The PR links the incident; supports audit and learning.
PR template
The PR template is structured. Title: “[Agent] Update X runbook based on incident Y”. Body: incident summary, what the agent did, what was learned, and what is being added to the runbook. Files changed: the runbook file with the addition. Tags (applied as GitHub labels): agent-generated, ready-for-review.
- Structured title. “[Agent] Update X runbook based on incident Y”; reviewable at a glance.
- Four-part body. Incident summary, agent actions, lesson learned, runbook addition.
- Single file changed. The runbook file; minimal blast radius.
- Tags for triage. The agent-generated and ready-for-review labels support filtering and prioritisation.
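The template above can be sketched as a small builder that turns one incident's lesson into a PR payload. This is a minimal illustration, not a definitive implementation: `IncidentLesson`, `build_pr`, the Markdown section headings, and the shape of the returned dictionary are all hypothetical names chosen for the example. Only the title format, the four body parts, the single changed file, and the two tags come from the template itself.

```python
from dataclasses import dataclass


@dataclass
class IncidentLesson:
    """Hypothetical container for what the agent learned from one incident."""
    incident_id: str
    incident_url: str   # provenance link back to the incident
    runbook_name: str
    runbook_path: str
    summary: str
    agent_actions: str
    lesson: str
    addition: str


# The two tags named in the template.
LABELS = ["agent-generated", "ready-for-review"]


def build_pr(lesson: IncidentLesson) -> dict:
    """Build a PR payload that follows the structured template."""
    title = (
        f"[Agent] Update {lesson.runbook_name} runbook "
        f"based on incident {lesson.incident_id}"
    )
    # Four-part body; the incident link is kept with the summary for audit.
    sections = [
        ("Incident summary", f"{lesson.summary}\n\nIncident: {lesson.incident_url}"),
        ("What the agent did", lesson.agent_actions),
        ("What was learned", lesson.lesson),
        ("What is being added to the runbook", lesson.addition),
    ]
    body = "\n\n".join(f"## {heading}\n{text}" for heading, text in sections)
    return {
        "title": title,
        "body": body,
        "labels": list(LABELS),
        "files": [lesson.runbook_path],  # single file changed
    }
```

The returned dictionary is deliberately independent of any GitHub client, so the same payload can be inspected in tests before whatever mechanism the team uses actually opens the PR.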
Review flow
The review flow is fast. The reviewer is the runbook owner, who approves, modifies, or rejects the PR. Most agent-generated PRs are approved with minor edits. Rejections happen when the agent misunderstood the incident; each rejection is logged and feeds back into agent improvement. Approval merges the PR and triggers a downstream sync to the runbook display tool.
- Owner reviews. Runbook owner approves, modifies, or rejects.
- Most approved with minor edits. The agent gets close; humans polish.
- Rejections feed agent improvement. Logged as training signal; the agent gets better.
- Merge triggers display sync. Approval propagates to the runbook display tool.
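The review outcomes described above map onto a small decision function. The sketch below is illustrative only: `Decision`, `handle_review`, the action names, and the in-memory `rejection_log` are hypothetical stand-ins for whatever merge, sync, and logging mechanisms a team actually uses. It also assumes that a modified PR is treated as approved once the owner has finished editing it.

```python
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    MODIFIED = "modified"   # assumption: owner edited the PR, then approved it
    REJECTED = "rejected"


def handle_review(pr_id: str, decision: Decision, reason: str,
                  rejection_log: list) -> list:
    """Return the follow-up actions for a review decision.

    Rejections are appended to rejection_log so they can feed back
    into agent improvement; approvals merge and trigger the display sync.
    """
    if decision is Decision.REJECTED:
        rejection_log.append({"pr_id": pr_id, "reason": reason})
        return ["close_pr"]
    return ["merge_pr", "sync_display_tool"]
```

Returning action names instead of performing the actions keeps the decision logic testable without touching a repository.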
Metrics that prove this works
Three metrics show whether the system works. PRs filed per week: starts at 0, with a target of 5-10 in a healthy team. Approval rate: the target is greater than 70%; a lower rate means the agent is misjudging incidents, while a rate far above the target suggests the agent is too conservative. Agent handle-rate for incidents covered by recent runbook updates: this should improve over time and is the proof of compounding.
- PRs filed per week. Target 5-10 in a healthy team; starts at 0.
- Approval rate > 70%. Lower means the agent misjudges; far higher suggests it is too conservative.
- Agent handle-rate improves. Incidents covered by recent runbook updates; the compounding proof.
- Per-quarter trend dashboard. Trends visible to the team; supports continued attention.
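The three metrics can be computed from two simple record lists. The sketch below assumes hypothetical record shapes: each PR is a dictionary with a `week` label and an `outcome`, and each incident carries two booleans, `covered_by_recent_update` and `handled_by_agent`. It also assumes that approval rate is measured over decided PRs only, so PRs still open are excluded from the denominator.

```python
from collections import Counter


def prs_per_week(prs: list) -> dict:
    """Count agent PRs filed in each week (keyed by the record's week label)."""
    return dict(Counter(pr["week"] for pr in prs))


def approval_rate(prs: list) -> float:
    """Approved PRs as a fraction of decided (approved or rejected) PRs."""
    decided = [pr for pr in prs if pr["outcome"] in ("approved", "rejected")]
    if not decided:
        return 0.0
    approved = sum(1 for pr in decided if pr["outcome"] == "approved")
    return approved / len(decided)


def covered_handle_rate(incidents: list) -> float:
    """Fraction of incidents covered by a recent runbook update
    that the agent handled."""
    covered = [i for i in incidents if i["covered_by_recent_update"]]
    if not covered:
        return 0.0
    handled = sum(1 for i in covered if i["handled_by_agent"])
    return handled / len(covered)
```

Computing these per week or per quarter and plotting the series gives the trend view; a single snapshot says little about compounding.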
Guardrails
Three guardrails prevent the agent from flooding reviewers. Open-PR limit: at most 5 open PRs from the agent at once; the agent waits when the limit is reached. Rate limit: at most 1 PR per runbook per week, which prevents churn on a single document. Stale-PR cleanup: PRs older than 14 days without review are auto-closed, and the agent re-files if the lesson is still relevant.
- 5 open PR limit. Prevents flood; the agent waits when full.
- 1 PR per runbook per week. Prevents churn on a single document.
- 14-day stale cleanup. Auto-closed; agent re-files if still relevant.
- Per-guardrail measurement. How often each limit is hit is tracked, so tuning is data-driven.
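The three limits can be checked before the agent files anything. The following is a minimal sketch under stated assumptions: the PR record shape (`runbook`, `opened_at`, `open`, `reviewed`) and the function names are hypothetical, and "1 PR per runbook per week" is interpreted here as a rolling 7-day window, not a calendar week.

```python
from datetime import datetime, timedelta

MAX_OPEN_PRS = 5                         # at most 5 open agent PRs at once
PER_RUNBOOK_WINDOW = timedelta(days=7)   # assumption: rolling week
STALE_AFTER = timedelta(days=14)         # unreviewed PRs older than this are closed


def may_file_pr(runbook: str, agent_prs: list, now: datetime) -> bool:
    """Return True if filing a new PR for this runbook respects both limits."""
    open_count = sum(1 for pr in agent_prs if pr["open"])
    if open_count >= MAX_OPEN_PRS:
        return False  # the agent waits when full
    # Any PR for the same runbook inside the window blocks a new one.
    recent = [
        pr for pr in agent_prs
        if pr["runbook"] == runbook and now - pr["opened_at"] < PER_RUNBOOK_WINDOW
    ]
    return not recent


def stale_prs(agent_prs: list, now: datetime) -> list:
    """Return open, unreviewed PRs older than the stale threshold."""
    return [
        pr for pr in agent_prs
        if pr["open"] and not pr["reviewed"] and now - pr["opened_at"] > STALE_AFTER
    ]
```

Passing `now` in explicitly keeps both checks deterministic in tests, and logging each `False` result from `may_file_pr` is one way to measure how often a limit is hit.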