Prompt Version Control: The Discipline That Pays Off

Prompts are code. Version them, review them, test them. This piece covers the git workflow for prompts and the eval gate that protects every change.

Prompts in git

Treating prompts like code starts with version control. Each prompt lives in its own .md or .txt file in the repo, where it is versioned, reviewable, and diffable. PRs that change prompts get the same review as PRs that change code. Tag each prompt version with a release identifier so that every model invocation logs the version it used and debugging is reproducible.
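As a rough illustration of logging the prompt version with each invocation, here is a minimal Python sketch. The names Prompt, load_prompt, and log_invocation, the record fields, and the idea of storing a content hash next to the release identifier are all assumptions made for this example, not part of any particular library.

```python
import hashlib
import json
import logging
from dataclasses import dataclass
from pathlib import Path

logger = logging.getLogger("prompts")


@dataclass(frozen=True)
class Prompt:
    name: str      # derived from the file name, e.g. "summarize"
    version: str   # release identifier, e.g. a git tag (hypothetical scheme)
    text: str      # the prompt body as stored in the repo
    sha256: str    # content hash, so a log line also pins the exact text


def load_prompt(path: Path, version: str) -> Prompt:
    """Read a prompt file and attach the release identifier and a content hash."""
    text = path.read_text(encoding="utf-8")
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return Prompt(name=path.stem, version=version, text=text, sha256=digest)


def log_invocation(prompt: Prompt, request_id: str) -> dict:
    """Emit one structured log record per model call and return it."""
    record = {
        "request_id": request_id,
        "prompt": prompt.name,
        "prompt_version": prompt.version,
        "prompt_sha256": prompt.sha256[:12],  # short hash is enough for lookup
    }
    logger.info(json.dumps(record))
    return record
```

Logging the hash alongside the tag is one possible design choice: if a tag is ever moved or a file is edited without a new release, the mismatch shows up in the logs.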

Eval gate on every PR

An eval gate is what lets prompt quality build up over time. Every prompt PR runs the eval suite: on a pass, the merge proceeds; on a fail, the PR stays open until the prompt or the eval is fixed. An override is allowed, but it must be written down (“Accepting eval regression on case-12 because new prompt fixes case-37 which is more important”) and logged. Without the gate, prompts drift; with it, they compound.
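The gate logic described above could look something like the following Python sketch. EvalCase, GateResult, and run_gate are hypothetical names, and generate stands in for whatever function calls the model with the candidate prompt; a real suite would likely score outputs in a richer way than a boolean check.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class EvalCase:
    case_id: str
    input_text: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


@dataclass(frozen=True)
class GateResult:
    passed: bool
    failures: list[str]             # ids of the cases that failed
    override_reason: Optional[str]  # set only when a failure was overridden


def run_gate(
    cases: list[EvalCase],
    generate: Callable[[str], str],
    override_reason: Optional[str] = None,
) -> GateResult:
    """Run every eval case; pass only if all succeed or an override is written."""
    failures = [c.case_id for c in cases if not c.check(generate(c.input_text))]
    if not failures:
        return GateResult(passed=True, failures=[], override_reason=None)
    # A blank override does not count: the reason must actually be written.
    if override_reason and override_reason.strip():
        return GateResult(True, failures, override_reason.strip())
    return GateResult(False, failures, None)
```

In CI, a wrapper script would presumably exit non-zero when passed is False and write the failures and any override reason to the build log, so the exception stays visible in the PR history.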

What to put in the prompt vs in code

The split between prompt and code is opinionated. In the prompt: reasoning steps, output format, and constraints that can be expressed in language. In code: routing, validation, deterministic logic, and tool calls. When in doubt, push toward code, because code can be tested deterministically while prompt behavior is stochastic.
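To make the split concrete, here is a small Python sketch in which the prompt would ask the model for a JSON object while validation and routing stay in code. The label set, the field names label and confidence, the 0.5 threshold, and the queue names are all invented for illustration.

```python
import json

# Hypothetical label set; the prompt would ask the model to pick one of these.
ALLOWED_LABELS = {"billing", "technical", "other"}


def parse_classification(raw: str) -> dict:
    """Validate model output deterministically instead of trusting the prompt."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be between 0 and 1")
    return {"label": label, "confidence": float(confidence)}


def route(result: dict) -> str:
    """Routing is plain code: same input, same queue, every time."""
    if result["confidence"] < 0.5:  # illustrative threshold
        return "human_review"
    return f"{result['label']}_queue"
```

Both functions can be covered by ordinary unit tests with no model call, which is the practical payoff of keeping this logic out of the prompt.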

Rollback when something regresses

Rollback discipline closes the loop. Production logs the prompt version for each request, so a regression can be traced to the prompt change that caused it. Rollback is a single PR that reverts the offending change, which keeps it fast and itself reversible. After the rollback, write the eval case that would have caught the regression, so that a repeat of it fails loudly.
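One way to trace a regression to a prompt version from per-request logs is sketched below in Python. It assumes each log record is a dict with a prompt_version field and a boolean ok field; both field names and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Iterable


def failure_rate_by_version(records: Iterable[dict]) -> dict[str, float]:
    """Group request logs by prompt version and compute the failure rate of each."""
    totals: dict[str, int] = defaultdict(int)
    failures: dict[str, int] = defaultdict(int)
    for record in records:
        version = record["prompt_version"]  # assumed log field
        totals[version] += 1
        if not record["ok"]:                # assumed success flag
            failures[version] += 1
    return {version: failures[version] / totals[version] for version in totals}
```

A jump in the failure rate between two adjacent versions points at the prompt change to revert, and the failing requests behind that jump are natural candidates for the new eval case.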