Performance Budgets as Engineering Discipline

A performance budget either ratchets down or holds; it is never loosened silently. Without one, performance degrades quietly with every release.

Why budgets prevent drift

Performance regressions are death by a thousand papercuts. Each PR adds a tiny cost; nobody notices; a year later the site is slow.

Gradual decay. Each individual regression is below the noise floor; cumulatively, performance erodes. A 1% slowdown per PR, for example, compounds to roughly 64% after 50 PRs.
Nobody notices. Without a budget, no specific PR triggers attention; the regression hides in plain sight.
Catch at PR time. Budgets fail the build; the PR author owns the regression; fixes happen before merge.
Forcing function. Budgets force the architectural conversation when limits are hit, not in a panic later.

Four budget categories

1. Bundle size.
2. p99 endpoint latency.
3. Memory per request.
4. CPU per request.

Per-budget thresholds

Each budget category has its own threshold rule. Some ratchet down over time; others hold a baseline. The pattern matches the metric.

Bundle size. Ratchet down; never let it increase without justification; the discipline forces real cleanup.
p99 latency. Hold within 5% of the last release; small drift is acceptable, large drift fails the check.
Memory. Hold within 10% of baseline; allocations are noisier than latency.
CPU. Hold within 10% of baseline; per-request CPU is what drives infrastructure cost.
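
The two threshold rules above can be sketched as a small data model. This is a minimal illustration, not a prescribed implementation: the Budget class, its field names, and every baseline number below are hypothetical; only the ratchet/hold split and the 5% and 10% tolerances come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    """One budget category. All names and numbers here are illustrative."""
    name: str
    mode: str                # "ratchet" or "hold"
    baseline: float          # last accepted value for this metric
    tolerance: float = 0.0   # fractional slack, used only by "hold" budgets

    def limit(self) -> float:
        # A ratchet budget allows no growth; a hold budget allows
        # baseline * (1 + tolerance).
        if self.mode == "ratchet":
            return self.baseline
        return self.baseline * (1.0 + self.tolerance)

    def passes(self, measured: float) -> bool:
        return measured <= self.limit()

    def next_baseline(self, measured: float) -> float:
        # A ratchet budget tightens whenever the metric improves.
        # A hold budget keeps its baseline; updating it at release time
        # is assumed to happen elsewhere.
        if self.mode == "ratchet" and measured < self.baseline:
            return measured
        return self.baseline


# Hypothetical baselines; replace them with your own measurements.
BUDGETS = [
    Budget("bundle_size_kb", "ratchet", 350.0),
    Budget("p99_latency_ms", "hold", 200.0, 0.05),
    Budget("memory_per_request_mb", "hold", 12.0, 0.10),
    Budget("cpu_ms_per_request", "hold", 8.0, 0.10),
]
```

With these made-up numbers, a bundle that shrinks to 340 KB tightens the bundle budget to 340 KB, while a p99 of 209 ms passes the latency budget and leaves its baseline at 200 ms.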

CI integration

Budgets without CI enforcement are decoration. The check has to fire before merge or the regression slips through.

CI step. Load test plus assertion in CI; bundle-size analyser plus assertion for frontend bundles.
Fail-the-PR. Budget exceeded without 'allow-regression' tag fails the build; merge blocked.
Allow-regression. Reviewer-approved escape hatch; written justification plus dated cleanup ticket.
Cleanup follow-through. Allow-regression tickets tracked; quarterly review closes them or escalates.
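
The fail-the-PR rule and its escape hatch could look roughly like the gate below. It is a sketch under stated assumptions: the gate function, its parameters, and the way labels, justification, and ticket reach it are all hypothetical, since the text does not name a CI system. Reviewer approval is assumed to be enforced by restricting who may apply the 'allow-regression' tag.

```python
from typing import Dict, List, Set, Tuple

ALLOW_TAG = "allow-regression"  # tag name taken from the text


def gate(
    measured: Dict[str, float],
    limits: Dict[str, float],
    labels: Set[str],
    justification: str = "",
    cleanup_ticket: str = "",
) -> Tuple[int, List[str]]:
    """Return (exit_code, messages); a non-zero exit code blocks the merge."""
    violations = []
    for name, limit in limits.items():
        if name not in measured:
            # A missing measurement is treated as a failure, not a pass.
            violations.append(f"{name}: no measurement")
        elif measured[name] > limit:
            violations.append(f"{name}: {measured[name]:g} exceeds limit {limit:g}")

    if not violations:
        return 0, ["all budgets met"]

    if ALLOW_TAG in labels:
        # The escape hatch needs both a written justification and a
        # dated cleanup ticket; the tag alone is not enough.
        if justification.strip() and cleanup_ticket.strip():
            note = f"allowed by {ALLOW_TAG}; cleanup tracked in {cleanup_ticket}"
            return 0, violations + [note]
        return 1, violations + [
            f"{ALLOW_TAG} requires a justification and a cleanup ticket"
        ]

    return 1, violations
```

A CI step would print the messages and call sys.exit with the returned code. Returning the code instead of exiting inside the function keeps the gate easy to unit test.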

Antipatterns

No budgets. Drift wins.
Budgets without enforcement. Decoration.
Allow-regression as default. Defeats the discipline.

What to do this week

Three moves. (1) Apply this pattern to your slowest production endpoint. (2) Measure p99 before and after. (3) Document the win and ship the runbook so the team can reproduce it.
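
For the before/after measurement, a nearest-rank p99 over raw latency samples is enough to get started. The sketch below assumes you already have per-request latencies collected as plain numbers; the function names are hypothetical, and nearest-rank is one of several common percentile definitions, so your load-testing tool may report slightly different values.

```python
from typing import Dict, Sequence


def p99(samples: Sequence[float]) -> float:
    """Nearest-rank 99th percentile: the smallest sample such that at
    least 99% of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("p99 needs at least one sample")
    ordered = sorted(samples)
    # ceil(0.99 * n) computed with integers to avoid float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def compare(before: Sequence[float], after: Sequence[float]) -> Dict[str, float]:
    """Summarise a before/after run; a negative change_pct is an improvement."""
    b, a = p99(before), p99(after)
    return {
        "before_p99": b,
        "after_p99": a,
        "change_pct": (a - b) / b * 100.0,
    }
```

Collect both sample sets under the same load profile, otherwise the comparison reflects traffic differences rather than the change itself.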