Observability Intermediate By Samson Tanimawo, PhD Published Dec 11, 2026 9 min read

Service-Level Indicators That Survive Refactors

An SLI tied to an internal function name dies the day someone refactors. Durable SLIs are tied to user-facing behaviour, not code shape.

Why SLIs rot

An SLI defined as ‘processOrder() latency at p99’ works until someone renames the function. The metric stops emitting, and because a calculation with no recorded events also records no failures, the SLO appears to be at 100%. Nobody notices for weeks.
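Why a silent metric can read as perfect is easiest to see in a small sketch. The two functions below are hypothetical and are not taken from any particular SLO tool; they assume the SLO is computed as a ratio of good events to total events.

```python
from typing import Optional


def naive_compliance(good: int, total: int) -> float:
    """Treat 'no events' as 'no failures'. A metric that went silent reads as 100%."""
    if total == 0:
        return 1.0
    return good / total


def guarded_compliance(good: int, total: int) -> Optional[float]:
    """Return None when there is no data, so the caller has to handle absence."""
    if total == 0:
        return None
    return good / total
```

With the guarded form, a renamed function yields "no data" rather than a green dashboard, which is something an alert can be attached to.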

Refactors are a constant. SLIs that depend on code internals are coupled to that constant. Decoupling is the whole job.

Four properties of a durable SLI

Implementation patterns

Implement the SLI at the request boundary: the ingress controller, service mesh sidecar, or gateway. The metric then lives at a layer that does not move when the code is refactored.
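A minimal sketch of boundary measurement follows, written as WSGI middleware. It is an in-process stand-in for what an ingress or sidecar would record, not a replacement for one. The names `EdgeSLIMiddleware`, `route_for_path`, and the in-memory dictionaries are assumptions made for illustration; a real setup would export to a metrics backend.

```python
import time
from collections import defaultdict

# Hypothetical in-memory stores, keyed by (route, status_class).
REQUEST_COUNT = defaultdict(int)
REQUEST_SECONDS = defaultdict(float)


class EdgeSLIMiddleware:
    """Record count and latency per route and status class, never per handler name."""

    def __init__(self, app, route_for_path):
        self.app = app
        # route_for_path maps a raw path to a stable route label, e.g. "/orders".
        self.route_for_path = route_for_path

    def __call__(self, environ, start_response):
        started = time.perf_counter()
        # Default to 500 in case the app raises before calling start_response.
        captured = {"status": "500"}

        def recording_start_response(status, headers, exc_info=None):
            captured["status"] = status.split(" ", 1)[0]
            return start_response(status, headers, exc_info)

        try:
            # Latency covers the app call only, not streaming of the response body.
            return self.app(environ, recording_start_response)
        finally:
            route = self.route_for_path(environ.get("PATH_INFO", ""))
            key = (route, captured["status"][0] + "xx")
            REQUEST_COUNT[key] += 1
            REQUEST_SECONDS[key] += time.perf_counter() - started
```

The labels are the route and the status class. Nothing in the metric refers to a function, so renaming the handler behind `/orders` leaves the SLI untouched.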

Where edge measurement is not possible (background workers, for example), use a wrapper utility that the function calls and that emits the metric under a stable name. A refactor that removes the wrapper is then explicit: the deletion is visible in the diff.
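One possible shape for such a wrapper is a context manager keyed by an SLI name rather than a function name. This is a sketch under assumptions: `measure_sli` and the in-memory stores are hypothetical, and a real implementation would emit to whatever metrics client the service already uses.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-memory stores standing in for a metrics client.
SLI_EVENTS = defaultdict(int)     # (sli_name, outcome) -> count
SLI_SECONDS = defaultdict(float)  # sli_name -> total seconds


@contextmanager
def measure_sli(sli_name):
    """Emit one event and its duration under a stable, user-facing SLI name."""
    started = time.perf_counter()
    outcome = "ok"
    try:
        yield
    except Exception:
        outcome = "error"
        raise  # never swallow the worker's exception
    finally:
        SLI_EVENTS[(sli_name, outcome)] += 1
        SLI_SECONDS[sli_name] += time.perf_counter() - started
```

A worker would wrap its unit of work in `with measure_sli("order_fulfilment"):`. The enclosing function can be renamed or split freely; the SLI only disappears if someone deletes the `with` line.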

Test suite for SLI drift

Add a test in CI that hits a known endpoint and checks that the SLI metric increments. If the metric stops emitting, CI fails before merge.
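A drift test can be as small as the pytest-style sketch below. `FakeService` is a hypothetical stand-in so the example runs on its own; in a real pipeline it would be replaced by an HTTP client pointed at a test deployment plus a read of the metrics endpoint. The metric name `http_requests_total` is also an assumption.

```python
class FakeService:
    """Hypothetical stand-in for the service under test."""

    def __init__(self):
        self.metrics = {"http_requests_total": 0}

    def get(self, path):
        # A real service would increment this at the request boundary.
        self.metrics["http_requests_total"] += 1
        return 200

    def read_metric(self, name):
        # Returns None when the metric is not emitted at all.
        return self.metrics.get(name)


def sli_increments(service, path, metric_name):
    """True only if the metric exists and moves when the endpoint is hit."""
    before = service.read_metric(metric_name)
    service.get(path)
    after = service.read_metric(metric_name)
    return before is not None and after is not None and after > before


def test_checkout_sli_is_emitted():
    service = FakeService()
    assert sli_increments(service, "/checkout", "http_requests_total"), (
        "SLI metric did not increment; was the emitting code renamed or removed?"
    )
```

The check fails in two cases: the metric is missing entirely, or it exists but does not move. Both are the silent failure described earlier.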

Add a synthetic check in production that exercises the SLI path on a known cadence. Detection of any drift is then automatic, and the alert is loud.
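A synthetic check along these lines might look like the sketch below. The callables `send_request`, `read_counter`, and `alert` are hypothetical hooks supplied by the caller. The sketch assumes the counter can be read immediately after the request, which is not true of every metrics pipeline.

```python
import time


def probe_once(send_request, read_counter, alert):
    """Send one synthetic request and alert if the SLI counter did not move."""
    before = read_counter()
    send_request()
    after = read_counter()
    if after <= before:
        alert(
            "SLI metric did not move after synthetic request "
            f"(before={before}, after={after})"
        )
        return False
    return True


def run_probe(send_request, read_counter, alert,
              interval_seconds=60.0, iterations=None, sleep=time.sleep):
    """Run the probe on a fixed cadence; iterations=None runs until stopped."""
    completed = 0
    failures = 0
    while iterations is None or completed < iterations:
        if not probe_once(send_request, read_counter, alert):
            failures += 1
        completed += 1
        if iterations is None or completed < iterations:
            sleep(interval_seconds)
    return failures
```

Where metrics are scraped with a delay, the second read would need to wait at least one scrape interval before comparing.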

Antipatterns

What to do this week

Three moves. (1) Inventory your SLIs; flag any defined by code-internal names. (2) Move one to edge measurement at the ingress. (3) Add the CI test that catches drift.
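For the inventory step, a crude first pass can be automated. The heuristic below is an assumption about what "code-internal" looks like in an SLI definition: call syntax, camelCase identifiers, or `Class.method` and `module::function` forms. Expect false positives (a hostname also matches the dotted form), so treat the output as a list to review, not a verdict.

```python
import re

# Heuristic patterns for code-shaped names; tune to your own codebase.
CODE_SHAPED = re.compile(
    r"\w+\(\)"                                  # call syntax, e.g. processOrder()
    r"|\b[a-z]+[A-Z]\w*"                        # camelCase identifier
    r"|[A-Za-z_]\w*(?:::|\.)[A-Za-z_]\w*"       # Class.method or module::function
)


def flag_code_coupled(sli_definitions):
    """Return the SLI definitions that appear to reference code internals."""
    return [d for d in sli_definitions if CODE_SHAPED.search(d)]
```

Run against the example from earlier, `flag_code_coupled` keeps ‘processOrder() latency at p99’ and drops a definition such as ‘checkout request latency at p99, measured at ingress’.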