Monitoring Platform RFP 2026
A vendor-neutral RFP for monitoring and observability platforms: fifty questions, leave-blank cells that catch bluffing, and the categories that actually matter at $100k, $500k, and $2M deal sizes.
Why monitoring RFPs are different
Monitoring RFPs differ from generic AIOps RFPs because the data plane is more important than the analytics. The platform that can ingest 12TB/day reliably is the one that survives Black Friday; the platform with the slickest dashboards but unreliable ingestion is the one that breaks during your worst incident.
The buyers who get burned on monitoring platforms are the ones who scored vendors on dashboard quality and forgot to ask about ingest reliability, query performance under load, and data retention costs at scale. The vendors know this, which is why most monitoring demos focus on visualisations and avoid the engine.
This RFP inverts the priorities: half the questions are about the data plane, a quarter are about cost at scale, and only a quarter are about the surface UX. The right platform usually loses on visualisation polish and wins on operational reliability. That's the trade you want.
The seven categories
- Ingestion & data plane (10 questions). Throughput, retention, sampling, dropped-data behaviour, agent reliability.
- Query performance & cardinality (8 questions). P95 query latency at your dataset size, cardinality limits, the cliffs.
- Alerting & correlation (7 questions). Rule complexity, anomaly detection quality, false positive baselines.
- Integrations & open standards (6 questions). OpenTelemetry support, Prometheus compatibility, log format flexibility.
- Pricing & cost engineering (8 questions). Per-host vs. per-GB, retention tiers, the overage cliffs, audit access to usage.
- Security, residency, compliance (6 questions). SOC 2 Type II, ISO 27001, data residency, BYOK, audit logs.
- Implementation & support (5 questions). Onboarding speed, support tiers, named CSM threshold.
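When responses come back, keep the weighting honest by scoring in code rather than in a meeting. A minimal sketch, with hypothetical 0-5 category ratings; the weights are just the question counts above, so the data-plane and cost questions dominate by construction:

```python
# Question counts double as category weights (they sum to 50).
WEIGHTS = {
    "ingestion": 10, "query_cardinality": 8, "alerting": 7,
    "integrations": 6, "pricing": 8, "security": 6, "implementation": 5,
}

def weighted_score(ratings: dict[str, float]) -> float:
    """ratings: hypothetical 0-5 rating per category for one vendor."""
    return sum(WEIGHTS[cat] * ratings[cat] for cat in WEIGHTS) / sum(WEIGHTS.values())

vendor_a = {"ingestion": 4.5, "query_cardinality": 4.0, "alerting": 3.5,
            "integrations": 5.0, "pricing": 3.0, "security": 4.0,
            "implementation": 3.5}
print(f"{weighted_score(vendor_a):.2f} / 5")  # 3.94 / 5
```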
Fifty questions worth asking
A representative sample follows. The full template ships as a spreadsheet; these are the questions vendors hate to answer.
Ingestion. "Provide your written committed ingest throughput per agent and per region. Include the behaviour when committed throughput is exceeded, drop, sample, queue, fail. Provide the most recent 12 months of incidents where ingestion was degraded for any customer." Datadog and New Relic publish status pages; Splunk's behaviour at edge is more opaque. The honest answer separates them.
Cardinality. "What's the per-metric cardinality limit on your platform at our subscription tier? What happens when we exceed it? Is there a billable overage or are queries silently downsampled?" Most monitoring buyers don't know cardinality limits exist until their queries return wrong data.
Pricing. "Provide the all-in three-year pricing for our committed scale (X hosts, Y TB/day, Z users, A custom metrics, B containers). Include every overage, premium feature unlock, and 'standard' professional services line item. Lock the price in writing for 90 days." The 90-day quote validity is the discipline that prevents end-of-quarter renegotiation.
Open standards. "Does your platform accept OpenTelemetry-native data without translation? Provide a configuration example. What features stop working if we ingest via OTel only?" Vendors who require their proprietary agent should explain why; usually the answer is feature parity issues their roadmap will close "soon."
Audit access to usage. "Can our team see real-time usage data per host, per service, per metric, granular enough to attribute spikes to specific deployments? Provide a screenshot." Without granular usage visibility, optimisation is guesswork.
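If the answer is yes, the usage export should support exactly this kind of attribution. A sketch over hypothetical per-host export rows, grouping ingest volume by service and deploy:

```python
from collections import defaultdict

# Hypothetical rows from a per-host usage export, one per host per day
records = [
    {"host": "web-01", "service": "checkout", "deploy": "v412", "gb": 2.1},
    {"host": "web-02", "service": "checkout", "deploy": "v413", "gb": 9.7},
    {"host": "api-01", "service": "search",   "deploy": "v88",  "gb": 1.4},
]

by_deploy = defaultdict(float)
for row in records:
    by_deploy[(row["service"], row["deploy"])] += row["gb"]

# Sorted descending, a spike attributes to a specific deploy, not a mystery
for (service, deploy), gb in sorted(by_deploy.items(), key=lambda kv: -kv[1]):
    print(f"{service} {deploy}: {gb:.1f} GB/day")
```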
The leave-blank trick
One technique improves the signal from monitoring RFPs: include three questions where the only acceptable answer is "we don't support this," and watch which vendors fill them in anyway.
Examples. "Confirm that your platform supports negative-cost ingestion for compliance archival." (No platform does.) "Confirm that your AI predicts incidents 24 hours before they occur with 95% accuracy." (No platform does, with anywhere near that accuracy.) "Confirm that your alerting engine guarantees zero false positives." (Impossible.)
Vendors who answer "yes" to these are signalling that they didn't read carefully, or worse, that they'll agree in writing to any feature, which means their later answers carry less weight. Vendors who answer "no, this isn't a real capability" pass the integrity check.
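When responses arrive, the integrity check is mechanical, as in the sketch below; the question IDs are hypothetical placeholders for wherever the trap rows land in your template:

```python
TRAP_QUESTIONS = {"Q48", "Q49", "Q50"}  # hypothetical IDs of the trap rows

def integrity_flags(responses: dict[str, str]) -> list[str]:
    """Return the trap questions a vendor claimed to support."""
    return [q for q in sorted(TRAP_QUESTIONS)
            if responses.get(q, "").strip().lower().startswith(("yes", "confirm"))]

# A non-empty list means every other "yes" deserves a reference call.
print(integrity_flags({"Q48": "Yes, fully supported.", "Q49": "No."}))  # ['Q48']
```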
Vendors this is built to compare
The 2026 monitoring landscape has clear segments. Datadog and New Relic dominate the per-host SaaS market for mid-size companies. Splunk dominates security-adjacent and enterprise log analytics; the new Cisco-era roadmap matters here. Grafana Cloud and Grafana on-prem dominate the open-source-friendly tier. Honeycomb leads on observability-first depth (events, traces, exemplars). Sumo Logic and Logz.io occupy the mid-market log space.
The RFP works for all of them. Where it surfaces differentiation: ingestion reliability favours Datadog and Grafana; cardinality at scale favours Honeycomb; cost predictability favours Grafana on-prem; ecosystem breadth favours Datadog. Cost at the high end usually punishes whoever currently has the highest unit price, which is a moving target.
Running the process
Three to five vendors. Fewer than three wastes leverage; more than five wastes time. The right shortlist is two incumbents you might switch from, two challengers you're seriously considering, and one open-source-friendly option to anchor the pricing conversation.
Four-week cycle. Week one: send RFP, run kickoff calls. Week two: receive responses, run structured demos with identical scripts (same data, same questions, same observers). Week three: reference calls and pricing finalisation. Week four: scoring meeting and decision.
The single discipline that improves the outcome is having one observer attend every demo. The notes-comparison after the fact is where most of the actual scoring happens. Without a consistent observer, every evaluator is comparing different demos against different baselines, and the rubric becomes noise.
End the process with a written justification, three paragraphs: why the chosen vendor won, where the runner-up was stronger, and what you'll renegotiate at year two. The document goes to the buying committee and to the future-you who handles the renewal. Without it, the year-two renewal becomes a re-evaluation from scratch, which is exactly what the vendor wants.