What CI/CD is: the three things people conflate
CI/CD is the automated pipeline that takes a code change from a commit all the way to running, verified, in production. The two letters hide three distinct practices that get conflated constantly, and the confusion is not pedantic: each one implies a different level of automation, a different risk posture, and a different organizational commitment. Naming them precisely is the first step to building or maturing a pipeline that actually fits your team.
Continuous integration is the practice of merging every developer's work into a shared main branch many times a day, and proving each merge with an automated build and an automated test run. The goal is to catch integration problems within minutes of the commit that caused them, while the change is small. Continuous delivery extends that: the tested artifact is always kept in a deployable state, so any green build can be released to production with a single click whenever a human decides to. Continuous deployment goes one step further and removes the human gate entirely. Every change that passes the full pipeline ships to production automatically, with no manual approval.
The first CD, delivery, is about always being ready to ship. The second CD, deployment, is about automatically shipping. They share an abbreviation, which is exactly why people use them interchangeably when they mean very different things. Almost every team should pursue continuous delivery. Whether to go all the way to continuous deployment is a deliberate, situation-specific choice covered later on this page.
Why does the pipeline matter so much? Because it is the single path every change travels on its way to your users. Make that path fast, automated, and safe, and you raise both your delivery speed and your reliability at the same time. Leave it slow and manual, and every release becomes a risky, stressful event. CI/CD is one focused part of the wider DevOps automation surface, and it sits inside the broader DevOps culture of shared ownership between the people who build software and the people who run it. This page is the deep dive on the pipeline itself.
Continuous integration in depth
Continuous integration is the foundation, and it is more a discipline than a tool. The name comes from a simple, hard-won lesson: integrating code is painful in proportion to how long you wait to do it. If ten engineers each work in isolation for two weeks and then merge, the conflicts, the incompatible assumptions, and the broken interfaces all surface at once in a miserable big-bang integration. If instead each engineer merges to main several times a day, every integration is tiny, and any problem is caught within minutes while its author still has full context.
Trunk-based development
The branching model that makes continuous integration real is trunk-based development. Everyone commits to short-lived branches cut from main, ideally living less than a day, and merges back fast. There are no long-running feature branches that drift for weeks. The opposite pattern, where each feature gets a branch that lives until the feature is fully finished, quietly defeats continuous integration: the team is technically using a CI server, but the actual integration of divergent work still happens rarely and painfully. Short-lived branches plus frequent merges are what the practice requires.
Automated builds and fast test feedback
Every merge to main triggers an automated build and an automated test run. The single most important property of that test suite is speed. If the feedback takes forty minutes, engineers context-switch away, stop watching, and start batching changes to avoid the wait, which erodes the whole discipline. The aim is fast feedback measured in minutes: run the quick unit tests first and fail loudly the moment anything breaks, then run the slower integration and end-to-end suites. Parallelizing test execution and running only the tests affected by a change are the usual levers for keeping feedback fast as the codebase grows.
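The "fast tests first, fail loudly" ordering can be sketched as a small runner. This is a minimal illustration, not a real CI system: the stage names and the `run_fail_fast` helper are hypothetical, and each stage is modeled as a callable that returns True on pass.

```python
from typing import Callable, List, Tuple

# Each stage is (name, check). The names and checks here are illustrative.
Stage = Tuple[str, Callable[[], bool]]


def run_fail_fast(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in the given order and stop at the first failure.

    Returns (all_passed, names_of_stages_that_ran).
    """
    ran: List[str] = []
    for name, check in stages:
        ran.append(name)
        if not check():
            # Slower stages listed after this one never start.
            return False, ran
    return True, ran


if __name__ == "__main__":
    stages = [
        ("unit", lambda: True),
        ("integration", lambda: False),  # simulated failure
        ("end_to_end", lambda: True),
    ]
    print(run_fail_fast(stages))
```

Listing the cheapest suite first means a broken change is usually rejected before the expensive suites consume any time.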
Keep main green
The cultural rule that ties it together is keep main green. A broken build on main is the team's top priority, ahead of any new feature work, because while main is red nobody can safely integrate on top of it and the whole team is blocked. Mature teams enforce this with branch protection that refuses to merge a change unless the pipeline passes, so a red main becomes rare by construction rather than by heroics. The payoff of high merge frequency on a green main is that integration stops being an event and becomes a continuous, boring, low-risk background hum, which is exactly the goal.
The common failure mode. Many teams say they "do CI" because a pipeline runs on every pull request, while branches still live for a week or two before merging. That is automated testing, which is valuable, but it is not continuous integration. The integration is still infrequent and still risky. The tell is branch lifetime: if your typical branch lives longer than a day, you have a build server, not continuous integration.
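The branch-lifetime tell is easy to check mechanically. The sketch below assumes you already have each branch's creation time from your version-control host; the `long_lived_branches` helper and the sample branch names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# Threshold taken from the "longer than a day" rule of thumb above.
MAX_BRANCH_AGE = timedelta(days=1)


def long_lived_branches(created: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of branches older than MAX_BRANCH_AGE, oldest first."""
    stale = [
        (now - created_at, name)
        for name, created_at in created.items()
        if now - created_at > MAX_BRANCH_AGE
    ]
    return [name for _, name in sorted(stale, reverse=True)]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    branches = {
        "fix-typo": datetime(2024, 1, 10, 9, 0),
        "big-feature": datetime(2024, 1, 1, 9, 0),
    }
    print(long_lived_branches(branches, now))
```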
Continuous delivery vs continuous deployment
Both abbreviate to CD, and the only structural difference between them is a single gate: whether a human presses the button before production. That one gate, present or absent, changes the operating model significantly.
| Dimension | Continuous delivery | Continuous deployment |
|---|---|---|
| Production gate | Human approves the release | No gate; passing change ships automatically |
| Release cadence | On demand, when the team chooses | Every green commit, continuously |
| Test coverage required | High | Very high; the tests are the only gate |
| Rollback maturity | Recommended | Mandatory and fast |
| Best fit | Regulated, change-window, high-blast-radius | Deep automation, progressive delivery in place |
| Audit and change control | Explicit human sign-off per release | Encoded in the pipeline and policy |
When continuous delivery is the right call. If you operate in a regulated environment that requires a human sign-off on each production change, if releases must land inside an approved change window, or if a single bad deploy has a very large blast radius, the human gate is a feature, not a weakness. Continuous delivery gives you the speed of an always-ready pipeline while keeping the deliberate choice of when to release in human hands.
When continuous deployment is the right call. If you have deep automated test coverage, fast and reliable rollback, and progressive-delivery safety nets so that a bad change reaches very few users before it is caught, removing the human gate removes a bottleneck without adding meaningful risk. The pipeline becomes the gate, and it is a more consistent gatekeeper than a tired human at the end of a sprint.
Release is not the same as deploy
A crucial idea that unlocks both models is that deploying and releasing are different events. Deploy means the new code is running on production servers. Release means users can actually see and use the new behavior. Feature flags separate the two: you deploy the code dark, with the new feature wrapped in a flag that is switched off, so it runs in production but is invisible to users. Later you release it by flipping the flag on, for everyone or for a percentage, with no redeploy. This separation lets even a continuous-deployment team ship code to production constantly while still controlling, carefully and reversibly, when each user-facing change actually goes live.
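A percentage release behind a flag can be sketched with deterministic hashing, so the same user always lands in the same bucket. This is one common approach, not a description of any particular flag service; the `bucket` and `is_enabled` names and the flag name are hypothetical.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """True if this user falls inside the current rollout percentage."""
    return bucket(flag, user_id) < rollout_percent


if __name__ == "__main__":
    # Deployed dark: the code is in production, but 0 percent of users see it.
    print(is_enabled("new-checkout", "user-42", 0))
    # Released to everyone by changing configuration only, with no redeploy.
    print(is_enabled("new-checkout", "user-42", 100))
```

Because the bucket is stable, raising the percentage only adds users; nobody who already has the feature loses it partway through the rollout.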
Anatomy of a pipeline, stage by stage
A mature CI/CD pipeline is a sequence of stages, each of which is a gate: the change only advances if the stage passes. Below is the canonical shape. Real pipelines add, reorder, or parallelize stages, but the logical flow from commit to verified release is consistent across teams.
1. Source
A commit or a merge to a watched branch triggers the run. The trigger event, the exact commit, and the person who pushed it are all captured so the run is fully traceable. This is also where pull-request checks run before code ever reaches main.
2. Build
The code is compiled and packaged into a deployable artifact: a binary, a container image, or a bundle. The build must be reproducible, so the same commit always produces the same artifact. Caching dependencies keeps this stage fast as the project grows.
3. Test
Unit tests run first for fast feedback, then integration tests that exercise components together, then end-to-end tests that drive the system the way a user would. Ordering fast tests first means most failures surface in seconds, not after the slow suites finish.
4. Security scanning
Static analysis of the code, dependency scanning for known vulnerable libraries, container image scanning, and secret detection. These gates shift security left, catching issues in the pipeline rather than in a production incident or an audit months later.
5. Artifact
The verified build is published to a registry with an immutable, versioned identity. The exact artifact that passed the tests is the artifact that gets deployed, with no rebuild in between, so what you tested is precisely what you ship.
6. Deploy and verify
The artifact rolls out to staging and then production using a chosen strategy. Post-deploy health checks and observability confirm the release is healthy, and an automated rollback fires if it is not. The deploy is not done until the verify stage says the system is healthy.
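The artifact stage's promise, that what you tested is precisely what you ship, can be illustrated with content addressing: the artifact's identity is a digest of its bytes, so any rebuild that changes the bytes changes the identity. The `Registry` class below is a hypothetical in-memory stand-in, not the API of any real registry.

```python
import hashlib
from typing import Dict


class Registry:
    """In-memory stand-in for an artifact registry (hypothetical)."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def publish(self, artifact: bytes) -> str:
        """Store the artifact under its SHA-256 digest and return that digest."""
        digest = hashlib.sha256(artifact).hexdigest()
        # Existing entries are never overwritten, so an identity is immutable.
        self._store.setdefault(digest, artifact)
        return digest

    def fetch(self, digest: str) -> bytes:
        """Return the artifact, verifying it still matches its digest."""
        artifact = self._store[digest]
        if hashlib.sha256(artifact).hexdigest() != digest:
            raise ValueError("artifact does not match its digest")
        return artifact


if __name__ == "__main__":
    registry = Registry()
    tested = registry.publish(b"app build output")
    # The deploy stage asks for the digest that passed the tests, not "latest".
    assert registry.fetch(tested) == b"app build output"
```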
Pipeline as code
The pipeline itself lives as code in the repository, alongside the application it builds. Defining the pipeline declaratively, rather than clicking it together in a web UI, gives it the same benefits the application code already has: it is version-controlled, reviewed in pull requests, diffable, and rolled back by reverting a commit. It also makes the pipeline reproducible across projects and recoverable if the CI system is rebuilt. A pipeline you cannot review in a diff is a pipeline nobody fully understands. Keeping it as code is what makes the whole delivery process auditable and maintainable.
Deployment strategies and fast rollback
How you push the new artifact into production is where the pipeline meets reliability directly. The goal is to deploy without downtime and to make rolling back fast and boring. Four strategies cover almost every case, and mature teams combine them.
Rolling deployment
Instances are replaced in batches rather than all at once, so the service stays available throughout. During the rollout the old and new versions run side by side, which keeps capacity up but requires that the two versions be compatible, especially around database schemas and API contracts. Rolling is the default in most container orchestrators because it is simple and needs no extra infrastructure, but its rollback is slower because it has to roll the batches back the way it rolled them forward.
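A rolling rollout and its slower rollback can be sketched in a few lines. This is a simplified model under stated assumptions: instances are entries in a dictionary, `healthy` is a hypothetical callback standing in for a real health probe, and the rollback walks the upgraded instances in reverse order.

```python
from typing import Callable, Dict, List


def rolling_deploy(
    versions: Dict[str, str],
    new_version: str,
    batch_size: int,
    healthy: Callable[[str], bool],
) -> bool:
    """Upgrade instances batch by batch; return True if the rollout completed.

    If any instance in a batch is unhealthy, every instance upgraded so far
    is returned to its previous version, newest change first.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    previous = dict(versions)
    names = list(versions)
    upgraded: List[str] = []
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        for name in batch:
            versions[name] = new_version
            upgraded.append(name)
        if not all(healthy(name) for name in batch):
            for name in reversed(upgraded):
                versions[name] = previous[name]
            return False
    return True


if __name__ == "__main__":
    fleet = {"web-1": "v1", "web-2": "v1", "web-3": "v1", "web-4": "v1"}
    print(rolling_deploy(fleet, "v2", 2, lambda name: True), fleet)
```

Between batches the fleet runs both versions at once, which is why the two versions have to stay compatible for the duration of the rollout.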
Blue/green
Two identical production environments exist: the live one (blue) and an idle one (green). You deploy the new version to green, verify it in isolation, then flip all traffic from blue to green at once. The rollback is the fastest of any strategy because it is just flipping traffic back to blue, which is still running the previous version untouched. The cost is running two full environments, and care is needed so that in-flight requests and shared state, such as the database, are handled cleanly across the switch. This is the model the Nova marketing site itself uses for deploys.
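The traffic flip can be modeled as a single pointer swap, which is why rollback is so cheap. The `BlueGreenRouter` class is a hypothetical sketch; a real setup would flip a load balancer or DNS target and would still need to handle in-flight requests and shared state.

```python
from typing import Dict, Optional


class BlueGreenRouter:
    """Tracks which of two environments receives all traffic (hypothetical)."""

    def __init__(self, live_version: str) -> None:
        self.environments: Dict[str, Optional[str]] = {"blue": live_version, "green": None}
        self.live = "blue"

    @property
    def idle(self) -> str:
        return "green" if self.live == "blue" else "blue"

    def deploy_to_idle(self, version: str) -> None:
        """Install a version on the idle environment; live traffic is untouched."""
        self.environments[self.idle] = version

    def switch(self) -> None:
        """Send all traffic to the idle environment. Calling it again rolls back."""
        if self.environments[self.idle] is None:
            raise RuntimeError("nothing deployed to the idle environment")
        self.live = self.idle

    def serving(self) -> Optional[str]:
        return self.environments[self.live]


if __name__ == "__main__":
    router = BlueGreenRouter("v1")
    router.deploy_to_idle("v2")
    router.switch()   # cut over to v2
    router.switch()   # roll back: v1 was never touched
    print(router.serving())
```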
Canary
The new version is released to a small slice of traffic first, perhaps one percent, while its metrics are watched closely. If error rates, latency, and the key business signals stay healthy, the rollout is promoted in steps to larger and larger shares until it serves everyone. If the metrics degrade, the canary is pulled and the blast radius is limited to that initial slice. Canary is the safest strategy for high-risk changes because the system itself, ideally automatically, decides whether to promote or abort based on real production signals.
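The promote-or-abort decision can be sketched as a loop over traffic steps. This is a deliberately small model: it judges only an error rate, whereas the text above also mentions latency and business signals, and `error_rate_at` is a hypothetical callback standing in for a query against your metrics system.

```python
from typing import Callable, List, Tuple


def run_canary(
    steps: List[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float,
) -> Tuple[str, int]:
    """Walk through traffic percentages, aborting at the first unhealthy step.

    Returns ("promoted", final_percent) or ("aborted", percent_at_abort).
    """
    if not steps:
        raise ValueError("steps must not be empty")
    for percent in steps:
        if error_rate_at(percent) > max_error_rate:
            return "aborted", percent
    return "promoted", steps[-1]


if __name__ == "__main__":
    # Simulated metrics: the new version degrades once it takes 10% of traffic.
    observed = lambda percent: 0.05 if percent >= 10 else 0.001
    print(run_canary([1, 10, 50, 100], observed, max_error_rate=0.01))
```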
Feature flags
Feature flags operate at a different layer: they decouple deploy from release entirely. The code ships dark behind a flag, and you turn the feature on with a configuration change, for everyone or a percentage of users, with no redeploy. Rolling back a feature is flipping its flag off, which takes seconds and touches no infrastructure. Most mature teams combine canary or blue/green for the infrastructure-level deploy with feature flags for the product-level release, getting fast, low-risk control at both layers.
The unifying theme across all four is that fast rollback is the real safety net. A pipeline that ships often is only safe if a bad change can be reversed quickly, which is why automated rollback wired to health checks turns a regression from an outage into a non-event. This is the hinge that connects delivery speed to site reliability engineering, and it is where MTTR is won or lost.
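Rollback wired to health checks can be sketched as a verification loop that decides which version is left running. The function name and the `max_failures` threshold are assumptions for illustration; requiring consecutive failures is one simple way to avoid rolling back on a single flaky probe.

```python
from typing import Iterable


def verify_or_roll_back(
    current: str,
    candidate: str,
    health_checks: Iterable[bool],
    max_failures: int = 2,
) -> str:
    """Return the version left running after post-deploy verification.

    health_checks yields probe results taken after the candidate is deployed.
    max_failures consecutive failed probes trigger a rollback to current.
    """
    failures = 0
    for ok in health_checks:
        failures = 0 if ok else failures + 1
        if failures >= max_failures:
            return current  # automated rollback, no human decision needed
    return candidate


if __name__ == "__main__":
    print(verify_or_roll_back("v1", "v2", [True, False, False]))
```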
CI/CD metrics and the four DORA keys
You cannot improve a pipeline you do not measure, and the most widely adopted measurement framework is DORA, which defines four key metrics for software delivery performance. The elegant thing about the four keys is that they balance throughput against stability, so you cannot game one by sacrificing the others.
| DORA metric | What it measures | Elite target |
|---|---|---|
| Deployment frequency | How often you ship to production | On demand, multiple per day |
| Lead time for changes | Commit to running in production | Less than one hour |
| Change failure rate | Share of deploys that cause a degradation | Under 15% |
| Failed deployment recovery | Time to restore service after a bad change | Less than one hour |
The first two keys, deployment frequency and lead time, measure throughput: how fast and how often you can deliver change. The other two, change failure rate and failed deployment recovery time (often expressed as MTTR), measure stability: how often your changes break things and how fast you recover when they do. The research behind DORA found that high performers score well on both pairs at once. Speed and safety are not a trade-off in a well-built pipeline; they rise together, because the same practices that make deploys frequent (small batches, strong automation, and fast rollback) are exactly what make deploys safe.
This is why pipeline quality maps so directly onto the four keys. A slow, manual pipeline lengthens lead time and depresses deployment frequency. A pipeline with weak tests and no progressive delivery raises change failure rate. A pipeline with no automated rollback lengthens recovery time. Improving the pipeline improves all four metrics, which is the most legible way to demonstrate the value of investing in CI/CD to leadership.
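The four keys can be computed from a plain log of deploys. The sketch below is one reasonable reading of the definitions in the table, not an official DORA calculation: the `Deploy` record and its field names are hypothetical, and medians are used for lead time and recovery time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deploy:
    committed_at: datetime                   # when the change was committed
    deployed_at: datetime                    # when it was running in production
    failed: bool = False                     # did it cause a degradation?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def dora_metrics(deploys: List[Deploy], period_days: int) -> dict:
    """Summarize the four keys for deploys observed over period_days."""
    if not deploys or period_days < 1:
        raise ValueError("need at least one deploy and a period of at least one day")
    failures = [d for d in deploys if d.failed]
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at is not None]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "median_recovery_time": median(recoveries) if recoveries else None,
    }


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 9, 0)
    log = [
        Deploy(t0, t0 + timedelta(minutes=30)),
        Deploy(t0, t0 + timedelta(minutes=90), failed=True,
               restored_at=t0 + timedelta(minutes=110)),
    ]
    print(dora_metrics(log, period_days=1))
```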
CI/CD meets reliability and AI
Here is the uncomfortable truth that connects the pipeline to operations: change is the single largest cause of production incidents. The pipeline is the thing that ships change. So the pipeline is also, indirectly, the thing that ships most of your incidents. A faster pipeline with no safety net does not make you more reliable; it just makes you break things faster. The reliability win comes from pairing pipeline speed with the safety mechanisms that catch a bad change before it reaches everyone.
The loop that closes the gap has three parts. Progressive delivery, whether by canary or feature flags, ensures a regression touches a small slice of users first. Automated rollback, wired to health checks, reverses a bad deploy in seconds without waiting for a human decision. And observability gives the deploy something to be judged against: the metrics, logs, and traces that say whether the new version is actually healthy. Put together, a regression the pipeline introduced is detected, correlated to the deploy that caused it, and reversed before it becomes an outage. That is the difference between a pipeline that ships incidents and a pipeline that catches them.
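The correlation step can be illustrated with the simplest possible heuristic: blame the most recent deploy that landed shortly before the regression began. This is a naive sketch for illustration only, not how any particular product performs correlation; the `suspect_deploy` helper, the 30-minute window, and the deploy names are all assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def suspect_deploy(
    deploys: List[Tuple[str, datetime]],
    regression_at: datetime,
    window: timedelta = timedelta(minutes=30),
) -> Optional[str]:
    """Return the latest deploy that landed within window before the regression.

    deploys is a list of (name, deployed_at) pairs. Returns None when no
    deploy is close enough to be a plausible cause.
    """
    candidates = [
        (deployed_at, name)
        for name, deployed_at in deploys
        if timedelta(0) <= regression_at - deployed_at <= window
    ]
    return max(candidates)[1] if candidates else None


if __name__ == "__main__":
    deploys = [
        ("api-101", datetime(2024, 1, 1, 9, 0)),
        ("api-102", datetime(2024, 1, 1, 13, 50)),
    ]
    print(suspect_deploy(deploys, datetime(2024, 1, 1, 14, 0)))
```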
This is exactly where an AI operations layer earns its place. A change ships through the pipeline, and Nova AI Ops watches the deploy across AWS, GCP, Azure, Linux, and Windows at once. When a regression appears (a latency spike, an error-rate jump, a saturated resource), the platform correlates it back to the specific deploy that introduced it, rather than leaving an on-call engineer to guess at 3am whether the new release is the culprit. Within a policy envelope you define, it then auto-remediates, including triggering the rollback, and escalates to a human only when the situation falls outside that envelope. The pipeline ships the change; the AI layer makes the consequence of a bad change a non-event instead of a page.
For the architecture behind that closed loop, see the guides to AI SRE, Agentic SRE, and self-healing infrastructure, and the foundations in observability and AIOps. The pipeline and the operations layer are two halves of the same delivery loop: one ships change, the other catches what the change breaks.
A 90-day plan and a 10-point checklist
Whether you are building a pipeline from scratch or maturing one that stalled at automated testing, the same sequence applies: fix the foundation, add safe delivery, then close the loop with operations. Here is a practical 90-day plan.
Days 1–30: Fix the pipeline foundation
Get build and test into one automated pipeline that runs on every commit, so main is always proven. Define the pipeline as code in the repository. Make the feedback fast enough (minutes, not tens of minutes) that engineers actually wait for it, by parallelizing tests and running only what a change affects. Adopt trunk-based development with short-lived branches and branch protection that refuses to merge a red build. By day 30, every change is integrated and tested continuously, and main is reliably green.
Days 31–60: Add progressive delivery and rollback
Add quality and security gates: static analysis, dependency and container scanning, and secret detection. Publish an immutable, versioned artifact so what you test is exactly what you deploy. Automate deploy to staging and then production with a safe strategy such as canary or blue/green, and wire in post-deploy health checks with automated rollback. By day 60, a deploy is a routine, reversible event rather than a stressful manual ceremony.
Days 61–90: Close the loop with observability and remediation
Connect the pipeline to your observability stack so every deploy is judged against real metrics, logs, and traces. Add automated canary analysis that promotes or aborts based on those signals. Finally, wire in an operations layer that detects a regression, correlates it to the deploy that caused it, and remediates within a policy envelope, including rollback, so a bad change is caught and reversed without a human at 3am. By day 90, velocity and stability rise together rather than trading off.
The 10-point CI/CD checklist
Use this to grade an existing pipeline or to scope a new one. A pipeline that can answer yes to all ten is in excellent shape.
- Does every commit trigger an automated build and test run? If integration only happens on long-lived branches, you have automated testing, not continuous integration.
- Is test feedback fast enough that engineers wait for it? Minutes, not tens of minutes. Slow feedback quietly kills the discipline.
- Is main kept green by branch protection? A red build should block merges automatically, not depend on someone noticing.
- Is the pipeline defined as code in the repository? Reviewable, diffable, and rolled back by reverting a commit, not clicked together in a UI.
- Do security scans gate the pipeline? Static analysis, dependency and container scanning, and secret detection running on every change.
- Is the deployed artifact immutable and versioned? What you tested is exactly what you ship, with no rebuild between test and deploy.
- Do you deploy with a safe strategy? Canary, blue/green, or rolling, never an all-at-once replace with no fallback.
- Can you roll back fast and automatically? Health-gated rollback in seconds is the real safety net behind frequent deploys.
- Do you separate deploy from release with feature flags? Shipping code dark and releasing behavior independently shrinks the risky moment.
- Do you track the four DORA metrics? Deployment frequency, lead time, change failure rate, and recovery time, watched over time.
Related guides
CI/CD is the focused pipeline view; go up and out from here. The broader automation surface this pipeline lives inside is DevOps automation, and the culture around it is DevOps. The sibling discipline for provisioning is infrastructure as code. On reliability foundations: site reliability engineering, AI SRE, Agentic SRE, and AIOps. On the metrics and practices a pipeline moves: MTTR, SLOs and error budgets, incident management, and self-healing infrastructure. On telemetry and operations: observability, monitoring, and chaos engineering. On the day-to-day work: eliminating toil, building runbooks, and running blameless postmortems. For teams shipping AI systems: LLMOps and the AI engineer's guide to production reliability. See it all in one place on the features overview.
Nova AI Ops is the Multi-Agent OS for SRE & DevOps. 100 specialized AI agents across 12 teams watch every deploy across AWS, GCP, Azure, Linux, and Windows, correlate regressions to the change that caused them, and roll back within a policy envelope. Free tier available for small teams.