The structured process of finding the systemic cause behind an incident, not just the surface symptom.
Root cause analysis (RCA) is the discipline of working backward from an incident's symptoms to the underlying systemic cause. Common techniques are Five Whys, fishbone diagrams, fault tree analysis, and change-correlation (what deployed in the hour before the incident). A good RCA names the systemic cause precisely enough that a team-agnostic reader could implement the fix; vague RCAs ('insufficient testing') don't drive action.
RCA is the part of incident response where the lessons get extracted. Skipping or rushing it is how the same incident pattern recurs across teams: each team finds its own surface fix, none of them understand the systemic cause, and the underlying weakness keeps producing new symptoms. AI-assisted RCA, which scans deploys, config changes, and topology shifts in the incident window, dramatically accelerates the human investigation.
See the part of the platform that handles root cause analysis (rca) in production.