Root Cause Is a Fallacy

Categories: Systems
Sources: How Complex Systems Fail

Catastrophe in a complex system has no single root cause. It arises when multiple contributing conditions combine, none of which is sufficient on its own. Naming one "root cause" is a choice driven by the need for closure, not by the structure of the failure.

Why It Matters

Stopping at a root cause, usually "human error," prevents learning, because it ignores the many other conditions that had to align. Durable fixes address the combination and the structure, not a scapegoat.

Signals

  • Incident reviews that end at "operator error" or one bad commit.
  • A tidy single cause for a messy event.
  • The same class of incident recurring after the "root cause" was fixed.

Benefits

Deeper learning, fixes that target the system rather than a person, and fewer recurrences.

Risks

The comfort of a single cause stops investigation early; blaming the sharp-end operator hides the blunt-end conditions; and "five whys" gets pursued as if a single causal chain exists.

Tensions

Organizations need actionable conclusions and accountability, which pull toward a single cause, while honest analysis resists one.

Examples

An outage is blamed on the engineer who ran a command, while the review ignores the missing safeguard, the misleading interface, and the schedule pressure that all enabled it.
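
The example above can be sketched as a toy model. The snippet below assumes, purely for illustration, that the outage occurs only when all four conditions are present at once; real incidents are rarely a clean logical AND, and the condition names are hypothetical labels taken from the example. Under that assumption, every condition is necessary and none is sufficient, so picking any one of them as "the" root cause is arbitrary.

```python
# Hypothetical contributing conditions, named after the example above.
CONDITIONS = frozenset({
    "engineer_ran_command",
    "missing_safeguard",
    "misleading_interface",
    "schedule_pressure",
})


def outage_occurs(present):
    """Assumed toy model: the outage needs every condition at once."""
    return CONDITIONS <= set(present)


def sufficient_alone(condition):
    """Would this condition cause the outage with nothing else present?"""
    return outage_occurs({condition})


def necessary(condition):
    """Would removing only this condition have prevented the outage?"""
    return not outage_occurs(CONDITIONS - {condition})


if __name__ == "__main__":
    for c in sorted(CONDITIONS):
        print(f"{c}: sufficient_alone={sufficient_alone(c)}, necessary={necessary(c)}")
```

In this model each condition has the same status: removing any one of them prevents the outage, and none produces it alone. The engineer's command is singled out only because it came last, not because the structure of the failure marks it as special.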