AI & ML · Advanced · By Samson Tanimawo, PhD · Published Dec 16, 2025 · 7 min read

The Bitter Lesson Applied in 2026

Rich Sutton’s 2019 essay argued that, in AI, scale beats cleverness. Six years later, the lesson keeps being relearned. Here is what it has and hasn’t predicted, and where the next turn might surprise us.

The thesis, exactly

From Sutton’s 2019 essay: “The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective.”

The implication: clever, hand-engineered approaches that bake in human insight tend to lose, over time, to simple methods that scale with available compute. Search and learning are general; everything else has a ceiling.
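To make the generality of search concrete, here is a toy sketch (my example, not from the essay): depth-limited minimax on the game of Nim. The code encodes nothing but the rules of the game; its playing strength improves purely by spending more compute on deeper search.

```python
# Toy sketch (not from the essay): a general method with zero domain insight.
# Nim: players alternately take 1-3 stones; whoever takes the last stone wins.
# The only "knowledge" here is the rules; strength comes from search depth.

def minimax(stones, depth):
    """Value for the player to move: +1 forced win, -1 forced loss,
    0 if the search ran out of depth before resolving the position."""
    if stones == 0:
        return -1            # no stones left: the player to move has lost
    if depth == 0:
        return 0             # out of compute: outcome unknown
    best = -1
    for take in (1, 2, 3):
        if take <= stones:
            best = max(best, -minimax(stones - take, depth - 1))
    return best

def best_move(stones, depth):
    """Pick the take that maximizes our value under the given search budget."""
    moves = [t for t in (1, 2, 3) if t <= stones]
    return max(moves, key=lambda t: -minimax(stones - t, depth - 1))

# Same simple method, more compute: a shallow search cannot evaluate 13
# stones, a deeper one finds the winning move (leave a multiple of 4).
print(minimax(13, 2))     # 0 -> shallow search: unknown
print(best_move(13, 13))  # 1 -> deep search: take 1, leaving 12
```

Nothing chess-specific or Nim-specific is added as the player gets stronger; depth is just compute. That is the sense in which search, unlike a hand-tuned heuristic, has no built-in ceiling.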

The supporting history

Three canonical examples Sutton cited:

Computer chess: Deep Blue’s brute-force search beat programs built on human chess knowledge.
Computer Go: AlphaGo’s search plus self-play learning beat systems encoding expert heuristics.
Speech recognition: statistical methods, and later deep learning, displaced approaches built on linguistic knowledge of words and phonemes.

The pattern: simple methods plus more compute eventually beat clever methods plus less compute. Every time so far.

Where it still holds in 2026

The most striking recent case is language models. Researchers spent decades on hand-crafted parsers, semantic networks, and knowledge graphs; all were overtaken by next-token prediction at scale.
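To make “next-token prediction” concrete, here is a deliberately tiny sketch (my example, not from the article): a count-based bigram model that learns P(next token | previous token) from raw text. It is the same objective language models scale up, minus the neural network and the trillions of tokens.

```python
from collections import Counter, defaultdict

# Toy sketch (my example): next-token prediction reduced to its simplest form,
# a bigram model that estimates P(next | previous) by counting. No grammar,
# no parser, no knowledge graph, just the prediction objective itself.

def train_bigram(tokens):
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Greedy decoding: return the most frequent successor of `token`."""
    following = counts[token]
    return following.most_common(1)[0][0] if following else None

corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" (seen twice, vs "mat" once)
print(predict_next(model, "sat"))  # "on"
```

Everything a large model adds (context longer than one token, learned representations, gradient training) is a way of scaling this same objective, which is the point of the example.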

Even in 2026, the pattern keeps recurring.

Counter-evidence

The Bitter Lesson isn’t always immediate; there remain places where cleverness still helps.

The honest reading: scale wins on the central challenges. Cleverness still matters at the edges and on engineering economics.

The next turn

There are at least two ways the lesson could prove wrong over the next two years.

Best bet for 2026-2028: the lesson keeps holding for the central capabilities. The interesting work is at the edges, the small clever things that compound at scale.