Rich Sutton's "The Bitter Lesson": AI's Most Influential 1,500 Words

TL;DR: The biggest lesson from 70 years of AI research is that general methods that leverage computation ultimately outperform approaches built on human domain knowledge.

Figure: An AI chess piece overshadowing a human grandmaster. Source: Benjamin Li. Caption: Computation crushes craft. Sutton's lead example is chess, where brute-force search won out over strategy designed by human experts.

The Essay That Changed AI's Direction

On March 13, 2019, Richard S. Sutton, a founder of modern reinforcement learning and co-recipient of the 2024 Turing Award, posted an essay of roughly 1,500 words on his personal website, incompleteideas.net. It had no peer review and no formal publication venue. It was simply a post on a personal site.

Yet it became one of the most cited essays in AI, with hundreds of formal citations on Google Scholar. It even has its own Wikipedia entry, "Bitter lesson."

The Core Argument in One Sentence

"The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin."

Sutton's argument is direct:

Short-term vs. long-term: building in human knowledge helps in the short term, but over the long run it is overtaken by methods that scale with computation.
Moore's Law is the driver: the cost per unit of computation has been falling exponentially, and the argument rests on that trend continuing.
Two scalable general methods: search and learning.
Don't build in "how we think we think": the world's complexity far exceeds our ability to abstract it.
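The Moore's Law premise above can be made concrete with a little arithmetic. The following is a minimal sketch, not something taken from the essay: the two-year doubling period and the idea of expressing a hand-built method's advantage as a fixed "head start" factor in compute are both illustrative assumptions.

```python
import math


def compute_multiplier(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor of compute per dollar after `years`, assuming steady doubling.

    The two-year default is an illustrative assumption, not a figure from the essay.
    """
    return 2.0 ** (years / doubling_period_years)


def years_to_match(head_start: float, doubling_period_years: float = 2.0) -> float:
    """Years until compute growth alone equals a fixed head-start factor.

    `head_start` is a hypothetical way to express the advantage of a
    knowledge-based method as an equivalent multiple of compute.
    """
    return doubling_period_years * math.log2(head_start)


if __name__ == "__main__":
    for years in (10, 30, 70):
        print(f"{years} years -> x{compute_multiplier(years):.3g} compute per dollar")
    # A hand-built method worth a 1024x compute advantage is matched in 20 years.
    print(years_to_match(1024.0))
```

Under these assumptions, any fixed advantage is eventually matched, because the head start is a constant while the compute multiplier keeps growing. The conclusion depends entirely on the exponential trend holding.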

Five Historical Cases

Sutton's essay draws on the history of chess, Go, speech recognition, and computer vision, and notes that natural language processing followed the same shift toward statistics and computation. The table below summarizes those cases and extends them with later developments such as the Transformer:

Domain	Human-knowledge approach	Computation approach	Winner
Chess	Grandmaster strategies	Brute-force search (Deep Blue)	✅ Search
Go	Human Go principles and shape patterns	Search + self-play (AlphaGo)	✅ Search + learning
Speech recognition	Phonemes, vocal tract models	HMM → deep learning	✅ Statistical methods
Computer vision	Edge detection, SIFT	Convolutional neural networks	✅ Deep learning
NLP	Rule-based grammar, knowledge graphs	Statistical methods → Transformer	✅ Large-scale data-driven methods
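To illustrate what "search" means in the table, here is a toy sketch of exhaustive game-tree search on a small subtraction game (players alternately take 1 to 3 stones, and whoever takes the last stone wins). The game and function names are chosen for illustration only, and this is not Deep Blue's algorithm, which was far more elaborate. The point is that the program is given only the rules, yet search recovers the winning strategy.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # a player may take 1, 2, or 3 stones per turn


@lru_cache(maxsize=None)
def player_to_move_wins(stones: int) -> bool:
    """Exhaustive search: True if the player to move can force a win."""
    if stones == 0:
        # The previous player took the last stone, so the player to move has lost.
        return False
    # A position is winning if some legal move leaves the opponent in a losing one.
    return any(
        take <= stones and not player_to_move_wins(stones - take)
        for take in MOVES
    )


def best_move(stones: int):
    """Return a winning move if one exists, otherwise None."""
    for take in MOVES:
        if take <= stones and not player_to_move_wins(stones - take):
            return take
    return None


if __name__ == "__main__":
    losing = [n for n in range(1, 21) if not player_to_move_wins(n)]
    print("Losing positions for the player to move:", losing)
    print("Best move with 10 stones:", best_move(10))
```

No rule such as "leave your opponent a multiple of four" is coded anywhere. That pattern emerges from the search itself, which is the contrast the table draws between encoded expertise and computation.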

Sutton puts it sharply:

"Researchers always tried to make systems that worked the way the researchers thought their own minds worked... but it proved ultimately counterproductive, and a colossal waste of researcher's time."

Why "Bitter"?

The lesson is bitter because it undercuts a belief many AI researchers cherish: that human intelligence is unique.

Sutton isn't saying human knowledge is useless. He's saying that, in the long run, approaches built on it lose to approaches that scale with computation.

"The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach."

2019–2026: The Prophecy Fulfilled

The GPT series: the strongest evidence for Sutton's argument

When Sutton wrote the essay, GPT-2 had been announced only about a month earlier. Language models were still largely academic toys.

Seven years later, in 2026, LLMs have become infrastructure. The Transformer, a general-purpose architecture with comparatively few built-in human priors, scaled up with massive computation and data, now handles:

Natural language understanding and generation
Code writing
Mathematical reasoning
Scientific hypothesis generation
Assistance with protein structure prediction

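AlphaFold: From "Impossible" to "Solved"

Protein structure prediction challenged biology for about 50 years. Rather than hand-coding folding rules, DeepMind let a model learn from data: AlphaFold 2 reached near-experimental accuracy on many targets at CASP14 in 2020, and AlphaFold 3 (2024) extended the approach to complexes with other biomolecules. It is not a pure "no priors" system, though, since its inputs and architecture draw on evolutionary and geometric knowledge about proteins.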

But It's Not That Simple

Objection 1: Are LLMs really a "pure compute" victory?

A notable 2026 post in the ICLR Blog Posts track, "The Human Knowledge Loophole in the 'Bitter Lesson' for LLMs," raises a pointed question: the training data for LLMs is itself a product of human knowledge.

"In LLMs, scaling exacerbated rather than eliminated our dependence on handcrafted knowledge."

Sutton himself acknowledged this tension on the Dwarkesh Podcast:

"Will they reach the limits of the data and be superseded by things that can get more data just from experience rather than from people?"

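Objection 2: The physical limits of compute

Training a frontier model consumes energy on a scale that has been compared to a small nation's usage. High-quality internet text is close to exhausted, and models trained on synthetic data can produce strange hallucinations.

Some researchers have begun to ask whether we have already reached the limits of the bitter lesson.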

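Objection 3: The sciences need first principles

A 2025 Medium article by Huafeng Xu, "The bitter lesson and its discontent," draws a key distinction:

AI studies intelligence itself: the brain is extremely complex and there is no unified theory of it, so Sutton's argument holds.
The physical and biological sciences: we do have first principles (quantum mechanics, thermodynamics, biochemical reactions); we just cannot compute their consequences far enough.

"In the physical and biological sciences, the limitation is not a lack of first principles; it is our ability to follow complex systems deeply and long enough."

The article proposes "trans-scale models": hybrid models that combine AI's pattern-recognition ability with first-principles physics.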

Sutton's Current Position

Interestingly, Sutton himself has not stayed at his 2019 position.

In 2025, he voiced concerns on several occasions about a pure LLM path:

LLMs may have reached a dead end: they rely too heavily on human data and lack intrinsic motivation.
True breakthroughs require "learning from experience": something like AlphaGo's self-play, not scraping text from the internet.
Continual learning is the key: current models are frozen once training ends, unlike humans, who keep learning.

In 2026, a NeurIPS submission, "From Bitter to Better Lessons in AI," went further and argued:

Human expertise should be treated as "data" rather than "rules," which honors Sutton's argument about computational scalability while preserving the value of human knowledge.
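The continual-learning point above can be illustrated with a toy comparison. This is only a sketch under stated assumptions, not a method Sutton proposes: the "model" is a single number predicting the next value of a stream, the stream shifts halfway through, and the learning rate of 0.2 is arbitrary.

```python
def frozen_vs_online(stream, train_size, lr=0.2):
    """Compare a model frozen after training with one that keeps updating.

    Both start from the mean of the first `train_size` values. The frozen
    model never changes; the online model nudges its estimate toward each
    new observation. Returns the mean absolute error of each on the rest.
    """
    frozen = sum(stream[:train_size]) / train_size
    online = frozen
    frozen_err = 0.0
    online_err = 0.0
    for y in stream[train_size:]:
        frozen_err += abs(y - frozen)
        online_err += abs(y - online)
        online += lr * (y - online)  # keep learning from experience
    n = len(stream) - train_size
    return frozen_err / n, online_err / n


if __name__ == "__main__":
    # Hypothetical stream: the world changes right after "training" ends.
    stream = [1.0] * 50 + [5.0] * 50
    frozen_mae, online_mae = frozen_vs_online(stream, train_size=50)
    print(f"frozen MAE: {frozen_mae:.2f}, online MAE: {online_mae:.2f}")
```

The frozen model stays wrong by the full size of the shift, while the online model's error shrinks with each new observation. Real continual learning is much harder than this, for example because updates can overwrite what was learned earlier.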

Lessons for Materials R&D

Sutton's argument applies beyond AI. It also offers useful lessons for materials science and chemical R&D:

1. Don't try to "put the chemist into equations"

Traditional formulation optimization relies on expert experience and intuition to narrow the search space. Sutton's lesson suggests that in high-dimensional spaces, human intuition can be misleading.

Application: use high-throughput experimentation (HTE) plus machine learning for formulation screening, rather than an expert's "I think this direction is right." In this view, data volume matters more than intuition.
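The HTE-plus-ML screening loop described above might look roughly like the following. This is a minimal sketch under heavy assumptions: `measure` is a hypothetical stand-in for a lab measurement, the four-dimensional formulation space and its target are invented, and a k-nearest-neighbor average is used as the simplest possible surrogate model rather than anything a production platform would choose.

```python
import random


def measure(x):
    """Hypothetical stand-in for one lab measurement (higher is better)."""
    target = [0.2, 0.7, 0.5, 0.9]  # invented optimum, unknown to the search
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def knn_predict(x, data, k=3):
    """Surrogate model: mean score of the k nearest measured formulations."""
    def sq_dist(record):
        return sum((a - b) ** 2 for a, b in zip(x, record[0]))
    nearest = sorted(data, key=sq_dist)[:k]
    return sum(score for _, score in nearest) / len(nearest)


def screen(rounds=5, batch=8, pool=200, dim=4, seed=0):
    """Measure a random batch, then repeatedly measure the candidates the
    surrogate ranks highest. Returns the best (formulation, score) found."""
    rng = random.Random(seed)

    def sample():
        return [rng.random() for _ in range(dim)]

    data = [(x, measure(x)) for x in (sample() for _ in range(batch))]
    for _ in range(rounds):
        candidates = [sample() for _ in range(pool)]
        candidates.sort(key=lambda x: knn_predict(x, data), reverse=True)
        data.extend((x, measure(x)) for x in candidates[:batch])
    return max(data, key=lambda record: record[1])


if __name__ == "__main__":
    best_x, best_score = screen()
    print([round(v, 2) for v in best_x], round(best_score, 4))
```

The loop never encodes a guess about which ingredient matters. It spends cheap model predictions on a large candidate pool and reserves the expensive measurements for the candidates that look most promising.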

2. Computation is a long-term competitive advantage

If materials R&D investment only optimizes at the level of human experience, the ceiling is low. A computation-driven screening platform, by contrast, gets stronger as compute costs fall.

3. But don't forget first principles

Sutton's argument holds for the search for intelligence. Materials science, however, is a problem of physics and chemistry, where we do have first principles: quantum mechanics, thermodynamics, and polymer chemistry. The best path is a hybrid one: computation-driven search guided by physical constraints.
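One simple reading of "computation-driven search guided by physical constraints" is to let physics rule out candidates before any scoring happens. The sketch below assumes a three-component formulation: mass balance (fractions sum to 1) is built into the sampler, while the 0.6 cap on the last component and the `score` function are hypothetical placeholders for a real feasibility rule and a real property model.

```python
import random


def sample_composition(rng, n_components=3):
    """Draw non-negative mass fractions that sum to 1 (mass balance)."""
    raw = [rng.random() for _ in range(n_components)]
    total = sum(raw)
    return [r / total for r in raw]


def is_feasible(comp, max_last=0.6):
    """Hypothetical physical rule: the last component (say, a solvent) is capped."""
    return comp[-1] <= max_last


def score(comp):
    """Hypothetical property model standing in for a simulation or experiment."""
    a, b, c = comp
    return 2.0 * a * b + 1.5 * b * c + 3.0 * a * c


def constrained_search(n_samples=2000, seed=0):
    """Random search that skips candidates the physical rule excludes."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_samples):
        comp = sample_composition(rng)
        if not is_feasible(comp):
            continue  # rejected by the constraint before any scoring is spent
        s = score(comp)
        if best is None or s > best[1]:
            best = (comp, s)
    return best


if __name__ == "__main__":
    comp, s = constrained_search()
    print([round(v, 3) for v in comp], round(s, 4))
```

The search stays general, and the physics enters only as constraints on where it may look. That is one way to keep the scalability of search while respecting what the first principles already rule out.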

Quote Collection

Source	Quote
Sutton (2019)	"Building in how we think we think does not work in the long run."
DeepMind (2022)	"Generic models better at leveraging computation tend to overtake specialized approaches, eventually."
Sutton (Dwarkesh)	"Will they be superseded by things that can get more data just from experience rather than from people?"
Huafeng Xu (2025)	"Our cleverness, when codified, can become its own bottleneck."
ICLR Blog (2026)	"Scaling exacerbated rather than eliminated our dependence on handcrafted knowledge."

In Closing: Bitter or Sweet?

The bitter lesson is bitter because it asks us to give up a deep psychological need: the belief that our own intelligence is special.

But consider it from another angle: what if we were not "designed" but "discovered"?

Search and learning, the two general methods that Sutton says seem to scale arbitrarily with computation, are also core mechanisms of human intelligence. We grow through exploration (search) and experience (learning), not through being programmed in advance.

Perhaps the bitterness isn't that computation beat humanity. It's that we have long refused to admit that we ourselves are products of computation and learning.

Published: 2026-05-29 | Author: Benjamin Li

References:

Sutton, R. (2019). "The Bitter Lesson" — http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Wikipedia. "Bitter lesson" — https://en.wikipedia.org/wiki/Bitter_lesson
nat.io (2025). "The Bitter Lesson in AI: Computation vs. Human Design in 2025"
Huafeng Xu (2025). "The bitter lesson and its discontent" — Medium
ICLR Blog Posts (2026). "The human knowledge loophole in the 'bitter lesson' for LLMs"
NeurIPS 2025 submission. "From Bitter to Better Lessons in AI: Embracing Human Expertise as Data"
NextBigFuture (2025). "AI Legend Sutton Gives His Suggestions for True Continual Learning"