Learning the Bitter Lesson in 2026

The Bitter Lesson, outlined by Richard Sutton in his 2019 essay, argues that progress in artificial intelligence (AI) comes from the relentless scaling of computation rather than from reliance on specialized human expertise. Sutton’s thesis challenges the prevailing notion that advances in intelligence are driven by embedding human knowledge into AI systems. Instead, he posits that breakthroughs stem from scaling models, making them larger, and training them on extensive datasets with increased computational power.

Drawing on examples from AI history, Sutton highlights the superiority of methods that scale with computation over those built on human expertise. In game-playing AI, for instance, AlphaZero mastered chess and Go through self-play, with no human knowledge beyond the rules of each game, underscoring the power of scale and computation. Similarly, advances in natural language processing (NLP) and computer vision have shown the efficacy of generic architectures trained on vast datasets with significant compute.

The economic implications of Sutton’s Bitter Lesson are significant, especially amidst the ongoing AI revolution. The unprecedented mobilization of financial resources into AI research and development has accelerated the pace of advancement. The exponential growth in private AI investments, projected to exceed $100 billion annually, underscores the industry’s commitment to scaling AI capabilities.

However, the futurist predictions surrounding AI’s impact on society range from optimistic visions of a world saved by AI to doomsday scenarios of existential threats. While AI capabilities have been rapidly evolving, there is no guarantee that this trajectory will continue indefinitely. Some observers have noted potential limitations in AI advancement, with concerns about hallucinations persisting in advanced models and the possibility of reaching a plateau in AI capabilities.

Recent economic research, exemplified by economist Joshua Gans’ concept of “artificial jagged intelligence,” offers a nuanced perspective on AI’s performance. Gans’ model suggests that while increasing scale improves average performance in AI systems, it also introduces jaggedness, leading to uneven performance across tasks. This unevenness poses challenges for users who rely on AI for specific tasks, as high global quality signals may not always translate to local reliability.
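The jaggedness idea can be made concrete with a toy simulation. The sketch below is a hypothetical illustration, not a reproduction of Gans’ actual model: each task has its own fixed difficulty, scale lifts the average success rate, but the spread across tasks persists, so the worst tasks stay well below the global average.

```python
import random

TASKS = 20

def task_accuracies(scale):
    # Each task has an idiosyncratic difficulty; scaling lifts the mean,
    # but the spread across tasks ("jaggedness") remains.
    random.seed(42)  # fix the same task difficulties at every scale
    difficulties = [random.uniform(0.0, 0.5) for _ in range(TASKS)]
    base = 0.4 + 0.1 * scale  # average quality grows with scale (assumed)
    return [min(1.0, max(0.0, base - d)) for d in difficulties]

for scale in (1, 3, 5):
    accs = task_accuracies(scale)
    mean = sum(accs) / len(accs)
    print(f"scale={scale}: mean accuracy={mean:.2f}, worst task={min(accs):.2f}")
```

Running this shows the mean climbing with scale while the hardest tasks lag far behind it: a high global quality signal coexisting with unreliable local performance.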

As AI adoption continues to reshape industries, understanding the implications of scaling is crucial for navigating the economic landscape. The Bitter Lesson serves as a guiding principle, emphasizing intellectual humility and the recognition that progress in AI stems from scaling computation rather than hardcoded expertise.

In a recent paper, Gans explores how errors in AI systems are amplified by the “inspection paradox”: users tend to encounter errors exactly where they need assistance the most. This insight points to a structural limitation that persists even along the Bitter Lesson path. Scaling AI models can improve average performance, but it does not eliminate the unpredictability that comes with using these systems.
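The inspection-paradox effect can also be sketched with a small simulation (a hypothetical illustration with assumed numbers, not Gans’ model): the model fails more often on hard queries, and users lean on the model precisely when a query is hard for them too, so the error rate users experience exceeds the model’s overall error rate.

```python
import random

random.seed(1)

N = 100_000
overall_errors = 0
observed_errors = 0
observed_uses = 0

for _ in range(N):
    hard = random.random() < 0.3          # assume 30% of queries are hard
    err_rate = 0.30 if hard else 0.05     # model fails more on hard queries
    error = random.random() < err_rate
    overall_errors += error
    # Users consult the AI mostly when the task is hard for them as well:
    use_prob = 0.9 if hard else 0.2
    if random.random() < use_prob:
        observed_uses += 1
        observed_errors += error

print(f"model's overall error rate:  {overall_errors / N:.3f}")
print(f"error rate users experience: {observed_errors / observed_uses:.3f}")
```

With these assumed rates the overall error rate is about 12.5%, but because usage is concentrated on hard queries, the experienced error rate is noticeably higher: benchmark averages understate what users actually encounter.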

Gans’ research suggests that businesses cannot rely on benchmark performance alone when adopting AI. They must also invest in human oversight and domain-specific testing to manage the persistent unpredictability of scaled models. It likewise suggests that AI is unlikely to replace human jobs wholesale, since human insight and expertise remain crucial for getting reliable value from AI systems.

While Sutton’s idea of scaling AI models is valuable, it is important to recognize that scaling alone is not sufficient for achieving superintelligence. Models require human insight and structure to be truly effective for businesses. Techniques like Reinforcement Learning from Human Feedback (RLHF), where human evaluators provide feedback to AI systems, help inject human values into models and improve their performance.

Furthermore, there are practical constraints to consider, such as energy costs and data limits, which prevent unlimited scaling of AI models. To enhance AI capabilities, efficiency and algorithmic cleverness are needed in addition to brute force scaling. Human expertise remains essential in shaping and steering scaled learning systems, shifting from encoding intelligence directly to guiding the development of AI models.

In conclusion, while scaling AI models can improve performance, human ingenuity and expertise are indispensable for progress in the field. Gans’ work highlights the economic implications of AI adoption, emphasizing the need for institutions and human expertise to manage the unpredictable nature of AI systems. The bitter lesson may be that scaling alone is powerful, but human ingenuity is the key ingredient for driving advancements in AI technology.

Overall, the integration of human insight and expertise with scaled AI models is essential for achieving optimal performance and maximizing the benefits of AI technology in various industries.
