AI's Bitter Lesson: Reinvention or Repetition?
Exploring AI's evolution from Harpy to LLMs, Sutton's 'bitter lesson,' and the role of reinforcement learning.
Written by AI · Mike Sullivan
January 31, 2026

Photo: Welch Labs / YouTube
In 1971, while Led Zeppelin was busy redefining rock and roll, the U.S. government was quietly setting the stage for another kind of revolution: speech recognition. Out of that funding push came Harpy, a Carnegie Mellon system that could recognize a whopping 1,011 words with 95% accuracy, thanks to a giant knowledge graph packed with hand-crafted linguistic rules. It was the pinnacle of human ingenuity, right up until it wasn't.
Fast forward a decade, and Harpy's meticulously crafted knowledge graph was shelved like an old vinyl record, replaced by hidden Markov models. These newfangled models could learn from data, rather than relying on human-crafted grammar, and scaled much more efficiently. This shift was part of a broader lesson, one that AI pioneer Richard Sutton dubbed "the bitter lesson." According to Sutton, "general methods that leverage computation are ultimately the most effective, and by a large margin."
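To see why the statisticians won, it helps to look at what an HMM actually does: it scores how plausible a sequence of sounds is under probabilities learned from recordings, not rules written by linguists. Here's a minimal sketch of the forward algorithm at the heart of classic HMM recognizers; the three "phoneme" states and every number in it are invented for illustration, not taken from any real system.

```python
import numpy as np

# Toy HMM for a single word, with three hidden "phoneme" states.
# All probabilities here are made up; real recognizers estimate
# them from transcribed audio (e.g., with the Baum-Welch algorithm).
start = np.array([1.0, 0.0, 0.0])        # always begin in state 0
trans = np.array([[0.6, 0.4, 0.0],       # P(next state | current state)
                  [0.0, 0.7, 0.3],
                  [0.0, 0.0, 1.0]])
emit = np.array([[0.8, 0.2],             # P(acoustic symbol | state)
                 [0.3, 0.7],
                 [0.5, 0.5]])

def forward_likelihood(obs):
    """Probability of an observed symbol sequence under this word model."""
    alpha = start * emit[:, obs[0]]
    for symbol in obs[1:]:
        alpha = (alpha @ trans) * emit[:, symbol]
    return alpha.sum()

# A recognizer keeps one model per word and picks whichever
# scores the incoming audio highest.
print(forward_likelihood([0, 1, 1]))
print(forward_likelihood([1, 0, 0]))
```

The punchline isn't the arithmetic. It's that every entry in trans and emit can be re-estimated from more data, so more recordings and more compute buy a better model automatically, which is exactly the scaling property Harpy's hand-built graph lacked.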
But wait, there's more. In 2019, Sutton's essay dropped like a surprise album, just as OpenAI introduced GPT-2, sparking a new wave of enthusiasm for large language models (LLMs). These models, trained with massive computational power, seemed to align perfectly with Sutton's bitter lesson—or did they?
Sutton later clarified that LLMs might actually be a negative example of his principle. Despite their computational prowess, they heavily lean on human-generated text, akin to Harpy's dependency on human knowledge. "We want AI agents that can discover like we can, not which contain what we have discovered," Sutton argues.
This brings us to the role of reinforcement learning (RL), a method that allows AI to learn from experience, much like how we learn not to touch a hot stove. Google's DeepMind demonstrated RL's potential with AlphaGo, a Go-playing AI that didn't just mimic human strategies but invented its own, playing like an "alien from an alternate dimension."
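If "learning from experience" sounds abstract, here is roughly what it looks like in code: a minimal sketch of tabular Q-learning, the textbook trial-and-error update that underpins much of modern RL. The two-state "hot stove" world, the rewards, and the hyperparameters are all invented for illustration; AlphaGo layered deep networks and self-play on top of ideas like this at enormous scale.

```python
import random

# An invented toy world: state 0 = near the stove, state 1 = standing back.
# Actions: 0 = touch the stove, 1 = step away.
def step(state, action):
    if action == 0:
        return 0, -10.0   # touching the hot stove hurts, every time
    return 1, 1.0         # stepping away earns a small reward

# Q[state][action]: the agent's running estimate of long-term value.
Q = [[0.0, 0.0], [0.0, 0.0]]
alpha, gamma, epsilon = 0.1, 0.9, 0.1  # learning rate, discount, exploration

state = 0
for _ in range(5000):
    # Epsilon-greedy: mostly act on current estimates, sometimes explore.
    if random.random() < epsilon:
        action = random.randrange(2)
    else:
        action = 0 if Q[state][0] > Q[state][1] else 1
    next_state, reward = step(state, action)
    # The Q-learning update: nudge the estimate toward the observed
    # reward plus the discounted value of the best next action.
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

print(Q)  # "touch stove" ends up valued far below "step away"
```

Notice that nothing in the loop consults a human: the agent's entire "knowledge" is whatever the reward signal carved into that table. That, in miniature, is the discovery-over-imitation distinction Sutton keeps pressing.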
Despite its success in controlled environments like games, RL's application in the messier real world remains a question mark. In their 2025 essay "Welcome to the Era of Experience," Sutton and David Silver argue that we're on the brink of a new era in which AI learns from real-world interactions rather than human input. But let's be honest: until an AI can navigate the DMV without a meltdown, the jury's still out.
So where does this leave us? It seems we're caught in the age-old tech cycle of reinvention and repetition. Will LLMs become another Harpy, limited by their reliance on human knowledge? Or will they evolve, leveraging RL to break free from their current constraints?
In the words of The Who: "Meet the new boss, same as the old boss." Or maybe not. Only time, and perhaps a little bit of machine learning, will tell.
— Mike Sullivan
Watch the Original Video
Can humans make AI any better?
Welch Labs · 23m 39s

About This Source
Welch Labs is a YouTube channel boasting over 832,000 subscribers, dedicated to explaining the complexities of artificial intelligence (AI) since 2024. By focusing on topics such as AI speech recognition, reinforcement learning, and large language models, the channel caters to an audience eager to explore the depths of AI development and its implications.