AI Progress Is Accelerating Faster Than Anyone Expected
New data shows AI capabilities doubling every four months, not seven. Industry leaders say coding is 'solved.' What does this mean for the rest of us?
Written by AI. Mike Sullivan
February 24, 2026

Photo: Wes Roth / YouTube
There's a chart making the rounds that's giving people vertigo. It comes from METR, a nonprofit that measures AI progress by having models complete the same tasks human experts tackle. The latest data point—from Claude Opus 4.6—shows something that even seasoned AI watchers find unsettling.
The vertical axis measures hours of human expert labor. Not how long the AI takes to complete a task, but how much human work it replaces. When Claude 4.5 launched, it could handle tasks that took a human expert about five hours. That was enough to make people nervous. Then 4.6 landed at 14.5 hours—nearly two full workdays.
Here's what makes this worth paying attention to: the acceleration itself is accelerating. For a while, AI capabilities were doubling every seven months, which would have been significant enough. But looking at data from 2023 onward, that doubling now happens roughly every 123 days. Every four months.
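To make the trend concrete, here's a minimal sketch of what "doubling every 123 days" implies if it simply continues. The 14.5-hour starting point and the 123-day doubling period are from the article; the projection horizons are illustrative arithmetic, not METR's own forecasts.

```python
DOUBLING_DAYS = 123         # post-2023 doubling period cited above
START_HORIZON_HOURS = 14.5  # Claude Opus 4.6's reported task horizon

def horizon_after(days: float, start: float = START_HORIZON_HOURS) -> float:
    """Task horizon (hours of human expert labor replaced) after `days`
    days, assuming the doubling trend holds unchanged."""
    return start * 2 ** (days / DOUBLING_DAYS)

for days in (123, 246, 365):
    print(f"after {days:3d} days: ~{horizon_after(days):6.1f} hours")
```

One doubling period takes 14.5 hours to 29; a year of the same trend lands north of 100 hours, which is the kind of compounding that makes the chart unsettling.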
What the People Building This Stuff Are Saying
Sam Altman, speaking just days ago: "The world is not prepared. We're going to have extremely capable models soon. It's going to be a faster takeoff than I originally thought."
Dario Amodei of Anthropic says 100% of software engineering tasks at his company are now done by AI models. Not 80%. Not mostly. All of it. And he notes that progress is outpacing even his own predictions, the same predictions people once mocked as too aggressive.
The creator of Claude Code says coding is "solved." Altman echoes this: "The way I learned to write software is now effectively completely irrelevant."
I've been covering tech long enough to recognize the pattern when everyone suddenly sounds a little shellshocked. These aren't marketing people hyping products. These are the people closest to the technology, and they're recalibrating their expectations in real-time.
What This Actually Means
Wes Roth, who created the video analyzing this data, describes his experience with Claude Opus 4.6. He gave it months of neglected accounting work while he played video games. In 30-40 minutes, the model completed the task and then—here's the part that matters—built a system to automate that same work going forward. It created a database. It established processes. The tedious reconciliation he'd been avoiding wasn't just done; it was solved permanently.
That's the piece that pure benchmarks miss. These aren't one-off completions. They're systems that keep running. Roth's news aggregator site runs 24/7 now, ranking stories, pulling feeds, maintaining itself. He set it up overnight, while he slept. "It did it while I was sleeping. I woke up. I was hoping it would work through the night. It didn't. It just completed everything in 4 hours."
The Printing Press Analogy
Roth makes a comparison to scribes and the printing press. Before printing, you either had access to someone who could write or you didn't. After, literacy became universal. That didn't mean everyone became a great writer—there's still a distribution of talent—but the baseline shifted completely.
Coding might be following that path. The skillset won't be "can you code" but "can you build good software using AI tools." Just like no one calls themselves a scribe anymore, maybe in the future no one will call themselves a coder. They'll be builders. How they achieve that building—whether through direct coding or directing AI agents—becomes less definitional.
That's the optimistic read. The other read is that we're watching a category of specialized knowledge become democratized faster than the labor market can adjust.
The Measurement Problem
Fair criticism: those data points on the METR chart aren't actually points. They're confidence intervals. Claude 4.6's 14.5 hours could range from 6 to 98 hours depending on the specific tasks. That's a massive spread. Even at the low end—six hours—that's transformative. At the high end, you're talking weeks of human labor.
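It's worth putting that spread on the trend's own terms. On a log scale, a 6-to-98-hour interval spans about four doublings, which at the 123-day doubling period works out to well over a year of progress. The endpoints and doubling period are from the article; the "months of trend" framing is my own back-of-envelope conversion.

```python
import math

LOW_HOURS, HIGH_HOURS = 6.0, 98.0  # Claude 4.6 CI endpoints cited above
DOUBLING_DAYS = 123                # doubling period cited above

# Width of the interval measured in doublings (log scale).
doublings = math.log2(HIGH_HOURS / LOW_HOURS)

# The same width expressed as time along the trend line.
days_of_trend = doublings * DOUBLING_DAYS

print(f"CI spans {doublings:.1f} doublings, ~{days_of_trend / 30:.0f} months of trend")
```

In other words, the uncertainty on a single data point is roughly as wide as a year and a half of the curve it sits on, which is exactly why METR reports intervals rather than points.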
Inioluwa Deborah Raji from UC Berkeley points out that time spent doesn't necessarily correlate with difficulty. Some things that take humans a long time might be trivial for AI, and vice versa. That's true. But if a task that used to require five hours of human labor no longer does, the philosophical question of "difficulty" matters less than the economic reality of demand for that labor.
The counterargument you hear is that models might get better at coding but won't "magically" improve at everything else. (Side note: watch for the word "magically" in AI debates—it's often doing a lot of work.) But coding isn't just coding. It's a proxy for logical reasoning, planning, debugging, system design. Those capabilities tend to correlate.
What Nobody Knows
We keep hearing about bottlenecks that will slow this down. Data exhaustion. Training collapse. Chip shortages. Energy constraints. Water usage. Some of these are real constraints. But the smartest people with the most resources are working to solve them. And historically, when that happens, the bottlenecks tend to give way faster than the pessimists predict.
Sydney von Arx, a METR staffer, has the right framing: "You should absolutely not tie your life to this graph, but also I bet that this trend is going to hold."
That's about where I land. I'm not betting my mortgage on exponential curves continuing forever—I've seen too many S-curves in my career to believe in infinite exponentials. But I'm also not seeing evidence that this particular curve is about to flatten. The next few months will tell us whether Claude 4.6 was an outlier or a new baseline.
What I do know: the gap between "AI can barely do this" and "AI routinely does this" is compressing. The time between those states used to be measured in years. Now it's measured in months. If that continues, a lot of assumptions about how work gets done will need updating sooner than most people's strategic plans account for.
Mike Sullivan is Buzzrag's Technology Correspondent
Watch the Original Video
the SCARIEST chart in AI
Wes Roth
24m 44s
About This Source
Wes Roth
Wes Roth is a prominent figure in the YouTube AI community, having amassed 304,000 subscribers since launching his channel in October 2025. His channel is dedicated to unraveling the complexities of artificial intelligence with a positive outlook. Roth focuses on major AI players such as Google DeepMind and OpenAI, aiming to equip his audience for the transformative impact of AI.
More Like This
Gigabyte AI Top Atom: A Retro Future in AI?
Exploring Gigabyte's AI Top Atom, a personal supercomputer with a nod to classic tech cycles.
AI Agent Cloned Itself in 72 Hours—Here's What Happened
Wes Roth gave an AI agent his credit card and told it to replicate itself. It worked. Here's what happened when ClawdBot learned to build skills.
Open AI Models Rival Premium Giants
MiniMax and GLM challenge top AI models with cost-effective performance.
Laravel Boost: The AI Revolution We Didn't Order
Explore Laravel Boost's AI-enhanced coding. Is it innovation or just another cycle?