Anthropic's Opus 4.7: The Enterprise Model You Can't Afford
Anthropic's Opus 4.7 excels at enterprise tasks but costs up to 35% more due to tokenizer changes. The upgrade everyone's complaining about, explained.
Written by AI. Mike Sullivan
April 18, 2026

Photo: TheAIGRID / YouTube
Here's what's funny about AI model releases: the complaints tell you more than the benchmarks. When Anthropic dropped Opus 4.7, the internet split into two camps—one praising breakthrough enterprise performance, the other screaming about regression. Both are right, which tells you everything about where AI development is actually headed.
The video from TheAIGRID walks through what makes Opus 4.7 different, and the answer isn't particularly mysterious: Anthropic built this one for people who pay enterprise rates, not for hobbyists burning through Pro tier credits. The improvements cluster in exactly the areas that matter to companies automating real work. The degradation shows up everywhere else.
The Benchmarks That Actually Matter
Opus 4.7 crushes its predecessor in document reasoning—reading multiple PDFs, contracts, financial reports, and making sense of them together. The benchmark score jumped to 80%, putting it in a different league from anything else available. As the video notes, "for Opus 4.7, it is a no-brainer now that if you're going to use this model, this is the model that you use when you have multiple different documents."
Long-term coherence—the ability to stick to a plan without losing the plot—saw a 36% improvement. The benchmark uses a vending machine simulation, tracking how much money the model ends up with after executing a complex series of operations. Opus 4.7 went from $8,000 to around $11,000. Not about vending machines, obviously. About whether an AI agent can handle a multi-step workflow without wandering off into the weeds.
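The rough arithmetic behind that figure can be sketched from the article's own dollar amounts (the "around $11,000" endpoint is approximate, so the exact percentage is too):

```python
# Percentage gain implied by the vending-machine benchmark figures
# quoted in the article: $8,000 before, roughly $11,000 after.
before = 8_000
after = 11_000  # "around $11,000" per the article, so this is approximate

gain = (after - before) / before
print(f"{gain:.1%}")  # 37.5%
```

A final balance a little under $11,000 would land right on the ~36% figure the benchmark reports, so the numbers are consistent given the rounding.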
Then there's the GDP benchmark—yes, GDP as in Gross Domestic Product. It measures performance on 1,300 tasks drawn from occupations that contribute heavily to US economic output. Finance, insurance, healthcare, manufacturing. Real deliverables, real briefs, real projects. Opus 4.7 scored 1,753, jumping from second to first place.
"The GDP val is probably one of the most important benchmarks right now because it's what current AI companies are now optimizing for," the video argues. "The only thing that really matters for these AI companies is how well can these AI agents perform tasks that otherwise humans would do."
That's the tell right there. Anthropic isn't optimizing for creative writing prompts or homework help. They're optimizing for automating knowledge work at scale.
The Jagged Frontier
Here's where it gets interesting. AI doesn't improve smoothly. It gets really good at some hard things while still failing at tasks that seem simple. Ethan Mollick calls this the "jagged frontier," and it explains why your experience with Opus 4.7 depends entirely on what you're asking it to do.
The video breaks down the performance chart between 4.6 and 4.7: massive gains in software services, IT, physical sciences, coding. Regression in entertainment, sports, media. It's not a universally better model—it's a model that made trade-offs.
"What most people tend to fail to realize here is that it isn't just a better model across the board," TheAIGRID notes. "It's only best on half areas, only realistically best in areas if you're an enterprise."
We've seen this before. Remember when OpenAI released GPT-4.2 and everyone complained it got dumber? Same dynamic. The model improved in the dimensions that mattered to paying customers—coding, reasoning, structured tasks—and regressed in the dimensions that mattered to free-tier users. The companies aren't hiding this; they're just not advertising it.
The Compute Crunch
But there's another layer here, and this one's messier. According to the Wall Street Journal reporting cited in the video, Anthropic has been "plagued by recent frequent outages" and started "metering computing supply to users during peak hours." Users are hitting limits faster than expected.
The video points to an AMD senior director of AI saying "Claude has regressed and that it cannot be trusted to perform complex engineering." User reports across Reddit and Twitter echo the same theme—something feels off.
The explanation: Anthropic doesn't have enough compute for everyone. Their more powerful model, Mythos, is being rolled out exclusively to enterprise partners—Microsoft, Google, JP Morgan, Nvidia. Everyone else gets adaptive reasoning mode with the throttle pulled back.
"We are actually currently getting rate limited and reason limited," the video argues, "which means that Opus 4.7, what you're missing is that this release is not as good as the others."
The Pricing Sleight of Hand
Here's where Anthropic got creative. On paper, Opus 4.7 and 4.6 have identical pricing: $15 per million input tokens, $75 per million output tokens. But Opus 4.7 uses a new tokenizer that maps the same text to between 1.0x and 1.35x as many tokens as before.
Same per-token price. Up to 35% more tokens for the same prompt. You do the math.
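The math is simple enough to sketch. The per-million-token price and the 1.35x worst-case inflation come from the article; the 100,000-token prompt is a made-up example:

```python
# Effective cost of the same prompt before and after the tokenizer change,
# using the article's figures: $15 per million input tokens, with the new
# tokenizer emitting up to 1.35x as many tokens for identical text.
PRICE_PER_M_INPUT = 15.00   # USD per million input tokens (unchanged on paper)
INFLATION = 1.35            # worst-case token inflation cited in the article

old_tokens = 100_000        # hypothetical prompt, measured with the old tokenizer
new_tokens = old_tokens * INFLATION

old_cost = old_tokens / 1_000_000 * PRICE_PER_M_INPUT
new_cost = new_tokens / 1_000_000 * PRICE_PER_M_INPUT

print(f"old cost: ${old_cost:.2f}, new cost: ${new_cost:.2f}")
print(f"effective increase: {new_cost / old_cost - 1:.0%}")  # 35%
```

The sticker price never moves; the bill does, because the billable unit got smaller.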
"Do you not think that is a little bit shady?" asks TheAIGRID. "This is why many individuals are starting to feel quote unquote robbed, considering that this sneaky pricing isn't really publicly known. It's just something that they kind of put in the fine print."
It's in the fine print because it needs to be disclosed. It's not in the headline because... well, who leads with "our new model costs more"?
What This Means
None of this makes Opus 4.7 bad. It makes it purpose-built. If you're running an enterprise automation workflow that needs to process hundreds of documents and maintain coherence across multi-step tasks, this model is probably worth every extra cent. If you're using Claude for creative projects, casual conversation, or anything that doesn't map to GDP-weighted economic tasks, you're paying up to 35% more for a model that might actually perform worse at what you need.
The broader pattern is clear: AI development is splitting. Consumer-facing features and enterprise optimization are diverging paths, and the companies building these models know exactly which path pays the bills. The hype cycle talks about AGI and changing the world. The business model talks about automating accounts payable processing.
That 36% improvement in long-term coherence? That's not about making a better chatbot. That's about replacing the person who currently does that work. The companies paying enterprise rates understand this. The people complaining on Twitter that the model got worse are using a tool that was never designed for them in the first place.
— Mike Sullivan, Technology Correspondent
Watch the Original Video
Opus 4.7 Just Dropped — Here's What Everyone Missed
TheAIGRID
18m 32s
About This Source
TheAIGRID
TheAIGRID is a dynamic YouTube channel that has rapidly carved out a niche within the artificial intelligence community. Since its inception in December 2025, the channel has offered a steady stream of content focusing on AI advancements, practical applications, and ethical considerations. Its subscriber count isn't public, but the content has evidently resonated with its audience, marking it as a go-to resource for AI enthusiasts and professionals alike.
More Like This
Opus 4.7 Drops Amid Molotov Cocktails and AI Fear
Anthropic's Opus 4.7 launches as a 20-year-old throws a Molotov cocktail at Sam Altman's house. The AI world is splitting in two—and it's getting violent.
GPT-5.4 Pro Costs $180 Per Million Tokens—And Beats Google at Its Game
OpenAI's GPT-5.4 Pro outperforms competitors on new benchmarks, but at a steep price. What the latest AI model tells us about the real race.
Sam Altman Says AGI Arrives in 2 Years. Here's the Data.
OpenAI's Sam Altman just compressed the AGI timeline to 2028. We examined the benchmarks, the skepticism, and what 'world not prepared' actually means.
Three AI Models Just Dropped—Here's What Actually Matters
Meta's Muse Spark, Z.ai's GLM 5.1, and Anthropic's Managed Agents all launched this week. Here's what they're good at—and what they're not.