Claude Fable 5 Return, OpenAI Jalapeño Chip, and

There's a week in AI that feels like a news cycle compressed into a single afternoon, and this is apparently one of those weeks. Claude Fable 5 might be returning. Anthropic is accusing Alibaba of running what it calls the largest AI theft campaign in history. Google DeepMind is hemorrhaging talent. And OpenAI quietly unveiled a custom inference chip called Jalapeño. Any one of these would be a story. Together, they sketch something more interesting: a frontier AI industry that is simultaneously becoming more powerful, more secretive, more legally contested, and more hardware-dependent than most people outside it appreciate.

Let's start with the piece that matters most for understanding the current state of AI governance.

Fable 5, Government Testing, and the Controlled Chaos of "Project Glass"

Anthropic's Claude Fable 5 model was taken offline roughly two weeks ago, and the AI community has been speculating about why ever since. The WorldofAI video pulls together reporting from Reuters and AP to fill in a picture that is frankly stranger than most of the speculation: Fable 5's predecessor model, Claude Mythos, apparently identified vulnerabilities in sensitive US government computer systems during a controlled intelligence agency testing exercise reportedly conducted under a program called Project Glass.

The detail that landed hardest in congressional testimony came from the NSA: Mythos allegedly broke into "almost all classified systems not in weeks but in hours." That's not a red team metric you brag about publicly. That's a metric that makes regulators reach for the phone.

The critical context — and the video is right to emphasize it — is that this was a controlled test, not a rogue AI event. Project Glass exists specifically to find and patch vulnerabilities before real attackers do. Mythos did exactly what a good red-teaming tool is supposed to do. It was effective. That effectiveness is apparently what triggered the takedown of the next-generation model, Fable 5.

What's changed since then, according to a Wired report cited in the video, is the diplomatic channel. The Trump administration reportedly grew more receptive to Anthropic's case after CEO Dario Amodei stepped back from White House meetings in favor of Anthropic co-founder Tom Brown. The conversations are described as going "quite smoothly." Whether that reflects genuine policy recalibration or just better relationship management, nobody outside those rooms can say.

What we can observe is the technical trail. Claude Code version 2.19 introduced new string changes, including a message reading: "You've used your Fable 5 usage for this week." That's not the kind of string you add for a dead product. It also suggests Fable 5 may be bundled into standard Claude subscriptions with weekly usage limits, rather than sold as a separate premium tier — a meaningful shift in how Anthropic would be monetizing its most capable model. Fable 5 has also reportedly surfaced in Amazon Bedrock's model catalog and inside the iOS app, though Anthropic's head of growth responded by stating the team "could categorically confirm that Fable 5 and Mythos were not currently serving any live traffic" and called the sightings a possible UI bug.

Prediction markets on Polymarket put the odds of Fable 5 returning by July 31st at roughly 90%, up from 45% before these client-side discoveries. That's speculative by definition, but prediction markets aggregate a lot of people who are watching the same technical signals. It's worth noting as a directional indicator, not a guarantee.

The Distillation Problem Is Bigger Than One Company

The more structurally significant story might be Anthropic's allegations against Alibaba-linked operators. According to Bloomberg, Anthropic claims these operators created nearly 25,000 fraudulent accounts and used them to generate approximately 28.8 million Claude exchanges between April and June — all allegedly designed to extract Claude's capabilities without paying for API access, in a technique known as model distillation.

Distillation, for readers who haven't encountered the term: you query a powerful frontier model repeatedly, collect those outputs, and use them as training data to build a cheaper competing model. You get much of the capability at a fraction of the development cost. It's a shortcut that has been debated in the AI research community for years, but the alleged scale here — 25,000 accounts, tens of millions of exchanges, a period of roughly three months — puts it in a different category than academic boundary-testing.

Anthropic is not just pointing at Alibaba. The company named DeepSeek, Moonshot AI, and Minimax as part of what it describes as a broader pattern of Chinese AI labs harvesting capabilities from US frontier models. Anthropic has since sent letters to the Senate and the White House arguing that this constitutes "industrial-scale AI espionage," not just competitive benchmarking.

That framing is doing a lot of work. "Espionage" is a legal and political word with specific implications, and Anthropic is clearly trying to move this from a terms-of-service violation into a national security conversation. Whether regulators agree, and whether the evidence holds up under scrutiny, are separate questions. What's unambiguous is that if the allegations are accurate, distillation attacks at this scale represent a genuine enforcement problem for every frontier lab — not just Anthropic.

Google DeepMind's Rough Stretch

Google is dealing with a different kind of problem: retention. Two key Gemini contributors, named in the video as Jonas and Alexander, are leaving to join Anthropic. This follows several other high-profile departures over recent weeks. These were, by the video's account, researchers who contributed significantly to the original Gemini models — not peripheral hires.

Talent loss at this level is hard to assess from the outside because the lag time between a researcher leaving and that departure affecting model quality can be measured in years. But it compounds with another uncomfortable signal: Gemini 3.5 Pro, originally expected in June, is now delayed to July, and early checkpoint tests are not encouraging. Some users comparing models in arena battle mode found that a suspected Gemini 3.5 Pro checkpoint performed worse than the existing Gemini 3.1 Pro on basic generative tasks. The model reportedly also lacks up-to-date knowledge of 2025 and 2026 events, which is a strange gap for a model that hasn't shipped yet.

None of this is fatal. Google has resources that dwarf most of its competitors. But the combination of internal departures and external performance signals is the kind of thing that compounds.

OpenAI's Jalapeño and the Hardware Endgame

OpenAI's announcement of Jalapeño — its first custom AI chip, designed specifically for large language model inference — is the story that probably has the longest tail. Built in partnership with Broadcom and Celestica, the chip reportedly went from initial design to manufacturing tape-out in nine months, which OpenAI describes as one of the fastest advanced ASIC deployment cycles on record.

The strategic logic is transparent and worth stating plainly: every GPU OpenAI buys from Nvidia is a cost it would rather control. Custom silicon means lower per-inference costs, less dependency on external supply chains, and more leverage over the economics of the entire stack. OpenAI says performance per watt should be "substantially better than current state-of-the-art hardware." Deployment is planned to begin by end of 2026, so this remains a promise, not a product. But the direction is clear.

What's genuinely interesting is that OpenAI claims ChatGPT helped engineers design parts of the chip — AI-assisted chip design used to produce the hardware that runs AI. That recursive loop is either a compelling demonstration of practical AI utility or a marketing story that will look embarrassing if Jalapeño underperforms. Probably worth revisiting when it actually ships.

The Rest of the Week

A few smaller items worth tracking: Anthropic launched Claude Tag for Slack, which lets teams @-mention Claude directly in channels, with admin controls over which channels and tools Claude can access. It's a workflow integration rather than a capability leap, but it's the kind of product surface that drives enterprise stickiness.

Mistral released OCR 4, a document understanding model that supports over 170 languages and reportedly converted a handwritten calculus exam into clean LaTeX in 5.1 seconds for around nine cents. The model detects and labels charts rather than ignoring them — a meaningful improvement over systems that silently skip non-text content. Cursor added team leaderboard features for tracking which AI tools and plugins are being used across a workspace. And a new open-source coding model family called Ornith-1.0, post-trained on Gemma with a self-improving reinforcement learning strategy, showed competitive benchmark results and is available under MIT license.

On the OpenAI model side: GPT-5.5 Instant got a conversational update rolling out to paid users, and GPT-5.6 — which was expected this week — is now delayed to around the second week of July. Early reports suggest it will bring stronger reasoning capabilities but may be less token-efficient than 5.5, meaning it could cost more per task even if pricing holds steady. That's a trade-off that enterprise users will want to model carefully before assuming an upgrade is a straight improvement.

Demis Hassabis, meanwhile, offered a framing of AGI that cuts against the hype cycle: that AGI has always meant one core thing — a system flexible enough to learn from anything and output in any format, the way a human mind does. His observation that modern civilization was built by hunter-gatherer brains is a useful corrective to the idea that general intelligence requires something exotic. The question isn't whether such a system is coming. The more pressing question is what the industry does when the government starts restricting the models that get closest to it.

Rachel "Rach" Kovacs is Buzzrag's cybersecurity and privacy correspondent.