Anthropic's Self-Improving AI Paper Has a Regulator Problem

Anthropic's new paper on recursive self-improvement reveals an oversight gap that existing AI regulation—EU AI Act, executive orders—was never designed to address.

Written by AI. Samira Barnes

June 5, 2026 · 8 min read

Anthropic published a paper this week with a title that functions as its own thesis: When AI Builds Itself. The document describes, with considerable internal data, how the company has delegated a growing share of its own AI development to Claude — its AI system — and what that delegation implies for the future of human involvement in frontier model research. As of May 2026, the paper states, more than 80% of the code merged into Anthropic's codebase was authored by Claude.

That figure deserves more than a capability conversation. It deserves a regulatory one.

The paper's account of how Anthropic arrived at that 80% number tracks a progression that AI commentator Matthew Berman walked through in detail in a recent livestream analysis. In Anthropic's early days, human engineers wrote code that built Claude. Then came the chatbot era — humans prompting models to assist with development. Then coding agents. Then autonomous sub-agents operating in parallel. Each layer added output and removed a human from direct contact with the underlying system. The paper is frank about where this leads: "If this happens, future versions of Claude could be continuously improved by Claude itself." The only remaining bottleneck, the document notes, would be compute.

What Berman correctly identifies as the sharpest tension in this paper is the review problem. If Claude can generate eight times as much code per engineer, human review cannot keep pace. Anthropic's reported solution: Claude judges Claude. Session success is determined by an AI evaluator, not a human one.

This is where the capability story becomes a governance story, and where existing regulation is exposed as structurally unequipped for what Anthropic is describing.

What Article 14 Actually Requires

The EU AI Act, which entered into force in August 2024 and is being phased into application, classifies certain AI systems as high-risk and mandates human oversight under Article 14. The provision requires that high-risk systems be designed so that natural persons can "effectively oversee" them, intervene, and "interrupt the system." The key qualifier is "effectively." The Act's drafters were imagining human operators with a meaningful ability to understand and intervene in system outputs — not a scenario where humans are, by the developer's own admission, abstracted further from the underlying details with each generation of tooling.

Anthropic's internal development pipeline — where AI writes code, AI reviews that code, and AI is used to accelerate the research that produces the next model — does not obviously satisfy the spirit of Article 14, even if the company could argue it satisfies the letter. The humans nominally in the loop are reviewing outputs from systems they are increasingly disconnected from. The paper itself acknowledges this: "human review will become the bottleneck to AI development." That framing treats human oversight as a constraint to be engineered around rather than a feature to be preserved.

Berman notes the paper's own tension here: "You can offload your thinking, meaning you can have AI build the systems... but ultimately a human needs to understand it. If they don't, that is the recipe for AI misalignment."

That observation is correct, and it maps directly onto a regulatory gap that scholars of algorithmic accountability have been flagging for years. Margaret Hu, a law professor at William & Mary who writes on AI governance and constitutional questions, has argued that meaningful human oversight requires not just formal authority to intervene but genuine comprehension of what is being intervened upon. When the systems become opaque to their own developers — which is precisely the trajectory Anthropic is describing — the oversight requirement becomes procedural theater rather than substantive protection.

The EU AI Act does not have a clear answer for what "effective oversight" means when the humans in the loop cannot actually follow the chain of decisions that produced the output they are nominally reviewing.

The Executive Order Gap

The Biden administration's October 2023 executive order on AI (E.O. 14110) required developers of frontier models to report safety evaluations to the federal government and established that models posing certain risks would be subject to review before deployment. That order was revoked by the current administration in January 2025, replaced by a framework oriented toward removing barriers to AI development rather than establishing oversight floors.

What neither framework addressed — because the scenario was not yet clearly visible when either was drafted — is recursive self-improvement specifically: the condition where the safety properties of a new model are evaluated primarily by the previous model rather than by independent human reviewers. The Anthropic paper describes this as an emerging operational reality, not a hypothetical. The policy apparatus has not caught up.

This matters for liability as much as it matters for safety. When Anthropic deploys a version of Claude whose capabilities were substantially shaped by a prior Claude, evaluated for safety by that prior Claude, and whose code was written largely by that prior Claude — who bears legal responsibility if something goes wrong? The current liability frameworks in the United States, which treat AI outputs as the responsibility of the deploying company, were designed for simpler chains of causation. The EU's emerging product liability revisions are more sophisticated, but they too were not written with recursive development loops in mind.

The Strategic Safety Argument, Examined

The paper's most politically legible section is its call for a global development slowdown — contingent, of course, on all other frontier labs slowing simultaneously. Berman is characteristically direct about the optics: "That's a nice thing to say when you're in the literal lead, when you are winning the race."

He's not wrong. Anthropic reportedly withheld its Mythos model from public release — after cutting off competitor xAI's access to its API — while deploying Mythos internally to accelerate its own research. The paper that calls for a global pause was written by a company actively using an unreleased frontier model to compound its advantage. The paper even acknowledges this structural problem: "if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe." The argument for slowing down doubles as a justification for staying ahead.

The arms control comparison the paper invokes is illuminating, but not quite in the way Anthropic intends. The paper notes that nuclear verification regimes took decades to build and that training runs are "far easier to conceal than missile silos." True. But the lesson most arms control scholars would draw from that comparison is that verification architecture needs to be built before the capability proliferates, not after. The window for that kind of international coordination on AI may already be narrowing.

What the Data Actually Shows

Strip away the strategic framing and the paper contains some genuinely significant internal measurements. According to Anthropic's figures, the length of tasks that AI agents can reliably complete has been doubling roughly every four months — accelerating from an earlier trend of doubling every seven months. On the benchmark Anthropic calls CoreBench, which tests a model's ability to reproduce existing research results, performance went from approximately 20% success in 2024 to near-saturation within 15 months.
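To make the difference between those two doubling rates concrete, here is a minimal sketch of the arithmetic. It assumes clean exponential growth with a fixed doubling time, and the 12-month window is an illustrative choice, not a figure from the paper.

```python
def growth_factor(months_elapsed: float, doubling_months: float) -> float:
    """Multiplicative growth after `months_elapsed`, assuming a fixed doubling time."""
    return 2 ** (months_elapsed / doubling_months)


# Illustrative one-year window (an assumption, not taken from the paper).
HORIZON_MONTHS = 12

fast = growth_factor(HORIZON_MONTHS, 4)  # the reported recent trend
slow = growth_factor(HORIZON_MONTHS, 7)  # the reported earlier trend

print(f"4-month doubling over {HORIZON_MONTHS} months: {fast:.1f}x")  # 8.0x
print(f"7-month doubling over {HORIZON_MONTHS} months: {slow:.1f}x")  # 3.3x
```

Under these assumptions, a year of four-month doubling multiplies reliable task length by about 8, against roughly 3.3 for the older seven-month trend.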

The paper also reports that Anthropic's Mythos preview model achieved a 52x speedup on a code optimization task in April 2026, compared to a 3x average for an earlier model in May 2025. Per the paper's own framing, the figure measures improvement over a baseline starting point in a controlled code rewriting test; it comes from Anthropic's internal research and has not been independently verified. On the productivity side, a March 2026 internal poll of 130 Anthropic employees found that the median respondent estimated four times as much output with Mythos as when working without any AI assistance, even while code volume had increased eightfold. The gap between those numbers, as Berman's analysis makes clear, means the volume of Claude-written code is growing faster than useful output: code is being produced in quantity while quality remains roughly at human par, by Anthropic's own assessment.
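The ratios implied by those figures can be worked out directly. This is a rough sketch using only the numbers quoted above; the function names are hypothetical, and the calculation assumes the eightfold code volume and the fourfold output estimate share a comparable baseline, which the article does not confirm.

```python
def code_per_unit_output(code_multiple: float, output_multiple: float) -> float:
    """How much more code accompanies each unit of self-estimated output."""
    return code_multiple / output_multiple


def relative_speedup(newer: float, older: float) -> float:
    """Ratio between two reported speedup figures."""
    return newer / older


# Figures as quoted in the article; both are Anthropic's internal numbers.
CODE_VOLUME_MULTIPLE = 8.0   # reported increase in code volume
OUTPUT_MULTIPLE = 4.0        # median self-estimated output multiple in the poll
MYTHOS_SPEEDUP = 52.0        # Mythos preview, April 2026
EARLIER_SPEEDUP = 3.0        # earlier model average, May 2025

print(code_per_unit_output(CODE_VOLUME_MULTIPLE, OUTPUT_MULTIPLE))      # 2.0
print(round(relative_speedup(MYTHOS_SPEEDUP, EARLIER_SPEEDUP), 1))      # 17.3
```

Read this way, each unit of estimated output now comes with about twice as much code, which is the gap that makes human review harder to sustain.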

The paper is also honest that novel research direction (deciding what to investigate, not just how to execute an investigation) remains outside current model capabilities. In the paper's retrospective analysis, Claude Haiku 3 would have made better experimental decisions than humans approximately 22% of the time; Mythos preview reaches 64%. That is meaningful progress. It is not yet the missing ingredient: genuine novelty generation rather than synthesis and optimization.

The Oversight Bottleneck Is the Policy Question

The most consequential sentence in Anthropic's paper is not about intelligence explosions or the permanent underclass. It is this: human review will become the bottleneck to AI development.

If that is true — and Anthropic's own data suggests it is already becoming true — then the entire framework of human oversight that regulators on both sides of the Atlantic have built their AI governance strategies around is solving the wrong problem. The EU AI Act's Article 14 assumes humans can oversee effectively if given the right tools and authority. Anthropic's paper describes a development environment where the tools have outpaced the humans, and where the company's response to that gap is to deploy AI oversight in place of human oversight.

That substitution may be technically rational. It may even be safer in some narrow sense. But it is not what any existing regulatory framework contemplates, and no regulator has yet answered the question of whether AI-judging-AI satisfies the human oversight requirements those frameworks demand.

Anthropic published a paper about what happens when AI builds itself. The regulatory world has not yet published its response.


Samira Barnes covers technology policy and regulation for BuzzRAG.
