Claude Code Now Supports Nested Subagents

If you're already using Claude Code subagents to keep your main context window clean, Anthropic just handed you a new capability that's worth understanding before you deploy it. If you're not using subagents yet, this is a good moment to get up to speed: context pollution is about to get much easier to manage, and the tradeoffs are about to get more interesting too.

Anthropic has added nested subagent support to Claude Code. That means a subagent can now spawn its own subagents. Which can spawn their own. According to AI agent workflow creator Ray Amjad, who covered the announcement in detail on YouTube, the architecture can go multiple layers deep—though the precise supported depth should be verified against Anthropic's current release documentation before you design workflows around it.

Why this matters for your actual work

The original subagent model solved one real problem: tool-calling noise. When Claude Code explores a codebase, searches the web, or runs bash commands inside your main context window, the window fills up fast. A full context window makes the model's decisions worse. So the solution was to isolate that noisy work in a subagent, get back a clean summary, and keep the main session lean.

Nested subagents solve the same problem one layer down. As Amjad explains it: "What if this sub agent decided that it also needed to delegate noisy tool calling because it was kind of changing paths or considering another idea? If it did it inside of a sub agent's context window, then that sub agent could become even noisier and not actually lead to any useful result back for the main conversation."

The fix: let that layer-one agent offload its own noise to a layer-two agent. And so on. The main context window stays lean. Each layer returns only its most relevant result to the layer above. Everything collapses cleanly back to the session that needs to act on the findings.
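To make that collapse concrete, here is a minimal sketch in plain Python. It does not call any real agent API: Task, run_agent, and the MAX_DEPTH cap are hypothetical names of my own, and the "noise" is simulated. It only illustrates the shape of the idea, where each layer keeps its scratch work local and hands a one-line summary upward.

```python
from dataclasses import dataclass, field

# Hypothetical cap for illustration; not a documented Claude Code limit.
MAX_DEPTH = 3


@dataclass
class Task:
    name: str
    subtasks: list["Task"] = field(default_factory=list)


def run_agent(task: Task, depth: int = 1) -> str:
    """Simulate one agent: noisy work stays local, only a summary goes up."""
    if depth > MAX_DEPTH:
        raise RuntimeError(f"refusing to nest deeper than layer {MAX_DEPTH}")

    # Simulated tool-calling noise that never leaves this agent's context.
    scratch = [f"tool call {i} for {task.name}" for i in range(5)]

    # Delegate each subtask one layer down; only summaries come back.
    scratch.extend(run_agent(sub, depth + 1) for sub in task.subtasks)

    # The parent sees this single line, not the scratch list.
    return f"[L{depth}] {task.name}: done ({len(scratch)} items kept local)"


tree = Task("debug", [Task("cause A", [Task("server logs")]), Task("cause B")])
print(run_agent(tree))
```

The main session here receives one string. The five simulated tool calls at each layer, and the summaries each layer collected from below, stay where they were produced.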

The practical applications Amjad walks through are genuinely useful for anyone doing complex agentic work. A research verification flow where each article gets its own layer-one agent, and each claim verification gets its own layer-two agent—all running in parallel. A legal contract review where layer-one agents assess each contract's implications, layer-two agents measure the blast radius of proposed changes, and layer-three agents run the web searches to verify edge cases. A debugging flow where causes fan out to parallel layer-two agents, one of which delegates server log cross-checking to a layer-three agent, and everything collapses back to the main session with a clean diagnosis.

The mental model Amjad offers—second-order effects handed to layer two, third-order effects to layer three—is actually a clean way to think about this. You're not just parallelizing work; you're containing the cognitive mess of exploration so that decisions get made with clean information at every level.

The part nobody's talking about yet

Here's where I have to be the correspondent and not the cheerleader: this architecture meaningfully expands the attack surface of a compromised or misbehaving agent, and I haven't seen that discussed much in the initial coverage.

Think about what nested subagents actually do. They run bash commands. They query the web. They search your codebase and email history. They do all of this autonomously, potentially across multiple layers, before any result surfaces to a human. A prompt injection buried in a web search result at layer three doesn't need to compromise your main session directly; it just needs to manipulate the layer-two agent's output, which shapes the layer-one agent's conclusion, which shapes what your main session acts on. The token cost problem that comes with any multi-agent system compounds here too: more agents, more web calls, more surface area for something to go sideways.

I'm not saying don't use this. I'm saying: if you're deploying nested subagents on anything sensitive—legal documents, production codebases, anything touching external data sources—you need to think explicitly about what permissions each layer carries and what happens if a layer-two agent's web search returns something adversarial. Right now, that responsibility is entirely on you. Anthropic hasn't published visible guardrails specific to nested depth, and the models aren't yet reliably deciding on their own when spawning a deeper agent is appropriate versus when it's overkill. As Amjad notes, "you may want to manually prompt it or manually override it" until the models get better at making those calls autonomously.
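One way to think explicitly about per-layer permissions is to write them down as a deny-by-default allowlist and to track whether external data has touched a result. The sketch below is my own illustration, not a Claude Code feature: the tool names, LAYER_TOOLS, and the tainted flag are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-layer allowlists; tool names are illustrative only.
LAYER_TOOLS = {
    1: frozenset({"read_files", "spawn_agent"}),
    2: frozenset({"read_files", "spawn_agent"}),
    3: frozenset({"web_search"}),
}
# Tools whose output comes from outside your control.
EXTERNAL_TOOLS = frozenset({"web_search"})


@dataclass(frozen=True)
class Result:
    summary: str
    tainted: bool  # True if external data influenced this summary


def check_tool(layer: int, tool: str) -> None:
    """Deny by default: unknown layers and unlisted tools are rejected."""
    if tool not in LAYER_TOOLS.get(layer, frozenset()):
        raise PermissionError(f"layer {layer} may not use {tool!r}")


def leaf_result(layer: int, tool: str, summary: str) -> Result:
    """Record a result produced by a single permitted tool call."""
    check_tool(layer, tool)
    return Result(summary, tainted=tool in EXTERNAL_TOOLS)


def merge(summary: str, children: list[Result]) -> Result:
    """A parent summary is tainted if any child summary was."""
    return Result(summary, tainted=any(c.tainted for c in children))


r3 = leaf_result(3, "web_search", "edge case found online")
r2 = merge("blast radius assessed", [r3])
r1 = merge("contract reviewed", [r2])
print(r1.tainted)  # the web result at layer three reaches the top
```

The flag does not stop an injection. It only makes the propagation path visible, so the main session (or you) can decide to treat a tainted conclusion with more suspicion before acting on it.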

That's not a knock on the feature. It's just the honest state of where we are.

OpenAI got here first on the config side

Worth noting: Amjad points out that OpenAI's Codex app has had a configurable nesting parameter for a while. He describes a max_depth setting in the Codex configuration file—set it above the default and your agents can call other agents. Given how quickly product specs change in this space, verify the current Codex documentation if you want to use it, but the point stands: Claude Code is catching up on a capability that's been available elsewhere.

That competitive context matters for how you evaluate Anthropic's announcement. This isn't Anthropic inventing nested agentic delegation; it's Anthropic bringing it into Claude Code's native architecture. The interesting question is whether Claude's model capabilities will make it better at deciding when to nest, as opposed to needing explicit prompting to do so.

Skill optimization is the use case I'd actually run

Beyond the workflow architecture, Amjad covers one application that strikes me as genuinely worth your time if you're running the same Claude Code skills repeatedly: automated skill optimization. The idea is to have a subagent run your current skill, evaluate its output, write a modified version, run that, compare results, and iterate—potentially overnight. Amjad says he ran this process on a spec-writing skill using Codex when it first gained nested subagent support, ending up with a version he found meaningfully better than the original.
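The loop described there (run, evaluate, modify, rerun, compare, iterate) can be sketched as a simple keep-the-best search. This is my own reading of the idea, not Amjad's implementation: optimize_skill and its run, score, and propose callables are hypothetical stand-ins for whatever a subagent would actually do at each step.

```python
from typing import Callable


def optimize_skill(
    skill: str,
    run: Callable[[str], str],           # executes a skill, returns its output
    score: Callable[[str], float],       # rates an output; higher is better
    propose: Callable[[str, str], str],  # drafts a variant from skill + output
    rounds: int = 5,
) -> tuple[str, float]:
    """Keep the best-scoring skill text seen across a fixed number of rounds."""
    best = skill
    best_output = run(best)
    best_score = score(best_output)

    for _ in range(rounds):
        candidate = propose(best, best_output)
        output = run(candidate)
        candidate_score = score(output)
        # Only adopt the variant if it strictly beats the current best.
        if candidate_score > best_score:
            best, best_output, best_score = candidate, output, candidate_score

    return best, best_score


# Toy stand-ins so the loop runs end to end; real versions would call agents.
best, best_score = optimize_skill(
    "spec",
    run=lambda s: s.upper(),
    score=lambda out: float(len(out)),
    propose=lambda s, out: s + "!",
    rounds=3,
)
print(best, best_score)
```

The part that deserves scrutiny in practice is score. If the evaluator is itself a model, the loop will optimize for whatever that evaluator rewards, which may not be what you wanted from the skill.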

He references a Microsoft research paper on automated skill prompt improvement as inspiration for this approach, though I can't verify the specific paper, authors, or publication date from what's available—so treat that as a pointer to an active research direction rather than a citation. The concept itself is sound: if you're running a skill daily, spending tokens on a one-time optimization run that permanently improves it is worth doing the math on.

The math, for what it's worth: if a skill runs 20 times a day and an overnight optimization run makes it 15% more efficient, the break-even depends on how many tokens the skill burns per run and how much the optimization run itself costs. For a token-heavy skill, that break-even is probably days, not months. That's worth running.
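Here is that arithmetic as a short sketch. The 20 runs a day and the 15% gain come from the scenario above; the 50,000 tokens per run and the 1,000,000-token optimization cost are numbers I made up for illustration, and I am reading "15% more efficient" as 15% fewer tokens per run. Plug in your own figures.

```python
def break_even_days(
    runs_per_day: float,
    tokens_per_run: float,
    efficiency_gain: float,        # fraction of tokens saved per run, e.g. 0.15
    optimization_cost_tokens: float,
) -> float:
    """Days until the tokens saved equal the one-time optimization cost."""
    saved_per_day = runs_per_day * tokens_per_run * efficiency_gain
    if saved_per_day <= 0:
        raise ValueError("no daily savings, so the run never pays for itself")
    return optimization_cost_tokens / saved_per_day


# Assumed inputs: 50,000 tokens per run, 1,000,000 tokens spent optimizing.
days = break_even_days(20, 50_000, 0.15, 1_000_000)
print(f"{days:.1f} days")  # 150,000 tokens saved per day, so about 6.7 days
```

Under those assumed numbers the run pays for itself in about a week. A skill that uses a tenth of the tokens per run would take ten times as long, which is why the per-run cost matters more than the headline percentage.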

What to actually do with this right now

If you're a Claude Code user, here's my read: don't restructure existing workflows just because nesting is now possible. Identify one workflow where your layer-one subagent is itself getting noisy—where it's doing exploratory searching that muddies its final result—and try adding a layer there. Start shallow. One nested layer, constrained permissions, on non-sensitive work. See if the output quality justifies the added complexity and token cost before you build a five-layer architecture.

And before you hand nested agents access to external web searches, email archives, or production systems, spend fifteen minutes thinking about what a malicious or misleading result at the deepest layer could propagate upward into. That's not paranoia. That's just how you use powerful tools responsibly.

The agent collaboration space is moving fast enough that capabilities are outpacing both documentation and security guidance. That gap is your problem to manage until it isn't.

Rachel "Rach" Kovacs is Buzzrag's cybersecurity and privacy correspondent.