Claude Code v2.1.172: Sub-Agents Can Now Spawn Sub-Agents

Abstract visualization of Claude Code nested sub-agent hierarchy tree with five glowing blue nodes on dark background

Claude Code v2.1.172 enables sub-agents to spawn sub-agents up to 5 levels deep

For two years, Anthropic enforced one rule in Claude Code without exception: a sub-agent cannot spawn another sub-agent. Version 2.1.172, released June 10, quietly ended that. The changelog entry is a single line. The implications for how you architect agentic workflows are not.

What Changed in v2.1.172

Sub-agents can now launch their own sub-agents, nesting up to five levels deep. The five-level cap is hard — enforced server-side, with no setting to raise, lower, or disable it. Boris Cherny, who leads Claude Code at Anthropic, described the motivation plainly: “agents kicking off agents as a way to better manage context.”

That framing is worth holding onto. The feature is not about running more work in parallel. It is about running the noisy work somewhere the main conversation cannot see it. Before this release, a sub-agent handling log analysis would flood its parent’s context with thousands of raw log tokens. The parent then burned additional tokens re-grounding itself before continuing useful work. Nested sub-agents solve that by isolating the log-reader in its own context frame and returning only a summary upward.

The 5-Frame Stack: A Mental Model That Holds

Think of nested sub-agents as call-stack recursion with a five-frame depth limit. Each frame carries its own system prompt, its own model selection, and its own 200K token context window. The parent reads only what the leaf returns. Everything in between — the searches, the file reads, the intermediate reasoning — costs tokens and then disappears from the parent’s view.

A practical debugging chain looks like this:

main session (Opus) → triage-lead (Opus) → repro-runner (Sonnet) → log-summariser (Haiku)

Layer 1 loads the incident. Layer 2 fans out across four candidate causes. The agent investigating log correlations — a noisy, token-heavy task — spawns a Layer 3 Haiku agent to do the grep work and return a one-line result. The main thread sees four clean verdicts, not 50,000 tokens of raw log output. That is the use case: not more agents, but cleaner results from the agents you already have.

Most useful chains live at depth 2–3. Five levels is the ceiling, not the target.

Token Math: Read This Before You Nest Anything

Nesting multiplies token consumption at roughly 7× per branch per level. That is not a rough estimate — it is the observed overhead from agent orchestration, context initialization, and summary-passing between frames. It compounds fast.

A developer on the community forums described hitting 887,000 tokens per minute before noticing. A financial services team ran a “simple” code quality project with 23 sub-agents and received a $47,000 invoice three days later. The five-level depth cap prevents infinite loops. It does not prevent your billing alert from triggering at 3 AM.

Set a spend limit before nesting anything in production. Claude Code’s billing settings support per-session caps. Use them.

How to Configure Nested Sub-Agents

Sub-agent definitions live in .claude/agents/<name>.md at the project level or ~/.claude/agents/<name>.md for user scope. The Agent() field in tools is new as of this release — it is the allowlist controlling which sub-agent types this agent is permitted to spawn.

---
name: triage-lead
model: claude-opus-4-6
tools:
  - Read
  - Bash
  - Agent(repro-runner, log-summariser)
---
You are a debugging triage lead. Load the incident, delegate to repro-runner
for reproduction, delegate to log-summariser for log correlation, and return
a single verdict: confirmed, unconfirmed, or needs-more-data.

Model tiering across levels is not optional if cost matters. Run Opus at the orchestration layer, Sonnet for mid-level implementation work, and Haiku for leaf tasks like log reads, grep, and test generation. The official sub-agents documentation covers the full configuration spec. In practice, tiered routing costs roughly $0.98 per session versus $2.02 for uniform Opus — a 51% reduction with no quality loss on leaf tasks.

Three Pitfalls to Avoid

Token explosion via circular spawning. The depth cap prevents infinite nesting, not circular logic. One production case: a triage agent spawned a “general researcher” that spawned an “investigation specialist” that spawned a “log reader” that spawned a general researcher again. Four levels deep, circular, zero useful output, substantial cost. Use explicit Agent() allowlists. If the model cannot complete the task with the specified children, fix the task description — do not widen the allowlist.

Silent failures at depth five. When a Level-5 agent attempts to spawn a Level-6, it receives an error. Without explicit handling, that error propagates silently or surfaces as unexpected output at the top level. Test your nesting depth explicitly with a canary chain before deploying complex hierarchies in production.

Nesting tasks that are already short. The per-branch overhead makes nesting expensive for any task producing less than roughly 1,000 tokens of output. If the child task is short, do it inline. Nesting a ten-line grep into its own context frame costs more in orchestration overhead than the isolation is worth.

What to Do Now

Update to v2.1.172 via npm install -g @anthropic-ai/claude-code. The full release notes on GitHub cover all 30 changes, including Bedrock region resolution improvements and idle CPU reduction — both worth reading. The DevelopersIO breakdown has the most detailed technical analysis of the infrastructure changes that made nested sub-agents possible.

On nested sub-agents: the feature is well-suited for workflows where context pollution is the bottleneck, not throughput. If your main session is losing focus because one agent is drowning it in raw data, nesting is the right solution. If you are thinking about nesting to run more tasks in parallel, that is flat parallel execution — a different feature.

Depth 2–3 is where this feature earns its overhead. Five levels is there when you need it. Most teams will not need it for a while.

ByteBot

I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.