Codex CLI v0.142: Multi-Agent Delegation Is Here

OpenAI shipped Codex CLI v0.142.0 on June 22 with four changes that shift Codex from power-user territory into something teams can actually govern. Multi-agent delegation modes, a token budget system, reorganized plugin discovery, and a new indexed web-search mode aren’t incremental — they close the gaps that held enterprise adoption back. If you run Codex in any team or pipeline context, v0.142 changes your options. Here’s what to know.

Multi-Agent Delegation: Finally a Control Layer

The biggest change is also the most overdue. Before v0.142, Codex’s multi-agent behavior in app-server contexts was opaque — you couldn’t specify whether the agent should never self-spawn sub-agents, only spawn when asked, or decide autonomously. That ambiguity was a hard blocker for teams in regulated environments or anyone who needed predictable, auditable execution.

v0.142 introduces three delegation modes, configurable at thread level or per-turn:

disabled — No delegation. All work stays in the primary agent thread. Use for security-sensitive tasks, compliance contexts, or when you need a clean audit trail.
explicit-request-only — Delegation fires only when you explicitly request it. The agent won’t self-initiate sub-agent spawning. This is the right starting point for teams migrating from single-agent workflows.
proactive — The agent decides autonomously when to spin up sub-agents. Maximum throughput on complex tasks, but requires trusting the agent’s judgment on when parallelism helps.

The two-level granularity — thread-level defaults plus per-turn overrides — is what makes this production-viable. You can lock a thread to explicit-request-only as a baseline and allow proactive delegation for specific high-parallelism turns without reconfiguring the whole session. Set it in config.toml:

[delegation]
mode = "explicit-request-only"  # or "disabled" | "proactive"

The full delegation reference lives in the Codex subagents documentation.
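To make the thread-default-plus-override idea concrete, here is a minimal Python sketch of how the precedence could work. It is a mental model, not Codex's implementation: the mode strings come from the release, but the function names (effective_mode, may_delegate) and the "override wins when present" rule are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class DelegationMode(Enum):
    # The three mode names described in this article.
    DISABLED = "disabled"
    EXPLICIT_REQUEST_ONLY = "explicit-request-only"
    PROACTIVE = "proactive"


def effective_mode(thread_default: DelegationMode,
                   turn_override: Optional[DelegationMode] = None) -> DelegationMode:
    """Hypothetical precedence: a per-turn override, when present, wins."""
    return turn_override if turn_override is not None else thread_default


def may_delegate(mode: DelegationMode, user_requested: bool) -> bool:
    """Decide whether spawning a sub-agent is allowed on this turn."""
    if mode is DelegationMode.DISABLED:
        return False
    if mode is DelegationMode.EXPLICIT_REQUEST_ONLY:
        return user_requested
    return True  # PROACTIVE: the agent may decide on its own


# Thread locked to explicit-request-only, one turn opted into proactive.
baseline = DelegationMode.EXPLICIT_REQUEST_ONLY
normal_turn = effective_mode(baseline)
parallel_turn = effective_mode(baseline, DelegationMode.PROACTIVE)
```

With this model, a normal turn delegates only when the user asks, while the overridden turn can delegate on its own, and the thread default never has to change.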

Token Budgets: Hard Ceilings on Agent Spend

If you’ve followed the AI agent cost incidents (there have been enough to constitute a genre), you’ll understand why this matters. Configurable rollout token budgets let you set a hard ceiling per agent run: the CLI tracks consumption across agent threads, sends warning reminders as you approach the limit, and aborts cleanly when the budget is exhausted. “Cleanly” here means the agent stops at a turn boundary rather than being cut off halfway through a file write.

The multiplier parameter (default 1.0) controls how heavily sampled tokens count against the budget. Raise it above 1.0 for conservative accounting that exhausts the budget sooner; lower it below 1.0 if the task type is token-efficient and the default accounting is too aggressive. This pairs with the updated /usage command, which now lets you view and redeem earned reset credits from the referral banking program. Full pricing and quota mechanics are on the Codex pricing page.
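The accounting is easier to reason about with numbers in front of you. The Python sketch below models a budget with a multiplier, a warning threshold, and a turn-boundary abort. Everything about it is an assumption for illustration: the TokenBudget class, the 80% warning threshold, and the return values are hypothetical and do not reflect the CLI's actual internals or config keys.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    limit: int                # hard ceiling for the whole run
    multiplier: float = 1.0   # weight applied to sampled tokens
    warn_at: float = 0.8      # hypothetical warning threshold (fraction of limit)
    used: float = 0.0         # weighted tokens charged so far

    def record_turn(self, sampled_tokens: int) -> str:
        """Charge one completed turn and return 'ok', 'warn', or 'abort'."""
        self.used += sampled_tokens * self.multiplier
        if self.used >= self.limit:
            return "abort"
        if self.used >= self.warn_at * self.limit:
            return "warn"
        return "ok"


# With multiplier 2.0, every sampled token costs two budget units.
budget = TokenBudget(limit=1000, multiplier=2.0)
statuses = [budget.record_turn(n) for n in (100, 350, 100)]
```

Because the check runs only after a turn has been recorded, the abort in this model always lands on a turn boundary, which mirrors the clean-stop behavior described above. Doubling the multiplier here means 550 sampled tokens are enough to exhaust a 1,000-unit budget.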

Indexed Web Search: Live Queries, Controlled Page Access

Codex’s web search previously offered two modes: cached (index-only, no live fetch) or live (arbitrary page access). The problem with live mode isn’t just bandwidth — it’s prompt injection. A fetched page can contain hidden instructions that cause the agent to execute unintended actions, and in a fully autonomous session you may not notice until damage is done.

Indexed mode threads the needle: live search queries are permitted, but direct page access is restricted to server-approved URLs. You get up-to-date search results without exposing the agent to arbitrary web content. For enterprise deployments, combine it with allowed_domains in requirements.toml to restrict access to the domains your codebase actually needs:

[web_search]
mode = "indexed"
allowed_domains = ["docs.python.org", "github.com", "docs.openai.com"]

The security considerations behind each web search mode are covered in Codex’s agent approvals and security guide.
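If you want to sanity-check a domain list before rolling it out, the matching logic can be approximated in a few lines of Python. This is a sketch under stated assumptions, not how Codex evaluates allowed_domains: the is_fetch_allowed helper is hypothetical, and both the https-only rule and the decision to accept subdomains are choices made here for illustration.

```python
from urllib.parse import urlsplit

# Same domains as the config example above.
ALLOWED_DOMAINS = {"docs.python.org", "github.com", "docs.openai.com"}


def is_fetch_allowed(url: str, allowed_domains=ALLOWED_DOMAINS) -> bool:
    """Return True if the URL is https and its host is an allowed
    domain or a subdomain of one (assumed matching rule)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https" or not host:
        return False
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


checks = {
    "https://github.com/openai/codex": is_fetch_allowed("https://github.com/openai/codex"),
    "https://github.com.evil.example/": is_fetch_allowed("https://github.com.evil.example/"),
    "https://evil.example/github.com": is_fetch_allowed("https://evil.example/github.com"),
}
```

Matching on the parsed hostname rather than on a substring of the URL is the important design choice: lookalike hosts and paths that merely contain an allowed domain are rejected.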

Plugin Discovery Gets Organized

The /plugins command now splits 90-plus integrations into three sections: OpenAI Curated (quality-vetted marketplace plugins), Workspace (your org’s deployed plugins), and Shared with me (teammate-shared). Eligible turns can also auto-recommend relevant plugins inline. Whether this matters depends on how deep your team is in the plugin ecosystem — for teams running Codex across CI/CD, databases, and multiple communication channels simultaneously, the new organization removes real friction. For teams using two or three integrations, it’s a minor quality-of-life improvement.

What to Configure First

Update to v0.142 and start with delegation config. If you’re running in a team context without explicit delegation settings, explicit-request-only is the right default: it matches the behavior most teams assumed they had, now made explicit and enforced. Set a token budget based on your typical task sizes. Switch web search to indexed mode and allowlist only the domains you actually need. The reliability fixes (Linux TUI rendering after Ctrl+Z resume, MCP session persistence across disconnects, cross-OS sandbox behavior) ship automatically and require no configuration. Full release notes are on the Codex changelog.

ByteBot