
Containers have become the default way to sandbox AI-generated code — and for most workloads, that’s fine. But for AI agents executing hundreds of short-lived scripts per minute, containers are a bad fit: they take seconds to start, consume hundreds of megabytes each, and can’t realistically be treated as disposable. Cloudflare’s Dynamic Workers, now in open beta, is a direct answer to this problem. Instead of containers, it uses V8 isolates — the same sandboxing primitive powering Cloudflare Workers since 2018. The result: cold starts under 5 milliseconds, memory usage in single-digit megabytes, and a pricing model that makes per-request sandboxing economically viable.
V8 Isolates, Not Containers
The engineering premise behind Dynamic Workers is simple: Cloudflare Workers have always run inside V8 isolates. Dynamic Workers surfaces that primitive directly, letting your code spin up a new Worker at runtime with arbitrary JavaScript or TypeScript — fully isolated, fully sandboxed, on the same machine.
What makes isolates so much faster than containers? Creating one asks nothing new of the host kernel. A V8 isolate is created and run entirely in userspace, inside a process that is already running, which eliminates the Linux namespace creation, cgroup setup, and filesystem layering that containers require. The result is a cold start under 5 ms versus 3+ seconds for a typical container: a gap of at least 100x and, on these figures, closer to 600x. Memory per isolate runs 2–10 MB versus 50–200+ MB for containers.
The practical implication: isolates are cheap enough to treat as disposable. Spin one up, run some code, throw it away. At $0.002 per unique Worker per day (waived during the current beta), running 1 million requests through Dynamic Workers costs approximately $200 per month. The equivalent container warm-pool infrastructure runs $2,000+ per month before you factor in engineering time to manage it.
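The per-Worker fee is easy to sanity-check. The sketch below assumes that fee is the only charge (it ignores any per-request or CPU-time billing, which the figures above do not break out) and keeps the price in whole micro-dollars to avoid floating-point drift. The function and constant names are illustrative, not part of any Cloudflare tooling.

```typescript
// Price quoted above: $0.002 per unique Worker per day, held as integer micro-dollars.
const PRICE_MICRO_USD_PER_WORKER_DAY = 2000;

// Fee in USD for loading `uniqueWorkersPerDay` distinct Workers on each of `days` days.
function uniqueWorkerFeeUsd(uniqueWorkersPerDay: number, days: number): number {
  return (uniqueWorkersPerDay * days * PRICE_MICRO_USD_PER_WORKER_DAY) / 1000000;
}

// 100,000 unique Worker-days works out to the $200 figure.
const feeForHundredThousand = uniqueWorkerFeeUsd(100000, 1);

// If each of 1 million requests loaded a brand-new Worker, the fee alone would be $2,000.
const feeForOneMillion = uniqueWorkerFeeUsd(1000000, 1);
```

Read this way, the $200 estimate corresponds to roughly 100,000 unique Worker-days per month, which for 1 million requests means each loaded Worker serves about ten of them.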
Security isn’t a concession. Cloudflare runs a five-layer isolation stack: automatic V8 patching (security updates in hours, not weeks), tenant cordoning that moves high-risk Workers to isolated nodes, Intel MPK hardware-level memory protection, pre-execution code scanning that blocks infinite loops and memory bombs, and default network isolation with credential injection for permitted egress traffic.
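The last layer, default network isolation with credential injection, is the easiest to picture in code. What follows is a minimal sketch of the idea, not Cloudflare's implementation: the request shape, the policy table, and the `filterEgress` function are all hypothetical. Outbound requests are denied unless the host is on an allowlist, and the credential is attached outside the sandbox so the generated code never holds it.

```typescript
// Hypothetical shape of an outbound request made by sandboxed code.
interface OutboundRequest {
  url: string;
  headers: Record<string, string>;
}

// Hosts the sandbox may reach, mapped to the credential injected for each.
// The host and token here are made-up examples.
const EGRESS_POLICY = new Map<string, string>([
  ["api.example.com", "Bearer token-held-outside-the-sandbox"],
]);

// Returns the request to forward, or null when the destination is not allowed.
function filterEgress(req: OutboundRequest): OutboundRequest | null {
  const host = new URL(req.url).hostname;
  const credential = EGRESS_POLICY.get(host);
  if (credential === undefined) {
    return null; // default-deny: hosts missing from the policy are blocked
  }
  // The sandboxed code never sees the credential; it is attached on the way out.
  return { url: req.url, headers: { ...req.headers, Authorization: credential } };
}
```

A request to `https://api.example.com/v1/items` comes back with an `Authorization` header added, while a request to any other host comes back as `null`.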
Code Mode Cuts Token Costs by 99.9%
The performance story is compelling, but the token cost story is more interesting.
Traditional AI agents use tool calls: the LLM receives a list of tool definitions (often dozens of them), picks one, calls it, gets a result, picks the next tool, and repeats. For complex tasks, this means multiple LLM round-trips and thousands of tokens just to describe available tools. For a large API, the overhead compounds fast. Cloudflare’s own API has 2,500+ endpoints, and describing each one as an individual tool definition consumes over 1.17 million tokens.
Code Mode is a different pattern. Instead of tool definitions, you give the LLM a TypeScript API schema — interfaces and types, not verbose JSON tool descriptors. The LLM writes a single TypeScript function that chains all the necessary API calls. That function runs in a Dynamic Worker. One round-trip, one LLM invocation. Cloudflare’s MCP server built on Code Mode exposes those same 2,500 endpoints through just two tools (search and execute) in under 1,000 tokens total — a 99.9% reduction.
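The 99.9% figure follows directly from the two token counts quoted above. A quick check, with an illustrative helper name:

```typescript
// Percentage reduction when going from `before` tokens to `after` tokens.
function reductionPercent(before: number, after: number): number {
  return (1 - after / before) * 100;
}

// Figures quoted above: about 1.17 million tokens versus an upper bound of 1,000.
const codeModeReduction = reductionPercent(1170000, 1000);
```

With these inputs the reduction comes out a little above 99.9%, so the headline number is, if anything, rounded down.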
For typical agent tasks, Cloudflare reports an 81% token reduction versus tool-calling. The reason it works is structural: LLMs are trained on code. A TypeScript interface is more information-dense than a JSON tool schema. The model understands how to chain typed function calls without being walked through each step in natural language.
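To make the pattern concrete, here is a small sketch of what Code Mode hands the model and what the model might hand back. Everything in it is an assumption made for illustration: the `ExampleApi` interface, its two methods, and the in-memory `fakeApi` are invented and do not correspond to Cloudflare's actual API surface.

```typescript
// Hypothetical typed API given to the model instead of per-endpoint tool definitions.
interface Zone { id: string; name: string; }
interface DnsRecord { zoneId: string; name: string; type: string; }

interface ExampleApi {
  listZones(): Promise<Zone[]>;
  listDnsRecords(zoneId: string): Promise<DnsRecord[]>;
}

// The kind of function a model might write against that interface:
// dependent calls chained in code, with no LLM round-trip between them.
async function countTxtRecordsPerZone(api: ExampleApi): Promise<Record<string, number>> {
  const counts: Record<string, number> = {};
  for (const zone of await api.listZones()) {
    const records = await api.listDnsRecords(zone.id);
    counts[zone.name] = records.filter((r) => r.type === "TXT").length;
  }
  return counts;
}

// In-memory stand-in so the sketch runs without any network access.
const fakeApi: ExampleApi = {
  async listZones() {
    return [{ id: "z1", name: "example.com" }, { id: "z2", name: "example.org" }];
  },
  async listDnsRecords(zoneId) {
    const all: DnsRecord[] = [
      { zoneId: "z1", name: "a", type: "TXT" },
      { zoneId: "z1", name: "b", type: "A" },
      { zoneId: "z2", name: "c", type: "TXT" },
      { zoneId: "z2", name: "d", type: "TXT" },
    ];
    return all.filter((r) => r.zoneId === zoneId);
  },
};
```

With tool calling, the same task would need one model turn to list zones and another per zone to fetch records. Here the loop lives in the generated function, so the model is invoked once.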
Stateful AI Apps with Durable Object Facets
Ephemeral code execution handles a large class of agent tasks, but some AI-generated applications need persistent state. Durable Object Facets, currently in open beta alongside Dynamic Workers, addresses this.
The pattern is a supervisor architecture. A developer-written parent Durable Object (the “AppRunner”) manages lifecycle and access control. AI-generated code runs inside a child “facet”: a separate Durable Object instance with its own isolated SQLite database. The supervisor’s database and the facet’s database are stored together as a single unit with zero-latency reads, but neither can access the other’s data. The supervisor can track object creation, enforce limits, and add billing hooks without touching the generated code.
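// Inside a method of the supervisor Durable Object (the "AppRunner").
// Get the facet named "app"; the callback supplies the class to run in it.
let facet = this.ctx.facets.get("app", async () => {
  // Load the AI-generated code as a Dynamic Worker (private helper, not shown).
  let worker = this.#loadDynamicWorker();
  // Look up the Durable Object class named "App" in the generated code.
  let appClass = worker.getDurableObjectClass("App");
  return { class: appClass };
});
// Forward the incoming request to the facet and return its response.
return await facet.fetch(request);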
This enables AI-generated apps with persistent UIs, multi-tenant platforms where each tenant’s code runs in full isolation, and agent-built prototypes that survive across sessions. Using facets requires compatibility_date = "2026-04-01" or later.
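The division of labor in the supervisor pattern can be shown without any platform at all. The following is a plain-TypeScript simulation under stated assumptions: `Supervisor`, `AppHandler`, and the Map-based stores are stand-ins invented for this sketch, not Cloudflare APIs, and a real facet would be backed by SQLite instead of a `Map`.

```typescript
// Plain-TypeScript simulation of the supervisor pattern; none of this is Cloudflare API.
type Store = Map<string, string>;

// Stands in for AI-generated app code: it only ever receives its own store.
type AppHandler = (store: Store, input: string) => string;

class Supervisor {
  private ownStore: Store = new Map();            // supervisor-only bookkeeping
  private facetStores = new Map<string, Store>(); // one isolated store per facet
  private maxFacets: number;

  constructor(maxFacets: number) {
    this.maxFacets = maxFacets;
  }

  // Runs `app` inside the named facet, creating the facet's store on first use.
  run(facetName: string, app: AppHandler, input: string): string {
    let store = this.facetStores.get(facetName);
    if (store === undefined) {
      if (this.facetStores.size >= this.maxFacets) {
        throw new Error("facet limit reached");
      }
      store = new Map();
      this.facetStores.set(facetName, store);
    }
    // Usage tracking lives in the supervisor's store, out of the app's reach.
    this.ownStore.set(`calls:${facetName}`, String(this.callCount(facetName) + 1));
    return app(store, input);
  }

  // Number of times the named facet has been invoked.
  callCount(facetName: string): number {
    return Number(this.ownStore.get(`calls:${facetName}`) ?? "0");
  }
}

// Example "generated" app: a running total persisted in the facet's own store.
const counterApp: AppHandler = (store, input) => {
  const next = Number(store.get("count") ?? "0") + Number(input);
  store.set("count", String(next));
  return String(next);
};
```

The generated handler keeps state across calls but never receives a reference to the supervisor's store, which is where limits and usage counts (the natural place for a billing hook) are kept.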
Getting Started
Dynamic Workers is in open beta for all paid Cloudflare Workers users. The API exposes two modes: load() for ephemeral execution (isolate destroyed after the call), and get() for persistent cached isolates suited to batch processing and long-running agents.
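// Load the AI-generated code into a fresh, ephemeral isolate.
const dynamicWorker = await loader.load({
  code: aiGeneratedCode,
  // Expose only the bindings the generated code is allowed to use.
  bindings: { db: env.DB, kv: env.KV },
  // Cap CPU time and memory for the run.
  limits: { cpuMs: 100, memoryMB: 128 },
});
// Run the code and collect its result.
const result = await dynamicWorker.execute();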
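The difference between the two modes is one of lifecycle, and a toy model shows it clearly. `FakeIsolate` and `FakeLoader` below are hypothetical stand-ins written for this sketch; in particular, the `get(id, code)` signature is an assumption and should not be read as the real loader's API.

```typescript
// Minimal stand-in for an isolate: it just remembers how many times it has run.
class FakeIsolate {
  runs = 0;
  code: string;

  constructor(code: string) {
    this.code = code;
  }

  execute(): number {
    this.runs += 1;
    return this.runs;
  }
}

// Hypothetical loader that mimics the two lifecycles; not the real Cloudflare API.
class FakeLoader {
  private cache = new Map<string, FakeIsolate>();

  // Ephemeral: every call builds a fresh isolate that the caller then discards.
  load(code: string): FakeIsolate {
    return new FakeIsolate(code);
  }

  // Cached: the first call for an id builds the isolate, later calls reuse it.
  get(id: string, code: string): FakeIsolate {
    let isolate = this.cache.get(id);
    if (isolate === undefined) {
      isolate = new FakeIsolate(code);
      this.cache.set(id, isolate);
    }
    return isolate;
  }
}
```

Two `load()` calls with the same code yield two isolates that have each run once, while two `get()` calls with the same id yield one isolate that has run twice. That reuse is what makes the cached mode attractive for batch jobs and long-running agents.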
The Dynamic Workers playground on GitHub includes a deployable example demonstrating Code Mode with a live LLM integration. Full documentation is at developers.cloudflare.com/dynamic-workers.
One constraint worth knowing upfront: Dynamic Workers supports JavaScript and TypeScript only. If your agents need Python execution, E2B (Firecracker microVMs) is the better fit: cold starts are slower (~100ms), but you get full language support and the stronger isolation boundary of a hardware-virtualized microVM. For container-compatible workloads, gVisor is a middle path. For long-running services in any language, Docker remains appropriate.
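That guidance can be condensed into a small decision function. This is only a sketch of the rules of thumb above: the `Workload` fields and the order in which they are checked are assumptions, and a real choice would weigh more factors than these three.

```typescript
type Sandbox = "Dynamic Workers" | "E2B" | "gVisor" | "Docker";

// Hypothetical description of a workload, reduced to the factors discussed above.
interface Workload {
  language: "javascript" | "typescript" | "python" | "other";
  longRunningService: boolean;
  needsContainerCompatibility: boolean;
}

// Applies the rules of thumb in order: service lifetime, container needs, then language.
function suggestSandbox(w: Workload): Sandbox {
  if (w.longRunningService) return "Docker";
  if (w.needsContainerCompatibility) return "gVisor";
  if (w.language === "javascript" || w.language === "typescript") return "Dynamic Workers";
  return "E2B"; // Python and other languages need a full-language sandbox
}
```

A short-lived TypeScript script maps to Dynamic Workers, and the same script in Python maps to E2B.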
The container-versus-isolate debate will run for a while. But for AI agents running short-lived JavaScript workloads at scale, the math is already settled: V8 isolates are faster, cheaper, and operationally simpler. The Code Mode pattern is worth understanding regardless of runtime — the principle of letting LLMs write code against typed APIs instead of navigating tool registries applies broadly. Cloudflare just made it easy to run that code safely.
