AWS Lambda MicroVMs: Stateful Serverless Is Here

Figure: AWS Lambda MicroVMs as Firecracker-isolated VM sandboxes, showing active and suspended states. VM-level isolation with serverless operations.

AWS launched Lambda MicroVMs on June 22 — a new compute primitive that gives each user or AI session its own Firecracker-isolated virtual machine, with state preserved for up to eight hours and near-instant launch from pre-initialized snapshots. This is not Lambda with a higher timeout. It is a fundamentally different execution model, built for apps that run code they did not write.

The Gap Lambda MicroVMs Fills

Developers building multi-tenant platforms — AI coding tools, interactive notebooks, vulnerability scanners, game servers with user-supplied scripts — have been stuck with a bad choice for years. Lambda Functions share underlying infrastructure and cap out at 15 minutes. EC2 and Fargate give you isolation but require you to manage servers. Lambda MicroVMs offers a third option: VM-level isolation with a serverless operational model, stateful execution, and a suspend-while-idle pricing model that charges nothing when the VM is quiet.

The core value here is not raw compute specs. It is about who shares the kernel. In a standard Lambda Function, an execution environment is reused across invocations, so requests from different end users of your platform can run in the same environment, on the same kernel. In a MicroVM, each session gets its own Firecracker VM with no shared kernel and no shared resources between users. Untrusted code in one session is separated from every other session by a hypervisor boundary.

How the Snapshot Architecture Works

Creating a MicroVM Image requires a Dockerfile and a code zip in S3. Lambda builds the image, runs your initialization code, and takes a Firecracker snapshot of the live memory and disk state. Every subsequent MicroVM launched from that image resumes from the snapshot rather than booting cold. The result is near-instant startup even for multi-gigabyte sessions.
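To make the snapshot idea concrete, here is a minimal Python sketch of "initialize once, restore many times". It is a toy model, not the Lambda API: the `Image` class, `expensive_init`, and the dictionary used as VM state are all hypothetical stand-ins, and a deep copy plays the role of a Firecracker snapshot restore.

```python
import copy

init_calls = 0  # counts how often the slow initialization actually runs


def expensive_init():
    """Hypothetical stand-in for init work (installing packages, warming caches)."""
    global init_calls
    init_calls += 1
    return {"packages": ["numpy", "pandas"], "workspace": {}}


class Image:
    """Toy model of an image: run init once at build time, keep the result."""

    def __init__(self, init_fn):
        self._snapshot = init_fn()  # captured state, analogous to the snapshot

    def launch(self):
        # Every launch restores a private copy of the captured state, so
        # initialization is not repeated and nothing is shared between launches.
        return copy.deepcopy(self._snapshot)


image = Image(expensive_init)
vm_a = image.launch()
vm_b = image.launch()
vm_a["workspace"]["notes.txt"] = "written by session A"

print("init ran", init_calls, "time(s)")
print("session B workspace:", vm_b["workspace"])
```

The point of the sketch is the shape of the trade: the cost of initialization is paid once at image build time, and each session starts from the same captured state without seeing what other sessions do afterwards.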

Firecracker is not experimental here. It already powers over 15 trillion monthly Lambda function invocations. AWS is applying the same hypervisor to a new execution model; the AWS Lambda MicroVMs product page outlines the supported configurations.

The suspend and resume lifecycle is the differentiating feature for cost. An idle MicroVM transitions to SUSPENDED via a configurable idle policy or an explicit API call. Memory and disk state are preserved. Compute charges stop. When traffic arrives or you call resume, the MicroVM returns to RUNNING — transparently, from the client’s perspective. State is intact for up to eight hours.
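The lifecycle above can be read as a small state machine. The sketch below models it in plain Python under a few assumptions that are not from AWS: the class and method names are invented, the idle policy is reduced to a single timeout, the eight-hour ceiling is measured from creation, and the post-expiry state is called TERMINATED for illustration. Time is passed in explicitly so the example runs without waiting.

```python
from enum import Enum

MAX_LIFETIME_S = 8 * 60 * 60  # eight-hour ceiling described in the article


class State(Enum):
    RUNNING = "RUNNING"
    SUSPENDED = "SUSPENDED"
    TERMINATED = "TERMINATED"  # assumed name for the post-expiry state


class MicroVMSession:
    def __init__(self, now, idle_timeout_s):
        self.created_at = now
        self.last_activity = now
        self.idle_timeout_s = idle_timeout_s  # hypothetical idle-policy setting
        self.state = State.RUNNING

    def tick(self, now):
        """Apply the lifetime ceiling, then the idle policy, at time `now`."""
        if self.state is State.TERMINATED:
            return
        if now - self.created_at >= MAX_LIFETIME_S:
            self.state = State.TERMINATED
        elif (self.state is State.RUNNING
              and now - self.last_activity >= self.idle_timeout_s):
            self.state = State.SUSPENDED

    def suspend(self):
        """Explicit suspend call; only a RUNNING session can be suspended."""
        if self.state is State.RUNNING:
            self.state = State.SUSPENDED

    def handle_request(self, now):
        """Incoming traffic resumes a suspended session transparently."""
        self.tick(now)
        if self.state is State.TERMINATED:
            raise RuntimeError("session exceeded its maximum lifetime")
        self.state = State.RUNNING
        self.last_activity = now


session = MicroVMSession(now=0, idle_timeout_s=300)
session.handle_request(now=60)
session.tick(now=400)             # 340 s idle, past the 300 s policy
state_after_idle = session.state
session.handle_request(now=900)   # traffic arrives, session resumes
state_after_traffic = session.state
session.tick(now=MAX_LIFETIME_S)  # lifetime ceiling reached
state_after_ceiling = session.state

print(state_after_idle.value, state_after_traffic.value, state_after_ceiling.value)
```

A real client would never call `tick` itself; the platform applies the policy. The sketch only shows which transitions exist and what triggers them.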

The AI Agent Connection

AWS shipped official documentation for using Lambda MicroVMs as execution sandboxes for Claude Managed Agents. In that architecture, Anthropic runs the agent loop and the Claude model; the Lambda MicroVM runs the actual tool calls — bash commands, file reads, file writes — in a /workspace directory. The Anthropic API key never touches AWS compute. Only a Secrets Manager reference is passed to the MicroVM’s execution role at runtime.
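As an illustration of the tool-call side of that split, the sketch below shows file read and write handlers confined to a workspace directory. It is not taken from the AWS or Anthropic documentation: `WorkspaceTools` and its methods are hypothetical, and a temporary directory stands in for /workspace so the example runs anywhere. The VM boundary is still the real isolation layer; path confinement like this is just a cheap extra check inside it.

```python
import tempfile
from pathlib import Path


class WorkspaceTools:
    """Hypothetical file-tool handlers confined to one workspace directory."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _resolve(self, relative_path):
        # Resolve ".." segments and symlinks first, then check containment.
        target = (self.root / relative_path).resolve()
        if target != self.root and self.root not in target.parents:
            raise PermissionError(f"{relative_path!r} escapes the workspace")
        return target

    def write_file(self, relative_path, content):
        target = self._resolve(relative_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        return len(content)  # number of characters written

    def read_file(self, relative_path):
        return self._resolve(relative_path).read_text()


with tempfile.TemporaryDirectory() as workspace:
    tools = WorkspaceTools(workspace)
    tools.write_file("src/main.py", "print('hi')\n")
    print(tools.read_file("src/main.py"), end="")
    try:
        tools.read_file("../outside.txt")
    except PermissionError as exc:
        print("blocked:", exc)
```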

This is the reference architecture the AI agent ecosystem has needed: a way to run AI-generated code safely, at scale, with per-session isolation, without operating a cluster. The pattern applies beyond Claude — any agent framework that needs to execute untrusted shell commands benefits from this model. Yan Cui (theburningmonk) has a solid breakdown of what MicroVMs mean for serverless developers that is worth reading alongside the official docs.

Lambda MicroVMs vs. Your Other Options

Feature	Lambda Functions	Lambda MicroVMs	Fargate / EC2
Isolation	Shared kernel	VM-level (Firecracker)	Container / VM
Max runtime	15 minutes	8 hours	Indefinite
Billing	Per millisecond	Per second (free when suspended)	Per second
Scaling	Automatic	Manual fleet management	Manual or auto
State	Stateless	Stateful (persistent)	Stateful
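The billing row is easier to reason about with numbers. The sketch below compares a suspend-aware bill with an always-on bill for one session. The rate is a made-up placeholder, not a published AWS price, and the session timeline is invented for illustration.

```python
# Placeholder rate for illustration only; not a published AWS price.
HYPOTHETICAL_RATE_PER_SECOND = 0.00002


def billable_seconds(running_intervals):
    """Sum the seconds spent RUNNING; suspended gaps contribute nothing."""
    return sum(end - start for start, end in running_intervals)


# (start, end) times in seconds during which the session was RUNNING.
session = [(0, 600), (3600, 4200), (7200, 7500)]

active = billable_seconds(session)
wall_clock = session[-1][1] - session[0][0]

print("active seconds:", active)
print("wall-clock seconds:", wall_clock)
print(f"suspend-aware cost: ${active * HYPOTHETICAL_RATE_PER_SECOND:.4f}")
print(f"always-on cost:     ${wall_clock * HYPOTHETICAL_RATE_PER_SECOND:.4f}")
```

In this invented timeline the session is active for 1,500 of 7,500 seconds, so the suspend-aware bill is one fifth of the always-on bill regardless of the actual rate. Interactive sessions with long idle gaps are where that ratio matters most.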

Limitations Worth Knowing Before You Build

Lambda MicroVMs does not auto-scale. You call run-microvm to create each environment. Your application manages the fleet: which VM belongs to which tenant, when to spin up, when to terminate. For developers used to Lambda’s automatic scaling, this is a meaningful operational shift — and the primary reason not to default to MicroVMs for standard workloads.
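What "your application manages the fleet" means in practice is some form of tenant-to-VM registry. Here is a minimal in-memory sketch. The `Fleet` class is hypothetical, and the `launch` and `terminate` callables are stand-ins for whatever code would wrap run-microvm and the corresponding terminate call in a real system; no AWS SDK calls are made or implied.

```python
import itertools


class Fleet:
    """Hypothetical tenant-to-MicroVM registry kept by the application."""

    def __init__(self, launch, terminate):
        # Injected callables stand in for the real launch/terminate wrappers.
        self._launch = launch
        self._terminate = terminate
        self._by_tenant = {}

    def get_or_create(self, tenant_id):
        """Return the tenant's VM id, launching one only if none exists."""
        if tenant_id not in self._by_tenant:
            self._by_tenant[tenant_id] = self._launch(tenant_id)
        return self._by_tenant[tenant_id]

    def release(self, tenant_id):
        """Terminate and forget the tenant's VM; return its id, or None."""
        vm_id = self._by_tenant.pop(tenant_id, None)
        if vm_id is not None:
            self._terminate(vm_id)
        return vm_id


# In-memory fakes so the sketch runs without any cloud access.
_ids = itertools.count(1)
terminated = []
fleet = Fleet(
    launch=lambda tenant_id: f"vm-{next(_ids)}",
    terminate=terminated.append,
)

alice_vm = fleet.get_or_create("alice")
bob_vm = fleet.get_or_create("bob")
alice_again = fleet.get_or_create("alice")  # reuses the existing VM
released = fleet.release("alice")

print(alice_vm, bob_vm, alice_again, released, terminated)
```

A production version would need durable storage, locking around concurrent launches, and cleanup of VMs that hit the runtime ceiling. None of that is shown here.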

- ARM64 architecture only at launch; no x86_64 option yet
- Available in five regions: N. Virginia, Ohio, Oregon, Ireland, and Tokyo
- Outbound UDP is blocked by default, which breaks standard DNS inside the VM
- Maximum eight-hour runtime, so not a replacement for persistent services
- Billing is per second, priced closer to Fargate than standard Lambda
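These constraints are simple enough to check before committing to a design. The preflight sketch below encodes the limits listed above. The `SessionPlan` type and `preflight` function are hypothetical, and the region codes are the standard AWS identifiers for the five named regions.

```python
from dataclasses import dataclass

# Standard AWS region codes for the five launch regions named above.
LAUNCH_REGIONS = {
    "us-east-1",       # N. Virginia
    "us-east-2",       # Ohio
    "us-west-2",       # Oregon
    "eu-west-1",       # Ireland
    "ap-northeast-1",  # Tokyo
}
MAX_RUNTIME_S = 8 * 60 * 60


@dataclass(frozen=True)
class SessionPlan:
    """Hypothetical description of what a workload needs from a MicroVM."""
    region: str
    architecture: str
    expected_runtime_s: int
    needs_udp_dns: bool = False


def preflight(plan):
    """Return a list of human-readable problems; empty means no conflicts."""
    problems = []
    if plan.region not in LAUNCH_REGIONS:
        problems.append(f"region {plan.region} is not a launch region")
    if plan.architecture != "arm64":
        problems.append("only ARM64 is available at launch")
    if plan.expected_runtime_s > MAX_RUNTIME_S:
        problems.append("expected runtime exceeds the eight-hour ceiling")
    if plan.needs_udp_dns:
        problems.append("outbound UDP is blocked by default; plan DNS accordingly")
    return problems


ok_plan = SessionPlan("eu-west-1", "arm64", expected_runtime_s=3600)
bad_plan = SessionPlan("eu-central-1", "x86_64", 10 * 60 * 60, needs_udp_dns=True)

print(preflight(ok_plan))
for problem in preflight(bad_plan):
    print("-", problem)
```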

When to Reach for It

Lambda MicroVMs is the right tool when your application hands each end user a dedicated execution environment for code you did not write. AI coding assistants, interactive data platforms, online IDEs, security sandboxes — these are the use cases it was built for. If you are building a stateless API or a batch job, standard Lambda Functions remain the better fit. If you need indefinite runtime with full OS control, EC2 or Fargate makes more sense.

The compute layer for multi-tenant user code execution has been a hard problem for a long time. Lambda MicroVMs does not make it trivial — you still own fleet management — but it removes the infrastructure provisioning part of the equation. The official AWS documentation covers the getting-started steps. For most teams building AI agents or user-facing code execution today, this is worth a prototype.

ByteBot
