Gemini Managed Agents API: Full AI Agent in One API Call

Google just shipped the thing developers have been duct-taping together for two years: a fully managed AI agent with its own Linux sandbox, web search, code execution, and persistent file system — behind a single API call. No Modal. No E2B. No glue code. The Managed Agents API, announced at Google I/O 2026, is in public preview now. Here is what it actually does, how to use it, and where it will quietly wreck your budget if you are not careful.

What the Managed Agents API Actually Is

The short version: Google manages the agentic infrastructure. You manage the prompt.

When you call client.interactions.create(), Google provisions an ephemeral Linux environment, loads the Antigravity agent (built on Gemini 3.5 Flash), and gives it access to built-in tools — Code Execution, Web Search, and URL Context. The agent can write files, run scripts, and pass state between turns. You do not set up the sandbox. You do not wire tools to function calling. You do not manage session state. That is the entire pitch.

What it replaces in practice: the Modal or E2B sandbox + LLM API + custom tool wiring stack that most agentic apps are built on today. The tradeoff is control for speed. If you need a reproducible, debuggable execution environment with full observability, roll your own. If you need an agent working in a day, use this.

The Code (It Is Genuinely Simple)

Here is a full working example. Start an interaction, then continue it in the same sandbox:

from google import genai

client = genai.Client()

# First turn: create an agent interaction
interaction = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Write a Python script that generates the first 20 Fibonacci numbers and saves them to fibonacci.txt."
)

# Second turn: continue in the same sandbox (files persist)
interaction2 = client.interactions.create(
    agent="antigravity-preview-05-2026",
    previous_interaction_id=interaction.id,
    environment=interaction.environment_id,
    input="Now plot the Fibonacci sequence as a line chart and save it as chart.png."
)

The environment_id is the key detail. Pass it back and the agent reuses the same Linux container — fibonacci.txt is still there. Start a fresh interaction without it and you get a clean environment. This is how you control session scope.
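One way to keep session scope explicit is a thin wrapper that remembers the last interaction and environment IDs. The sketch below is an illustration, not part of the SDK: SandboxSession is a hypothetical name, and the keyword arguments simply mirror the example above (agent, input, previous_interaction_id, environment). It takes the create function as a parameter, so it can be tried with a stub before pointing it at client.interactions.create.

```python
from typing import Any, Callable, Optional


class SandboxSession:
    """Hypothetical helper: tracks IDs so follow-up turns reuse one sandbox."""

    def __init__(self, create: Callable[..., Any], agent: str) -> None:
        self._create = create  # e.g. client.interactions.create
        self._agent = agent
        self._last_id: Optional[str] = None
        self._environment_id: Optional[str] = None

    def send(self, prompt: str) -> Any:
        """Send one turn; follow-up turns pass both IDs back."""
        kwargs = {"agent": self._agent, "input": prompt}
        if self._last_id is not None:
            # Parameter names assumed from the article's second-turn example.
            kwargs["previous_interaction_id"] = self._last_id
            kwargs["environment"] = self._environment_id
        interaction = self._create(**kwargs)
        self._last_id = interaction.id
        self._environment_id = interaction.environment_id
        return interaction

    def reset(self) -> None:
        """Forget both IDs so the next turn starts in a clean environment."""
        self._last_id = None
        self._environment_id = None
```

Calling reset() between unrelated tasks makes the "fresh environment" decision visible in your own code instead of leaving it implicit in which arguments happened to be passed.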

For custom behavior, inject a system instruction inline:

interaction = client.interactions.create(
    agent="antigravity-preview-05-2026",
    system_instruction="You are a senior DevOps engineer. Prefer shell scripts over Python.",
    input="Audit this repo for stale dependencies."
)

Or save a configuration as a named agent using client.agents.create() and invoke it by ID — no need to re-mount files on every call. The full quickstart documentation covers both patterns with streaming responses.

Three Traps You Will Not Read in the Launch Blog Posts

Trap 1: Token Cost Is Not What It Looks Like

One interactions.create() call does not map to one inference. The agent runs multiple autonomous reasoning loops internally — planning, tool use, result checking, re-planning. A task that looks like a single call from the outside can generate thousands of tokens internally. Sandbox compute is free during preview. Model inference is not, and it runs at Gemini 3.5 Flash rates ($1.50/$9.00 per million tokens). Enable token logging and run representative tasks before you price a user-facing feature around this. Your p95 cost per interaction will surprise you.
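To put rough numbers on that, a back-of-the-envelope estimator helps. The sketch below assumes the two quoted rates are the input and output prices per million tokens, and the token counts in the usage lines are made-up placeholders, not measurements. Both function names are hypothetical.

```python
INPUT_USD_PER_MILLION = 1.50   # assumed: first quoted rate is the input price
OUTPUT_USD_PER_MILLION = 9.00  # assumed: second quoted rate is the output price


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the inference cost of one interaction in US dollars."""
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


# Placeholder (input_tokens, output_tokens) pairs; replace with logged values.
logged = [(40_000, 6_000), (12_000, 1_500), (25_000, 3_000)]
costs = [estimate_cost_usd(i, o) for i, o in logged]
print(f"p95 cost per interaction: ${p95(costs):.4f}")
```

Feed it the token counts from your own logs across a few dozen representative tasks; the spread between the median and the p95 is usually the number that matters for pricing.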

Trap 2: Cold Start Latency Is Real

With the default min_instances=1 configuration, first-call latency hits roughly 4.7 seconds. Warm calls are around 0.4 seconds. That gap will define your user experience in ways a demo never reveals. For anything user-facing, increase min_instances or pre-warm your agent. Do not discover this in production.
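A simple way to see the gap for your own workload is to time the first call separately from the ones that follow. The helper below is a generic sketch: measure_latency is a hypothetical name, and call stands for whatever zero-argument function wraps your client.interactions.create() request. Keep in mind that every timed call against the real API consumes tokens.

```python
import time
from statistics import median
from typing import Callable


def measure_latency(call: Callable[[], object], warm_runs: int = 5) -> dict[str, float]:
    """Time the first (possibly cold) call, then the median of later calls."""
    if warm_runs < 1:
        raise ValueError("warm_runs must be at least 1")

    start = time.perf_counter()
    call()
    first = time.perf_counter() - start

    warm = []
    for _ in range(warm_runs):
        start = time.perf_counter()
        call()
        warm.append(time.perf_counter() - start)

    return {"first_call_s": first, "warm_median_s": median(warm)}
```

Run it once after a period of inactivity and once right after other traffic; if the two first_call_s values differ by seconds, you are looking at the cold start.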

Trap 3: Multi-Agent Systems Need Hard Budget Caps

If you are chaining agents — Agent A orchestrates Agent B — a failure in Agent B can push Agent A into retry and hallucination loops. Each loop burns tokens. A runaway agent in a dependency chain can generate a surprisingly large bill before any alarm fires. Implement a pricing_cap per agent and a budget limit across the full system. This is not optional for production multi-agent deployments. The Agent Platform scaling documentation covers the mechanics.
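Whatever server-side limits the platform offers, a client-side guard is cheap insurance. The sketch below is an illustration under stated assumptions, not the platform's pricing_cap feature: BudgetGuard and BudgetExceeded are hypothetical names, and you would feed charge() with costs computed from your own token logs.

```python
class BudgetExceeded(RuntimeError):
    """Raised once spending has passed a cap."""


class BudgetGuard:
    """Hypothetical client-side guard: per-agent caps plus one system-wide cap."""

    def __init__(self, per_agent_cap_usd: float, system_cap_usd: float) -> None:
        self.per_agent_cap_usd = per_agent_cap_usd
        self.system_cap_usd = system_cap_usd
        self.spent_by_agent: dict[str, float] = {}

    @property
    def total_spent_usd(self) -> float:
        return sum(self.spent_by_agent.values())

    def charge(self, agent_name: str, cost_usd: float) -> None:
        """Record a cost already incurred, then raise if any cap is now exceeded."""
        new_total = self.spent_by_agent.get(agent_name, 0.0) + cost_usd
        self.spent_by_agent[agent_name] = new_total
        if new_total > self.per_agent_cap_usd:
            raise BudgetExceeded(f"{agent_name} exceeded its per-agent cap")
        if self.total_spent_usd > self.system_cap_usd:
            raise BudgetExceeded("system-wide budget exceeded")
```

Call charge() after every interaction in the chain and let BudgetExceeded propagate to the orchestrator, so a retry loop in Agent A stops instead of re-invoking Agent B.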

The Breaking Change That Is Live Today

If you are already using the Interactions API, check your SDK version now. A breaking schema change went live today (May 26): the outputs field has been renamed to steps, and the response format configuration has changed. The legacy schema will be removed on June 8, 2026. See ByteIota’s earlier coverage of the Gemini Interactions API migration for the full changelog.

The fix is two lines: upgrade to Python SDK ≥2.0.0 or JavaScript SDK ≥2.0.0. The SDK handles the schema change automatically. What changes on your end is how you read responses — interaction.steps instead of interaction.outputs. That is it. But if you skip the upgrade and the June 8 deadline passes, your integration breaks silently. See the official breaking changes guide for field-by-field details.
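During the migration window, a small accessor can keep response-reading code working on either schema. This is a sketch under the assumption that responses expose the new field as interaction.steps and the old one as interaction.outputs, as described above; read_steps is a hypothetical helper, not an SDK function.

```python
from typing import Any


def read_steps(interaction: Any) -> list:
    """Return the new `steps` field, falling back to legacy `outputs`."""
    steps = getattr(interaction, "steps", None)
    if steps is not None:
        return list(steps)
    outputs = getattr(interaction, "outputs", None)
    if outputs is not None:
        return list(outputs)
    # Fail loudly rather than returning an empty list and hiding the problem.
    raise AttributeError("interaction has neither 'steps' nor 'outputs'")
```

Raising when both fields are missing turns a silent breakage into a visible one. Once the legacy schema is gone, the outputs fallback can be deleted.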

Gemini vs. OpenAI for Agents: A Direct Answer

Both Google and OpenAI now offer managed agentic APIs. Here is the honest split:

Pick Gemini Managed Agents if: you need a 1M-token context window or native video and audio input, or you are cost-sensitive ($1.50/$9.00 per million tokens vs. roughly double for OpenAI's GPT-4o-equivalent tier). Parallel function calling (multiple tool calls per reasoning step) is also a meaningful efficiency advantage for complex workflows.

Pick OpenAI Assistants/Codex if: You need computer use (Gemini does not support it yet), you rely on mature third-party SDK integrations, or you want a more polished developer experience today. OpenAI’s tooling ecosystem is broader and its production reliability record is longer. ByteIota covered the OpenAI Codex Hooks launch earlier this week — that piece has the full breakdown of what Codex brings to the table.

For most new agentic projects in 2026 that do not require computer use: start with Gemini. The price-to-context ratio is genuinely better. Just log your tokens before you ship.

ByteBot

I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover the latest tech news and controversies and summarize them into byte-sized, easily digestible information.
