Microsoft Foundry Agent Service Is GA: What Developers Need to Know


Microsoft Foundry Agent Service hosted agents reach general availability July 2026

Deploying AI agents to production has been an infrastructure slog since the first wave of agentic frameworks hit in 2024. Between Kubernetes configs, container registries, session state management, identity wiring, and observability pipelines, teams were spending more engineering time on plumbing than on agent logic. Microsoft’s answer has cleared its last hurdle: Foundry Agent Service’s hosted agents reach general availability in July 2026. The core proposition is straightforward: write your agent in whatever framework you want, run azd up, and the infrastructure disappears.

What “Hosted Agents” Actually Means

The terminology matters here. This is not Azure Container Apps with a friendlier UI. Foundry hosted agents run in per-session VM-isolated sandboxes, behind the same hypervisor-level boundary that separates tenant VMs in Azure’s data centers. Each session gets its own dedicated compute, memory, and filesystem. When a session goes idle and the compute scales to zero, anything stored under $HOME and /files persists. When the session resumes, it picks up exactly where it left off.
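To make the resume behavior concrete, here is a minimal sketch of an agent checkpointing its own progress to a directory that survives scale-to-zero. The file name agent_checkpoint.json and the two helper functions are hypothetical, not part of any Foundry SDK. The only thing taken from the platform description is that $HOME and /files persist.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical file name; the platform only promises that $HOME and /files persist.
CHECKPOINT_NAME = "agent_checkpoint.json"


def save_checkpoint(state: dict, base_dir: Path) -> Path:
    """Atomically write the agent's state as JSON under base_dir."""
    base_dir.mkdir(parents=True, exist_ok=True)
    target = base_dir / CHECKPOINT_NAME
    # Write to a temp file first, then rename, so an interrupted write
    # never leaves a half-written checkpoint behind.
    fd, tmp_path = tempfile.mkstemp(dir=base_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, target)
    return target


def load_checkpoint(base_dir: Path) -> dict:
    """Return the saved state, or an empty dict on a first run."""
    target = base_dir / CHECKPOINT_NAME
    if not target.exists():
        return {}
    with target.open(encoding="utf-8") as handle:
        return json.load(handle)
```

In a hosted session you would pass Path.home() or Path("/files") as base_dir, call load_checkpoint at startup, and call save_checkpoint after each completed step.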

For long-running autonomous agents that pause and resume across hours or days, this is a meaningful architectural difference from containerized deployments, where you have to build your own state persistence. Cold starts are measured in seconds. Pricing during active execution runs at $0.0994 per vCPU-hour and $0.0118 per GiB-hour, with no charge during idle periods. For agent workloads that run in bursts rather than continuously, the scale-to-zero model should come out substantially cheaper than an always-on AKS cluster, because you pay only for the hours the agent is actually working.
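To see how active hours drive the bill, here is a rough sketch using the two rates quoted above. The workload shape (2 vCPU, 4 GiB, three active hours a day over 30 days) is a made-up assumption. The around-the-clock figure is the same sandbox priced at the same rates, not an AKS quote.

```python
# Rates quoted in the article for active execution (USD).
VCPU_HOUR_RATE = 0.0994
GIB_HOUR_RATE = 0.0118


def active_cost(vcpus: float, memory_gib: float, active_hours: float) -> float:
    """Cost of active execution; idle time is billed at zero."""
    hourly = vcpus * VCPU_HOUR_RATE + memory_gib * GIB_HOUR_RATE
    return hourly * active_hours


# Hypothetical workload: 2 vCPU, 4 GiB, active 3 hours a day for 30 days.
bursty = active_cost(vcpus=2, memory_gib=4, active_hours=3 * 30)
# The same sandbox size kept active around the clock, for comparison.
always_active = active_cost(vcpus=2, memory_gib=4, active_hours=24 * 30)

print(f"bursty:        ${bursty:.2f}")
print(f"always active: ${always_active:.2f}")
```

Under these assumptions the bursty workload costs $22.14 for the month, against $177.12 if the same sandbox never went idle.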

Framework Agnostic, and That Is the Point

The framework support list is notable precisely because it is not a Microsoft-first ecosystem. LangGraph, Claude Agent SDK, OpenAI Agents SDK, Microsoft Agent Framework, and GitHub Copilot SDK all work. If none of those fit, bring a Dockerfile and define your own environment. The runtime does not care.

Microsoft Agent Framework 1.0, which unified AutoGen and Semantic Kernel into a single platform, went GA on April 2, 2026, and is available in both .NET and Python. The migration from local development to production requires minimal code changes. The hosted agent wrapper in .NET:

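using Microsoft.Agents.AI.Foundry.Hosting;

var builder = WebApplication.CreateBuilder(args);
// `agent` is the agent instance you already built and tested locally.
builder.Services.AddFoundryResponses(agent);

var app = builder.Build();
// Map the hosting layer's routes for the registered agent.
app.MapFoundryResponses();
app.Run();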

In Python it collapses to two lines:

server = ResponsesHostServer(agent)
server.run()

The agent logic you tested locally runs in the sandbox unchanged. The hosting layer adds the endpoint, identity, and observability wiring without touching your core logic.

What azd up Actually Provisions

The single-command deployment story deserves a closer look, because “one command” claims often paper over significant configuration. In this case, azd up on first run provisions a Foundry project, a model deployment, an Azure Container Registry, an Application Insights workspace, and a Managed Identity. It packages your code, builds the container, pushes it to ACR, assigns the agent its own Entra ID, and exposes a dedicated endpoint at https://{project_endpoint}/agents/{agent_name}. The APPLICATION_INSIGHTS_CONNECTION_STRING is injected at runtime automatically, so traces flow into Application Insights without manual wiring. Subsequent deploys use azd deploy for code-only changes.
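As a small illustration of the two conventions described above, the sketch below fills in the endpoint pattern and checks whether the connection string is present in the environment. The helper names and the host contoso.example are hypothetical. The only details taken from the text are the URL pattern and the variable name APPLICATION_INSIGHTS_CONNECTION_STRING.

```python
import os

APP_INSIGHTS_ENV = "APPLICATION_INSIGHTS_CONNECTION_STRING"


def agent_endpoint(project_endpoint: str, agent_name: str) -> str:
    """Fill in the https://{project_endpoint}/agents/{agent_name} pattern."""
    # Accept the project endpoint with or without a scheme or trailing slash.
    host = project_endpoint.removeprefix("https://").rstrip("/")
    return f"https://{host}/agents/{agent_name}"


def telemetry_configured(environ=os.environ) -> bool:
    """True when the connection string is present in the environment."""
    return bool(environ.get(APP_INSIGHTS_ENV))


# Hypothetical project endpoint and agent name.
print(agent_endpoint("contoso.example", "triage"))
print(telemetry_configured())
```

Run locally, the second line will normally print False, because nothing has injected the variable. Inside the hosted sandbox it should be set for you.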

The identity piece matters. Each hosted agent gets its own Entra ID automatically, which means it can authenticate to Azure services without stored secrets or manual credential management. For security teams that have been blocking agent deployments over credential sprawl concerns, this removes one of the more common objections.

CodeAct: Faster Agents, Fewer Tokens

Alongside the GA release, Microsoft shipped CodeAct, which reduces the token consumption and latency of tool-heavy agents by batching multiple tool calls into a single Python program executed in a sandboxed environment. Microsoft’s internal benchmarks show a 52.4% reduction in execution time (27.81 seconds down to 13.23) and a 63.9% reduction in token usage (6,890 down to 2,489) on a representative workload. The improvement comes from eliminating round-trip overhead between individual tool calls. For agents that invoke five or ten tools per reasoning step, this adds up quickly on both latency and API cost.
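The round-trip saving is easier to see in a toy model. The sketch below is not Microsoft's CodeAct implementation. It is a simplified illustration with two made-up tools, where each tool call issued one at a time counts as a model round trip, while a batched program counts as one.

```python
# Hypothetical tools; stand-ins for whatever an agent would call.
def get_price(sku: str) -> float:
    return {"A1": 10.0, "B2": 4.5}[sku]


def get_stock(sku: str) -> int:
    return {"A1": 3, "B2": 12}[sku]


TOOLS = {"get_price": get_price, "get_stock": get_stock}


def run_per_call(calls):
    """One model round trip per tool call."""
    round_trips = 0
    results = []
    for name, arg in calls:
        round_trips += 1  # the model emits one call, then waits for the result
        results.append(TOOLS[name](arg))
    return results, round_trips


def run_batched(program: str):
    """One round trip: the model emits a whole program that calls the tools."""
    # Expose only the tools, not Python's builtins. This is NOT a real sandbox;
    # a production system needs process-level or VM-level isolation.
    namespace = {"__builtins__": {}, **TOOLS}
    exec(program, namespace)
    return namespace["result"], 1


calls = [("get_price", "A1"), ("get_stock", "A1"),
         ("get_price", "B2"), ("get_stock", "B2")]
per_call_results, per_call_trips = run_per_call(calls)

program = """
result = [get_price("A1"), get_stock("A1"), get_price("B2"), get_stock("B2")]
"""
batched_results, batched_trips = run_batched(program)

print(per_call_trips, batched_trips)
```

Both paths return the same four values, but the per-call path needs four round trips and the batched path needs one. Each round trip avoided also avoids resending the conversation context, which is where the token saving comes from.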

The Breaking Change Preview Users Must Address

If you built on the preview SDK, there is a breaking change to address before moving to GA. The standalone azure-ai-agents package is gone. Agent code must migrate to AIProjectClient in the azure-ai-projects package. Teams that built preview integrations will need to update imports and initialization code before workflows break. Do not wait: preview endpoints will not persist indefinitely.
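A quick way to size the migration is to scan your code for imports of the old package. The sketch below assumes the standalone azure-ai-agents package is imported under the module path azure.ai.agents. Treat that path and the helper name as assumptions and adjust the pattern to match your codebase.

```python
import re

# Assumption: the standalone azure-ai-agents package is imported as azure.ai.agents.
LEGACY_IMPORT = re.compile(r"^\s*(?:from|import)\s+azure\.ai\.agents\b")


def find_legacy_imports(source: str) -> list[int]:
    """Return the 1-based line numbers that import the legacy package."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if LEGACY_IMPORT.match(line)
    ]
```

Feeding each .py file's text through find_legacy_imports gives you a list of lines to revisit. It only flags imports, so client construction and call sites still need a manual pass.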

When to Use Foundry Hosted Agents (and When Not To)

Foundry hosted agents are the right default for most teams building new agent workloads in 2026. The built-in identity, isolation, observability, and scale-to-zero pricing remove the majority of infrastructure decisions that slow down initial deployment. The framework flexibility means you are not locked into a Microsoft-only stack.

There are cases where self-managed remains the better choice. If your compliance requirements demand a specific service mesh, custom TLS termination, or certifications that Foundry does not yet carry, Azure Kubernetes Service gives you the control you need at the cost of operational overhead. If you already run containers at scale with mature observability pipelines, Azure Container Apps may fit more naturally into existing workflows without the migration cost.

The decision comes down to whether the infrastructure Foundry abstracts away is infrastructure you were going to build anyway, or infrastructure you already have.

The hosted agents quickstart gets you to a running agent in under ten minutes via the full azd up flow. The hosted agents announcement post goes deep on the sandbox architecture. For MAF-specific deployment with samples in both .NET and Python, see the local-to-production migration guide. Preview SDK users should review the GA announcement for the full list of breaking changes before updating.

ByteBot

