
Microsoft opened Build 2026 today by shipping four new MAI models — MAI-Thinking-1, MAI-Image-2.5, MAI-Voice-2, and MAI-Transcribe-1.5 — all available through Microsoft Foundry. After five years and more than $13 billion invested in OpenAI, Microsoft is now telling the world it can run its own AI stack. That is the real story buried inside this announcement.
MAI-Thinking-1: Microsoft’s First Reasoning Model
The headliner at Build 2026 is MAI-Thinking-1, Microsoft’s first dedicated reasoning model, announced by AI chief Mustafa Suleyman during the keynote. The important technical detail here is what it is not: it was built without model distillation. That means Microsoft did not train it on outputs from GPT-4o, Claude, or Gemini. It is a ground-up reasoning build.
Why does that matter? Most competing reasoning models, including several that launched in the past six months, use distillation from a larger frontier model. That is cheaper and faster, but it also creates a silent dependency: your model’s ceiling is the model you distilled from. Microsoft’s no-distillation claim is a direct answer to the skeptics who asked whether MAI models were just repackaged OpenAI. If the claim holds, MAI-Thinking-1 is at least independent in how it was trained.
Benchmark numbers have not been published yet; expect them on the Foundry model leaderboard shortly. The model is available now for enterprise developers through Microsoft Foundry.
GitHub Copilot Is Switching to MAI in August
This is the announcement with the widest developer impact, even though it requires zero action on your part. Microsoft confirmed that a MAI coding model will replace GPT-4 Turbo as GitHub Copilot’s default engine starting August 2026. The routing happens invisibly: Microsoft selects the best model per task without surfacing it to users.
For teams using Copilot through the API, nothing changes in the interface. But if you have been benchmarking Copilot’s output quality against specific models, that baseline is moving. Re-run those checks against your own codebase once the switch lands in August.
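One way to make that comparison concrete is to score the same fixed set of tasks before and after the switch and look at which individual tasks changed, not only the overall pass rate. The sketch below is a minimal illustration under assumptions: `EvalRun`, `pass_rate`, and `regressions` are hypothetical names, the task ids and results are made up, and how you decide whether a suggestion "passed" (for example, by running your own test suite) is left to you.

```python
from dataclasses import dataclass


@dataclass
class EvalRun:
    """One benchmarking pass over a fixed task set (hypothetical structure)."""
    label: str
    results: dict[str, bool]  # task id -> did the suggestion pass your tests?


def pass_rate(run: EvalRun) -> float:
    """Fraction of tasks that passed in this run."""
    return sum(run.results.values()) / len(run.results)


def regressions(before: EvalRun, after: EvalRun) -> list[str]:
    """Tasks present in both runs that passed before and fail now."""
    shared = before.results.keys() & after.results.keys()
    return sorted(t for t in shared if before.results[t] and not after.results[t])


# Illustrative data only: these are not real Copilot results.
july = EvalRun("july", {"parse-csv": True, "retry-loop": True, "sql-join": False, "regex": True})
august = EvalRun("august", {"parse-csv": True, "retry-loop": False, "sql-join": True, "regex": True})

print(pass_rate(july), pass_rate(august))
print(regressions(july, august))
```

In this made-up data the pass rate is identical across both runs while one task has regressed, which is why tracking per-task results is more informative than a single aggregate number.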
MAI-Image-2.5: Image Editing Finally Arrives
The previous MAI-Image-2 was a text-to-image model. MAI-Image-2.5 changes that by accepting image uploads, which opens editing workflows — not just generation. The model ships in two variants: a high-quality version and MAI-Image-2.5e, a faster option designed for production pipelines where latency matters more than maximum fidelity.
Microsoft claims MAI-Image-2.5 ranks in the top three on the Arena text-to-image leaderboard. The model was already live on Arena before Build, so the ranking is independently verifiable. Prior pricing context: MAI-Image-2 runs at $5 per million text-input tokens and $33 per million image-output tokens, with the Efficient variant at $19.50 per million image-output tokens. MAI-Image-2.5 pricing is not yet published.
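To see what those MAI-Image-2 rates mean for a budget, here is a rough cost sketch. The per-million-token prices are the ones quoted above; the workload (10,000 images, 100 prompt tokens and 4,000 image-output tokens each) is purely an assumption for illustration, since the actual token count per image depends on the model and resolution. The function name `image_job_cost` is hypothetical.

```python
# MAI-Image-2 prices as quoted in this article (USD per million tokens).
TEXT_INPUT_PER_M = 5.00
IMAGE_OUTPUT_PER_M = {"standard": 33.00, "efficient": 19.50}


def image_job_cost(text_tokens: int, image_tokens: int, variant: str = "standard") -> float:
    """Estimated USD cost for a workload, given total input and output tokens."""
    text_cost = text_tokens / 1_000_000 * TEXT_INPUT_PER_M
    image_cost = image_tokens / 1_000_000 * IMAGE_OUTPUT_PER_M[variant]
    return text_cost + image_cost


# Assumed workload: 10,000 images, 100 prompt tokens and 4,000 output tokens per image.
images = 10_000
text_tokens = images * 100        # 1,000,000 text-input tokens
image_tokens = images * 4_000     # 40,000,000 image-output tokens

for variant in ("standard", "efficient"):
    print(variant, round(image_job_cost(text_tokens, image_tokens, variant), 2))
# standard 1325.0
# efficient 785.0
```

Under these assumptions the output tokens dominate the bill, so the choice between the standard and Efficient variants matters far more than prompt length. MAI-Image-2.5 pricing may differ once it is published.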
Speech Models: MAI-Transcribe-1.5 and MAI-Voice-2
These are upgrades to models Microsoft launched quietly in April 2026, which were already outperforming competitors. MAI-Transcribe-1 holds the lowest word error rate on the FLEURS benchmark across 25 languages, averaging 3.8 percent, and beats OpenAI’s Whisper-large-v3 on all 25. It costs $0.36 per hour of audio and was built by a team of ten engineers.
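For readers who want to check a transcription model against their own audio, word error rate is the number of word-level substitutions, deletions, and insertions needed to turn the model’s output into the reference transcript, divided by the number of reference words. The sketch below is a generic implementation of that standard definition, not Microsoft’s or the FLEURS evaluation code; it splits on whitespace and skips the text normalization that real benchmarks apply. The `transcription_cost` helper simply applies the $0.36 per hour figure quoted above, and both function names are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing from output)
                curr[j - 1] + 1,     # insertion (extra word in output)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def transcription_cost(audio_hours: float, usd_per_hour: float = 0.36) -> float:
    """Cost at the per-hour price quoted in the article."""
    return audio_hours * usd_per_hour


# One substitution ("sat" -> "sit") and one deletion ("the") over six reference words.
print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 3))  # 0.333
print(round(transcription_cost(1_000), 2))  # 360.0
```

A 3.8 percent average WER corresponds to roughly four word errors per hundred reference words, though published scores depend heavily on the normalization rules used.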
MAI-Voice-1 generates 60 seconds of audio in under a second on a single GPU. Both upgrades, MAI-Transcribe-1.5 and MAI-Voice-2, focus on multilingual expansion and production efficiency. If you are building voice or transcription pipelines on Azure, the April models are worth evaluating now; the Build upgrades will follow shortly.
How to Access MAI Models
All MAI models are accessible through Microsoft Foundry with a standard Azure subscription. The MAI Playground offers no-code testing but is currently US-only. MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 have been generally available since April 2026. The four new Build 2026 models — Thinking-1, Image-2.5, Voice-2, and Transcribe-1.5 — are launching this month.
What This Actually Means
Microsoft did not leave OpenAI. The $13 billion investment is still in place, and Microsoft Foundry still surfaces OpenAI models alongside MAI. What Microsoft built is leverage: a complete, first-party model stack that covers reasoning, image generation, voice synthesis, and speech-to-text. That means Microsoft no longer has to wait for OpenAI to release a better transcription model. It can ship its own.
For developers on Azure, the practical upside is more model choice within the same platform. You can compare MAI-Image-2.5 against DALL-E 3 in Foundry, run cost benchmarks, and pick based on your workload. The strategic battle between Microsoft and OpenAI plays out in their executive boardrooms; what lands in your environment is a wider model catalog at competitive pricing.













