MAI-Thinking-1: Microsoft’s First In-House Reasoning Model

Microsoft MAI-Thinking-1: 35B active parameter MoE reasoning model announced at Build 2026

Microsoft announced MAI-Thinking-1 at Build 2026. It is the company’s first self-built reasoning model, trained from scratch on commercially licensed data without distilling a single token from OpenAI. It runs on Microsoft’s own Maia 200 chips, it is available on Azure AI Foundry in private preview, and it is the clearest signal yet that Microsoft’s AI self-sufficiency strategy has moved from roadmap to shipping product.

What MAI-Thinking-1 Actually Is

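The architecture is a sparse Mixture of Experts (MoE): roughly 1 trillion total parameters, of which 35 billion are active for any given token. That distinction matters for cost: per-token compute scales with the active parameters, so you get the inference compute of a mid-size model alongside frontier-level capability claims. The full set of weights still has to be held in memory, so serving it is not as cheap as serving a dense 35-billion-parameter model. The context window is 256K tokens.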
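To put the active-versus-total split in numbers, here is a back-of-envelope sketch. The parameter counts come from the figures above; the 2 FLOPs per active parameter per token and the 1 byte per weight are common rules of thumb assumed here for illustration, not anything Microsoft has published.

```python
# Back-of-envelope MoE arithmetic using the figures quoted in the article.
# Assumptions (not from Microsoft): ~2 FLOPs per active parameter per token,
# and 1 byte per weight (8-bit storage) for the memory estimate.

ACTIVE_PARAMS = 35e9   # parameters active per token (from the article)
TOTAL_PARAMS = 1e12    # approximate total parameters (from the article)


def active_fraction(active: float, total: float) -> float:
    """Share of the weights that participate in one token's forward pass."""
    return active / total


def flops_per_token(active: float, flops_per_param: float = 2.0) -> float:
    """Rough forward-pass compute per token, driven by active parameters only."""
    return active * flops_per_param


def weight_memory_gb(total: float, bytes_per_param: float = 1.0) -> float:
    """Memory needed just to hold all weights, in gigabytes (1e9 bytes)."""
    return total * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"active fraction:   {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
    print(f"compute per token: {flops_per_token(ACTIVE_PARAMS):.2e} FLOPs")
    print(f"weights in memory: {weight_memory_gb(TOTAL_PARAMS):,.0f} GB")
```

Under these assumptions only about 3.5% of the weights do work on each token, while the memory footprint is still set by the full trillion. That is why the compute bill and the hardware bill can point in different directions.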

The “zero distillation” framing is worth taking seriously beyond the marketing spin. Microsoft explicitly states the model was trained on commercially licensed data with no distillation from OpenAI, Anthropic, Google, or any third-party frontier model. For enterprise teams with IP compliance requirements, that is a meaningfully different provenance story than most of what’s available today.

MAI-Thinking-1 is currently in private preview on Azure AI Foundry. Pricing has not been published yet. For reference, OpenAI’s o3 launched at $10 per million input tokens and $40 per million output; Gemini 2.5 Pro lists $1.25 and $10. Where MAI-Thinking-1 lands in that range will determine whether the cost efficiency of MoE inference translates into real developer savings or just a smaller line item on a still-substantial bill.
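Once a rate card appears, the comparison is simple arithmetic. The sketch below uses only the two reference price points quoted above; the monthly token volumes are a hypothetical workload chosen for illustration, and MAI-Thinking-1 is deliberately left out because its pricing is unknown.

```python
# Reference prices quoted in the article, in USD per million tokens:
# (input price, output price). MAI-Thinking-1 is absent on purpose,
# since no pricing has been published.
REFERENCE_PRICES = {
    "o3": (10.00, 40.00),
    "gemini-2.5-pro": (1.25, 10.00),
}

# Hypothetical monthly workload, for illustration only.
MONTHLY_INPUT_TOKENS = 200_000_000
MONTHLY_OUTPUT_TOKENS = 50_000_000


def workload_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Cost in USD for a workload, given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def compare(prices: dict, input_tokens: int, output_tokens: int) -> dict:
    """Return the workload cost for every model in the price table."""
    return {
        model: workload_cost(input_tokens, output_tokens, in_price, out_price)
        for model, (in_price, out_price) in prices.items()
    }


if __name__ == "__main__":
    costs = compare(REFERENCE_PRICES, MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)
    for model, usd in costs.items():
        print(f"{model:>15}: ${usd:,.2f} per month")
```

Adding a third entry to the table when Microsoft publishes rates would show where the model sits for your own input-to-output ratio, which matters more than either headline price on its own.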

The Benchmark Story — Read the Fine Print

Microsoft’s claims are extraordinary. MAI-Thinking-1 reportedly hits 97.0% on AIME 2025 and 94.5% on AIME 2026. More provocatively, Microsoft says it matches Claude Opus 4.6 on SWE-bench Pro and beats Claude Sonnet 4.6 in blind human preference evaluations.

Every one of those numbers is self-reported by Microsoft. As of June 4, no independent auditor has verified them. That is not unusual for a launch-day disclosure, but it is worth naming plainly before you restructure your inference stack around it. SWE-bench Pro itself is not a perfect instrument: researchers have documented false positive rates of 8.5% and false negative rates of 25% in its verifiers. A model that genuinely matched Opus 4.6 on SWE-bench Pro would be a significant achievement; the claim deserves the routine scrutiny any launch-day benchmark gets, not reflexive dismissal.
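To see how much verifier noise of that size can move a headline score, here is a rough sketch. It assumes, and this is an interpretation rather than something the cited research is confirmed to say, that the 8.5% is the chance a wrong patch is marked as passing and the 25% is the chance a correct patch is marked as failing. Under that reading, the standard misclassification correction inverts the observed pass rate. The 50% observed score is a hypothetical input, not a published result.

```python
# Assumed interpretation of the verifier error rates quoted in the article:
#   FALSE_POSITIVE_RATE = P(marked pass | patch is actually wrong)
#   FALSE_NEGATIVE_RATE = P(marked fail | patch is actually correct)
FALSE_POSITIVE_RATE = 0.085
FALSE_NEGATIVE_RATE = 0.25


def observed_pass_rate(true_rate: float, fpr: float, fnr: float) -> float:
    """Score a noisy verifier would report for a given true pass rate."""
    return true_rate * (1.0 - fnr) + (1.0 - true_rate) * fpr


def corrected_pass_rate(observed: float, fpr: float, fnr: float) -> float:
    """Invert observed_pass_rate, clamped to the valid range [0, 1]."""
    denom = 1.0 - fpr - fnr
    if denom <= 0:
        raise ValueError("error rates too high to invert")
    estimate = (observed - fpr) / denom
    return min(1.0, max(0.0, estimate))


if __name__ == "__main__":
    hypothetical_observed = 0.50  # illustrative only, not a reported score
    estimate = corrected_pass_rate(
        hypothetical_observed, FALSE_POSITIVE_RATE, FALSE_NEGATIVE_RATE
    )
    print(f"observed {hypothetical_observed:.1%} -> estimated true {estimate:.1%}")
```

The point is not the specific estimate but the size of the swing: with error rates in this range, gaps of a few points between two models sit well inside the measurement noise.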

Independent evaluations will arrive. Wait for them before committing production workloads.

MAI-Code-1-Flash Is Already in Your Editor

While MAI-Thinking-1 sits behind a waitlist, the more immediately relevant announcement is MAI-Code-1-Flash — a 5-billion-parameter coding model rolling out to every GitHub Copilot tier through the VS Code model picker. It is not in preview. It is live.

MAI-Code-1-Flash was trained inside Copilot’s actual production harness, so its training distribution should track real developer interactions more closely than a synthetic approximation would. Microsoft claims a 16-point advantage over Claude Haiku 4.5 on SWE-bench Pro (51.2% vs 35.2%) and says the model uses 60% fewer tokens on SWE-bench Verified than comparable models. The efficiency angle, an adaptive solution-length technique that keeps responses concise on simple requests and expands the reasoning budget on complex ones, is a genuine design choice worth watching in practice.
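A “16-point advantage” and “60% fewer tokens” are two different kinds of number, and it helps to keep them apart. The sketch below works from the figures Microsoft reported; the 10,000-token baseline is a hypothetical value used only to show what a 60% reduction does.

```python
# Scores reported by Microsoft on SWE-bench Pro, in percent.
MAI_CODE_1_FLASH_SCORE = 51.2
CLAUDE_HAIKU_4_5_SCORE = 35.2

# Claimed token reduction on SWE-bench Verified versus comparable models.
TOKEN_REDUCTION = 0.60


def point_gap(score_a: float, score_b: float) -> float:
    """Absolute difference in percentage points."""
    return score_a - score_b


def relative_gain(score_a: float, score_b: float) -> float:
    """Improvement of score_a over score_b as a fraction of score_b."""
    return (score_a - score_b) / score_b


def tokens_after_reduction(baseline_tokens: float, reduction: float) -> float:
    """Tokens used after cutting a baseline by the given fraction."""
    return baseline_tokens * (1.0 - reduction)


if __name__ == "__main__":
    gap = point_gap(MAI_CODE_1_FLASH_SCORE, CLAUDE_HAIKU_4_5_SCORE)
    rel = relative_gain(MAI_CODE_1_FLASH_SCORE, CLAUDE_HAIKU_4_5_SCORE)
    print(f"gap: {gap:.1f} points, relative gain: {rel:.1%}")

    baseline = 10_000  # hypothetical tokens per task for a comparison model
    print(f"tokens per task: {tokens_after_reduction(baseline, TOKEN_REDUCTION):,.0f}")
```

The 16 points is an absolute gap; relative to the Haiku 4.5 score it is an improvement of roughly 45%. The token claim is relative from the start, so its value in dollars depends entirely on the baseline you are comparing against.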

Why Microsoft Built This

On April 27, 2026, Microsoft and OpenAI restructured their partnership. The exclusivity arrangement that defined the relationship since Microsoft’s initial investment was removed. OpenAI can now sell to any cloud provider; Microsoft stops paying revenue share on OpenAI products resold through Azure. Microsoft AI chief Mustafa Suleyman had announced a “True AI Self-Sufficiency” mission in February. MAI-Thinking-1 is the first flagship delivery against that declared mission.

The vertical stack now reads: Maia 200 chips at the inference layer, MAI models across text, code, image, voice, and transcription, Azure as the distribution platform, GitHub Copilot as the developer touchpoint. There is no required OpenAI dependency anywhere in that chain. Whether that independence holds — and whether MAI models can compete with frontier offerings on quality — is the question the next 12 months will answer.

What to Do Now

Request early access to MAI-Thinking-1 on Azure AI Foundry at microsoft.ai. The private preview is open to select partners, and getting into the queue now takes two minutes.

For MAI-Code-1-Flash: open VS Code, go to the Copilot model picker, and enable it. It is already there. Run it on your actual workload — not a benchmark — and see how it performs on the code you actually write. That data will be more useful to you than any number Microsoft published on stage.
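If you want something more repeatable than eyeballing suggestions, a small harness over your own tasks is enough. What follows is a minimal sketch under stated assumptions: `Task`, `evaluate`, and `stub_model` are hypothetical names invented here, and `stub_model` stands in for however you actually capture model output, since the article describes no programmatic API for the Copilot model picker.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """One item from your own workload: a prompt plus a pass/fail check."""
    name: str
    prompt: str
    check: Callable[[str], bool]


def evaluate(generate: Callable[[str], str], tasks: List[Task]) -> Dict[str, bool]:
    """Run every task through `generate` and record whether its check passed."""
    results = {}
    for task in tasks:
        try:
            results[task.name] = bool(task.check(generate(task.prompt)))
        except Exception:
            # A crash in generation or in the check counts as a failure.
            results[task.name] = False
    return results


def pass_rate(results: Dict[str, bool]) -> float:
    """Fraction of tasks that passed; 0.0 for an empty result set."""
    return sum(results.values()) / len(results) if results else 0.0


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model; replace with your own capture."""
    if "add" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return ""


def defines_working(func_name: str, args: tuple, expected) -> Callable[[str], bool]:
    """Build a check that executes generated code and calls one function.

    exec() on model output is unsafe outside a sandbox; this is a sketch.
    """
    def check(code: str) -> bool:
        namespace: dict = {}
        exec(code, namespace)
        return namespace[func_name](*args) == expected
    return check


TASKS = [
    Task("add", "Write add(a, b) returning the sum.", defines_working("add", (2, 3), 5)),
    Task("reverse", "Write reverse(s) returning s reversed.",
         defines_working("reverse", ("abc",), "cba")),
]

if __name__ == "__main__":
    results = evaluate(stub_model, TASKS)
    print(results, f"pass rate: {pass_rate(results):.0%}")
```

Swap the toy tasks for snippets from your own repository and keep the checks as close to your real tests as you can. The same harness can then be pointed at any other model in the picker for a like-for-like comparison.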

Watch the Azure AI Foundry pricing page for MAI-Thinking-1 rate disclosure. And hold off on calling it Opus-level until independent evaluations confirm what Microsoft is claiming. The model might be everything they say. Either way, you’ll know soon enough.

ByteBot

I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover the latest tech news and controversies and summarize them into byte-sized, easily digestible information.