GitHub switched every Copilot Business and Enterprise organization to GPT-5.3-Codex as the default base model on May 17. No announcement email. No dedicated blog post. Just a changelog entry. If your organization never changed the model picker, you are already running on a different model than you were last week — and the difference is more meaningful than any of the billing drama GitHub has been generating.
The Base Model Is What Most Developers Actually Use
The base model is the silent default that runs everywhere: inline completions, agent mode, automatic code review on pull requests, Copilot chat in VS Code and on github.com, and GitHub Mobile. Most Copilot Business and Enterprise users have never opened the model picker. For them, the base model is not a configuration — it is the product.
Before May 17, that product was GPT-4.1. Now it is GPT-5.3-Codex, an agentic coding model that OpenAI launched on February 5, 2026. GitHub’s March announcement of the model’s long-term support designation signaled that this transition was coming; the 60-day rollout window it specified ended on May 17.
The Benchmarks That Actually Matter
Most coverage of GPT-5.3-Codex leads with SWE-Bench Pro: 56.8%, up from 56.4% for GPT-5.2-Codex. That is a rounding error. Focus on the other two benchmarks.
| Benchmark | GPT-5.2-Codex | GPT-5.3-Codex | Change |
|---|---|---|---|
| SWE-Bench Pro | 56.4% | 56.8% | +0.4pp |
| Terminal-Bench 2.0 | 64.0% | 77.3% | +13.3pp |
| OSWorld-Verified | 38.2% | 64.7% | +26.5pp |
On Terminal-Bench 2.0 — which measures autonomous operation in terminal environments including file editing, git operations, build systems, and debugging — GPT-5.3-Codex scored 77.3%, up from 64.0%. That is a 13-point jump. On OSWorld-Verified, which tests agentic computer use across multi-step visual workflows, the score jumped from 38.2% to 64.7% — a 26-point leap in a single model generation. For reference, humans score around 72% on OSWorld tasks.
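To make the gap between the three results concrete, here is a minimal sketch that recomputes the changes from the scores in the table. The relative-change column is an addition for illustration and does not appear in the published figures; the function and variable names are hypothetical.

```python
# Scores copied from the table above, in percent: (GPT-5.2-Codex, GPT-5.3-Codex).
scores = {
    "SWE-Bench Pro": (56.4, 56.8),
    "Terminal-Bench 2.0": (64.0, 77.3),
    "OSWorld-Verified": (38.2, 64.7),
}


def deltas(old, new):
    """Return (percentage-point change, relative change in percent)."""
    pp = round(new - old, 1)
    rel = round((new - old) / old * 100, 1)
    return pp, rel


results = {name: deltas(old, new) for name, (old, new) in scores.items()}

for name, (pp, rel) in results.items():
    print(f"{name}: {pp:+.1f}pp ({rel:+.1f}% relative)")
```

Measured against the old score, the SWE-Bench Pro gain is under 1%, while the OSWorld-Verified gain is close to 70%.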
The pattern is clear: this model is dramatically better at doing things autonomously, not just suggesting them. If you have been underwhelmed by Copilot’s agent mode, the underlying model just got a significant upgrade. The 25% speed improvement on agentic tasks compounds meaningfully when agent mode makes 20–30 model calls per session.
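A rough sketch of how that compounding works, under two loud assumptions: "25% faster" is read here as each model call taking 25% less time, and the 8-second baseline per call is a made-up figure for illustration, not a measurement.

```python
def session_seconds_saved(calls, baseline_seconds_per_call, time_reduction=0.25):
    """Seconds saved in one agent session if every call takes `time_reduction` less time.

    Assumption: the speedup applies uniformly to every call in the session.
    """
    return calls * baseline_seconds_per_call * time_reduction


# Hypothetical baseline: 8 seconds per model call.
saved = {calls: session_seconds_saved(calls, 8.0) for calls in (20, 30)}

for calls, seconds in saved.items():
    print(f"{calls} calls: about {seconds:.0f} seconds saved per session")
```

Under these assumptions, a session saves roughly 40 to 60 seconds. If "25% faster" instead means 1.25x throughput, each call takes 20% less time and the savings shrink accordingly.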
The LTS Program Is an Enterprise Signal Worth Reading
GitHub established GPT-5.3-Codex as the first LTS (long-term support) model in Copilot, guaranteed available through February 4, 2027. On the surface, that sounds like a minor operational detail. It is not.
Enterprise security teams run new AI model versions through internal reviews before allowing them in production pipelines. Compliance departments need model behavior to be stable when auditing code generated with AI assistance. Custom tooling built on top of Copilot can break when the underlying model changes its output format or reasoning patterns.
By committing to a 12-month availability window for a specific model, GitHub is acknowledging that its previous pace of model cycling was a real problem for enterprise customers. The LTS designation is a product decision, not just a support policy: GitHub is competing on reliability, not just capability.
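For a compliance team planning a review, the useful number is how much of that window is left. The sketch below works it out from the dates in this article, assuming the window is counted from OpenAI's February 5, 2026 launch date and that the May 17 default switch fell in 2026; neither assumption comes from GitHub's documentation.

```python
from datetime import date

model_launch = date(2026, 2, 5)     # OpenAI launch date cited above
default_switch = date(2026, 5, 17)  # assumed year for the May 17 switch
lts_end = date(2027, 2, 4)          # guaranteed availability date cited above

# Inclusive length of the window, counted from launch (an assumption).
window_days = (lts_end - model_launch).days + 1

# Days of guaranteed availability remaining after the default switch.
remaining_days = (lts_end - default_switch).days

print(f"LTS window: {window_days} days")
print(f"Remaining after the default switch: {remaining_days} days")
```

Counted this way, the window is a full year, of which about 263 days remained on the day the default changed.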
The June 1 Collision
There is a timing issue worth tracking. GPT-4.1 has been running at a 0x AI Credits multiplier — effectively free within the existing subscription. GPT-5.3-Codex runs at 1x, meaning it counts toward the new usage-based billing system that launches June 1.
Two changes converge on June 1: billing switches to the consumption-based system, and GPT-4.1 is deprecated along with the old billing structure. From that date, the GPT-5.3-Codex default (1x) counts against usage and the 0x GPT-4.1 fallback is gone. Enterprise administrators should verify their organization’s model policy and estimate AI Credit consumption before that date.
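A back-of-the-envelope estimate is enough to start that conversation. The sketch below assumes usage scales as requests times the model multiplier; the team size, request volume, and function name are hypothetical, and the result is in multiplier-weighted requests, not actual AI Credits or dollars. Check GitHub's billing documentation for the real conversion.

```python
def weighted_requests(developers, requests_per_dev_per_day, working_days, multiplier):
    """Monthly multiplier-weighted request count for a team (illustrative only)."""
    return developers * requests_per_dev_per_day * working_days * multiplier


# Hypothetical team: 50 developers, 40 requests each per day, 21 working days.
team = dict(developers=50, requests_per_dev_per_day=40, working_days=21)

on_gpt_41 = weighted_requests(**team, multiplier=0)  # 0x multiplier
on_gpt_53 = weighted_requests(**team, multiplier=1)  # 1x multiplier

print(f"GPT-4.1 (0x): {on_gpt_41} weighted requests per month")
print(f"GPT-5.3-Codex (1x): {on_gpt_53} weighted requests per month")
```

The same activity that counted as zero under the 0x multiplier becomes 42,000 weighted requests per month at 1x, which is the figure to compare against your plan's allocation.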
What to Do Now
For individual developers, no action is required immediately. GPT-4.1 remains available in the model picker until June 1 if you need it. The practical advice is to actually use agent mode for complex multi-file work before dismissing it — the Terminal-Bench and OSWorld gains suggest a meaningfully different experience from what you may have tried earlier this year.
For enterprise administrators, the checklist is short: confirm your organization’s model policy is set before June 1, understand your team’s AI Credit allocation under the new plan, and check the supported models documentation for current availability. If your compliance process requires review before deploying a new default model, contact GitHub’s account team — they offered an extension window for exactly this situation.
The quiet changelog entry from May 17 will have more practical impact on most development teams than anything GitHub has announced loudly this month. The default always wins.













