GitHub Copilot BYOK: Connect Any AI Model Provider Now

GitHub Copilot BYOK feature showing multiple AI model providers connecting to Copilot

GitHub’s billing switch to AI Credits on June 1 already rewired the cost math for every Copilot user. Then on June 12, Fable 5 got suspended — pulled from Copilot with barely a day’s notice after a US government export control directive. Both events made the same point: when GitHub controls your model roster, GitHub controls your workflow. On June 23, GitHub answered its own problem. The Copilot app now supports bring your own key (BYOK), letting you connect any model provider — or run inference entirely on your own hardware.

What BYOK Actually Does

BYOK routes your Copilot agent sessions through a model endpoint you supply instead of GitHub’s hosted defaults. You add a provider in Settings → Model Providers, enter an API key and endpoint URL, and that provider’s models appear in the standard model picker alongside GitHub-hosted options. Keys are stored in the local OS keychain and never surfaced in the UI. Model selection is per-session: pick OpenAI for one task, route through your company’s Azure VNet for the next.

This is not a workaround or an advanced feature buried in the docs. It ships as a first-class UI option in the Copilot app, and it works with eight provider types.

Supported Providers

Provider	API Key Required	Local	Notes
OpenAI	Yes	No	Direct billing, your terms
Azure OpenAI / AI Foundry	Yes	No	VNet routing, EU residency
Anthropic	Yes	No	Your ZDR agreement, not GitHub’s
Microsoft Foundry Local	No	Yes	On-device inference
Ollama	No	Yes	Requires tool calling + streaming
LM Studio	No	Yes	Same requirements as Ollama
Any OAI-compatible endpoint	Varies	Varies	vLLM, LiteLLM, OpenRouter

The local options — Ollama, LM Studio, Foundry Local — are the most interesting angle here. Copilot has always been cloud-first by design. BYOK makes it viable for air-gapped environments and zero-cost local inference, which is something neither Cursor nor Claude Code currently offer.

Setting It Up

In the Copilot app: Settings → Model Providers → Add provider. Select your provider type, enter the display name, base URL, and API key (if required). Done. The model appears in the picker.

For the CLI (which has had BYOK since April 7), setup is env-var driven:

# Local Ollama — no API key needed
export COPILOT_PROVIDER_BASE_URL=http://localhost:11434/v1
export COPILOT_MODEL=llama3.3:70b

# Verify Ollama is running first
curl http://localhost:11434/api/tags

One important CLI limitation: local BYOK models don’t appear in the /model picker inside the CLI. The session is bound at startup through env vars, not through in-session switching. For the app, the UI handles this cleanly — the provider’s models surface in the standard picker without workarounds.

Why This Actually Matters for Enterprise

The real story here is not model flexibility — it’s data governance. Before BYOK, Copilot inference ran through GitHub’s infrastructure under GitHub’s data handling terms. For healthcare organizations, financial services firms, and federal teams, that was a hard procurement blocker. BYOK lets regulated teams route inference through their Azure VNet, their existing Anthropic enterprise agreement, or a corporate gateway. The data handling terms are then governed by that provider contract, not GitHub’s.

This combines with GitHub’s April 2026 milestones — FedRAMP Moderate authorization and US/EU data residency — to form a credible enterprise compliance stack. BYOK is the final piece that gives organizations direct control over inference routing.

The Fable 5 suspension makes this concrete: GitHub-hosted models can disappear without warning when external factors intervene. With BYOK, if a model becomes unavailable through Copilot’s hosted roster, you route through your own API key and maintain continuity. That’s operational resilience, not just a feature checkbox.

What to Watch Out For

BYOK is in public preview, which means it is subject to change and GitHub has not committed to GA timelines. A few practical limitations:

Enterprise admin gate: On Copilot Business or Enterprise plans, admins must explicitly authorize BYOK in policy settings before users can enable it.
Local model requirements: Ollama and LM Studio models must support both function calling and streaming. Models that lack either will return an error — test your model before relying on it in production.
Compliance is on you: BYOK does not inherit Copilot’s Zero Data Retention guarantees. When you route through Anthropic or OpenAI directly, those providers’ retention and compliance terms apply. Review them before rolling out to regulated teams.
No auto token refresh: The SDK doesn’t refresh expired tokens automatically. Sessions fail silently — build retry logic if automating workflows on top of this.

Is It Ready to Use?

If you’re on a Copilot Business or Enterprise plan and your infosec team has been blocking Copilot over data routing concerns, BYOK plus data residency makes it worth revisiting that conversation. The compliance story is now credible.

If you’re an individual developer frustrated by token billing and want to route through your own OpenAI or Anthropic key, this works today — with the caveat that “public preview” means don’t anchor critical production automation to it yet.

If you want fully local inference with Ollama inside Copilot: it works, but test your specific model for function calling support before committing. The June 23 announcement is light on setup details — the official BYOK docs fill in the gaps.

ByteBot

I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.