
On August 26, 2026, every request your app makes to /v1/assistants, /v1/threads, or /v1/runs stops working. Not degrades. Not rate-limits. Fails. The OpenAI Assistants API goes dark on a fixed calendar date, the announcement has been live since August 2025, and if you are still running on it in production, you have 77 days to migrate to the Responses API before your app breaks.
What the Assistants API Was and Why So Many Built on It
OpenAI launched the Assistants API at DevDay in November 2023 alongside GPT-4 Turbo. The pitch was simple: build a stateful AI assistant without managing any of the conversation plumbing yourself. You created an Assistant, dropped messages into a Thread, fired a Run, polled until it completed, and retrieved the output. It was high-level, relatively intuitive, and handled file search and code execution out of the box. Millions of production apps — customer support bots, coding assistants, internal tools, document workflows — were built on it.
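The problem was the “beta” label, which most developers quietly ignored. It did not appear in the endpoint URLs themselves but in the required OpenAI-Beta request header and in the beta namespace of the official SDKs. OpenAI deprecated the entire product on August 26, 2025, giving one year of notice. That notice period ends in 77 days.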
This Is Not a Model Deprecation — It Is an Architecture Change
It is worth being precise here, because the migration scope is different from the model retirements you may have handled before. When GPT-4o was retired, you updated a model name string and moved on in an afternoon. This is not that.
The Assistants API was a stateful, asynchronous product. Runs were asynchronous processes that executed against Threads: you had to poll for completion, then retrieve messages separately. All of that is gone. The Responses API is a fundamentally different architecture: synchronous by default, stateless by default, with state management as an explicit opt-in. For a simple chatbot wrapper, expect one to four engineering weeks of migration work. For a multi-tenant production system with stored Thread history, custom Assistant configurations per customer, and complex tool orchestration, expect months. OpenAI is not providing an automated tool for migrating Threads to Conversations, so any history you need must be backfilled manually.
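To make the difference concrete, here is a minimal sketch of the two call patterns side by side. It assumes the method names of the official `openai` Python SDK (`client.beta.threads.*` for the old flow, `client.responses.create` for the new one); the function names `ask_legacy` and `ask_responses` are illustrative, and the client is passed in rather than constructed so the sketch makes no claims about your configuration.

```python
import time


def ask_legacy(client, assistant_id, thread_id, user_text, poll_seconds=1.0):
    """Assistants-style flow: add a message, start a Run, poll, then read messages."""
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=user_text
    )
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id
    )
    # The Run executes asynchronously, so the caller has to poll its status.
    while run.status in ("queued", "in_progress"):
        time.sleep(poll_seconds)
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)
    # Simplification: tool calls ("requires_action") are treated as failures here.
    if run.status != "completed":
        raise RuntimeError(f"run ended with status {run.status!r}")
    # Output is fetched in a separate call; the newest message is listed first.
    messages = client.beta.threads.messages.list(thread_id=thread_id)
    return messages.data[0].content[0].text.value


def ask_responses(client, model, instructions, user_text):
    """Responses-style flow: one call, with the output in the response body."""
    response = client.responses.create(
        model=model, instructions=instructions, input=user_text
    )
    return response.output_text
```

The second function has no loop and no second fetch, which is most of what "synchronous by default" means in practice.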
The Responses API: What You Gain
Here is the good news: the Responses API is genuinely better. The async polling loop is gone — you get output items directly in the response body. Execution is faster. And critically, you get capabilities the Assistants API never had:
- MCP tool calling — connect to any remote MCP server as a tool, opening up integrations that would have required custom function calling before
- Computer use — agents that can interact with browsers and desktop applications
- Deep Research — the o3-deep-research and o4-mini-deep-research models for multi-step web research with citations
- Simpler state management: pass previous_response_id to chain responses; OpenAI stores context server-side for 30 days
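For the MCP item in the list above, a remote server is attached as one more entry in the request's `tools` array. The sketch below builds that entry; the field names (`type`, `server_label`, `server_url`, `require_approval`) reflect my understanding of the Responses API's MCP tool shape, and the helper name `mcp_tool` is hypothetical.

```python
def mcp_tool(server_label, server_url, require_approval="always"):
    """Return a tools-array entry that points the model at a remote MCP server.

    require_approval="always" means each tool call from the server waits for an
    explicit approval; loosen it only for servers you trust.
    """
    return {
        "type": "mcp",
        "server_label": server_label,
        "server_url": server_url,
        "require_approval": require_approval,
    }
```

You would then pass `tools=[mcp_tool("docs", "https://example.com/mcp")]` to `client.responses.create(...)`, where the label and URL are placeholders for your own server.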
The Conversations API pairs with the Responses API when you need a durable persistent session object — closer to what Threads provided in the old model.
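The two state options can be sketched as follows, assuming the `openai` Python SDK's `client.responses.create` (with its `previous_response_id` and `conversation` parameters) and `client.conversations.create`. The function names are illustrative, and error handling is omitted.

```python
def chat_chained(client, model, turns):
    """Send each user turn, linking it to the previous response by id."""
    previous_id = None
    replies = []
    for user_text in turns:
        kwargs = {"model": model, "input": user_text}
        if previous_id is not None:
            # Lightweight chaining: the server looks up the earlier context.
            kwargs["previous_response_id"] = previous_id
        response = client.responses.create(**kwargs)
        previous_id = response.id
        replies.append(response.output_text)
    return replies


def chat_in_conversation(client, model, turns):
    """Create one durable Conversation and attach every response to it."""
    conversation = client.conversations.create()
    replies = []
    for user_text in turns:
        response = client.responses.create(
            model=model, input=user_text, conversation=conversation.id
        )
        replies.append(response.output_text)
    # Persist the conversation id wherever you used to store the Thread id.
    return conversation.id, replies
```

Chaining needs only the last response id, while a Conversation gives you a single long-lived id to store, much as a Thread id did.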
The Conceptual Map
The entity mapping between the two APIs is not one-to-one, but it is navigable:
| Assistants API | Responses API / Conversations API |
|---|---|
| Assistant | Prompt (system instructions per request) |
| Thread | Conversation |
| Run | Response |
| Run Steps | Items |
| Async poll loop | Synchronous response body |
| File Search Tool | File Search tool (same interface) |
| Code Interpreter | Code Execution tool |
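As a rough illustration of the first row of the table, the sketch below turns a stored Assistant configuration (taken here as a plain dict) into keyword arguments you could pass on each Responses request. It assumes that Responses function tools are flat, with `name` and `parameters` at the top level rather than nested under `function`. Hosted tools are deliberately not converted, only flagged, because they need configuration that this sketch cannot infer. The function name is hypothetical.

```python
def assistant_to_request_defaults(assistant):
    """Map an Assistant config dict to per-request kwargs plus a review list."""
    tools, needs_review = [], []
    for tool in assistant.get("tools", []):
        if tool.get("type") == "function":
            fn = tool["function"]
            # Flatten the nested "function" object used by the Assistants API.
            tools.append({
                "type": "function",
                "name": fn["name"],
                "description": fn.get("description", ""),
                "parameters": fn.get(
                    "parameters", {"type": "object", "properties": {}}
                ),
            })
        else:
            # Hosted tools need their own settings, so leave them to a human.
            needs_review.append(tool.get("type"))
    defaults = {
        "model": assistant["model"],
        "instructions": assistant.get("instructions") or "",
    }
    if tools:
        defaults["tools"] = tools
    return defaults, needs_review
```

The returned `defaults` can be splatted into a request (`client.responses.create(**defaults, input=...)`), and `needs_review` tells you which tools still need manual mapping.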
Note one important billing detail: even when you use previous_response_id to chain responses, all previous input tokens are billed as fresh input tokens on each call. If your Assistants integration relied on long Thread histories being handled efficiently, model the token costs carefully before assuming 1:1 parity.
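A back-of-the-envelope model helps here. The sketch below assumes the simplest reading of that billing rule: on every call, all earlier inputs and outputs in the chain count again as input tokens. It ignores system instructions, tool tokens, and any caching discounts, so treat the numbers as an upper-bound illustration rather than a price quote.

```python
def billed_input_tokens(turns):
    """Return billed input tokens per call for a chained conversation.

    turns: list of (new_input_tokens, output_tokens) tuples, in call order.
    Assumes the whole prior chain is billed as input on each call.
    """
    billed = []
    history = 0  # tokens accumulated from earlier turns (inputs and outputs)
    for new_input, output in turns:
        billed.append(history + new_input)
        history += new_input + output
    return billed


# Ten turns, each with 200 new input tokens and 300 output tokens.
per_call = billed_input_tokens([(200, 300)] * 10)
total_billed = sum(per_call)      # 24500
total_new_input = 200 * 10        # 2000
```

Under these assumptions the tenth call alone bills 4,700 input tokens, and the conversation as a whole bills 24,500 input tokens for 2,000 tokens of genuinely new input. Growth is quadratic in the number of turns, which is why long histories deserve a cost model.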
How to Start the Migration
If you have not started, the practical steps are:
- Audit your Assistants API surface. Enumerate every Assistant configuration, every active Thread, and every place in your codebase that calls an Assistants endpoint. The scope of this audit determines your timeline.
- New traffic first. Stop creating new Threads immediately. Route all new conversations to the Responses API using the Conversations API for state. Do not migrate old Threads — build forward and backfill selectively if conversation history matters.
- Migrate Assistant configurations to system prompts. Each Assistant you had is now a set of system instructions you pass per request or store in a Prompt object. This is usually the fastest part of migration.
- Rework tool integrations. Code Interpreter maps cleanly to Code Execution. File Search is similar. If you used custom function calling, test carefully — the Responses API tool call lifecycle is different from the Assistants Run model.
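The audit step above can start with something as simple as a text search. The sketch below walks a source tree and reports lines that look like Assistants usage. The patterns are assumptions about how that usage typically appears (raw endpoint paths and the SDK's beta namespace), so extend them for your own wrappers and languages.

```python
import re
from pathlib import Path

# Assumed signatures of Assistants usage: endpoint paths and SDK beta namespace.
LEGACY_PATTERNS = re.compile(
    r"/v1/(?:assistants|threads|runs)\b|\bbeta\.(?:assistants|threads)\b"
)


def find_legacy_calls(root, suffixes=(".py", ".ts", ".js")):
    """Return (path, line_number, line) for every suspicious line under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if LEGACY_PATTERNS.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Running `find_legacy_calls("src")` gives you a first inventory of call sites. It will not find stored Assistant or Thread ids in your database, which need their own query.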
OpenAI has an official migration guide that is the right starting point. The OpenAI Deprecations page lists every endpoint and model with its retirement date — worth bookmarking if you run anything on OpenAI’s platform.
The Harder Conversation About Platform Risk
The Assistants API launched in November 2023 as a “beta” and shut down less than three years later. GPT-4 ran for roughly three years before API retirement. GPT-4.5 lasted five months. OpenAI announced the wind-down of the fine-tuning API in May 2026. Reusable Prompts and Agent Builder were deprecated in June. OpenAI has retired more APIs and models in 2026 than in all prior years combined, and the pattern is accelerating.
This is not a criticism of OpenAI specifically — every platform that ships at high velocity creates this kind of deprecation debt. But if your production application has deep, untested dependencies on any single vendor’s “beta” API, you are accepting platform risk that most engineering teams have not priced into their maintenance estimates. The right architectural response is abstraction: frameworks like the Vercel AI SDK, LangChain, or LlamaIndex sit between your application logic and the provider API, which means a provider-level deprecation becomes a library update rather than a migration project.
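The frameworks named above do this at scale, but the core idea fits in a few lines. The sketch below is a hand-rolled illustration, not how any of those libraries is implemented: application code depends on a small interface, and one adapter per provider hides the vendor call. All class and function names are hypothetical, and the adapter assumes an OpenAI-style client exposing `responses.create`.

```python
from typing import Protocol


class ChatBackend(Protocol):
    """The only surface the application is allowed to depend on."""

    def reply(self, instructions: str, user_text: str) -> str: ...


class ResponsesBackend:
    """Adapter around an OpenAI-style client that exposes responses.create."""

    def __init__(self, client, model):
        self._client = client
        self._model = model

    def reply(self, instructions, user_text):
        response = self._client.responses.create(
            model=self._model, instructions=instructions, input=user_text
        )
        return response.output_text


def answer_ticket(backend: ChatBackend, ticket_text: str) -> str:
    """Application logic: knows about ChatBackend, not about any vendor SDK."""
    return backend.reply("You are a support agent. Be concise.", ticket_text)
```

When a provider retires an endpoint, only the adapter changes. `answer_ticket` and everything built on it stay as they are, and a fake backend makes the application logic testable without network access.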
August 26 is the immediate problem. The underlying pattern is the one worth solving.













