
On August 26, 2026, every request to /v1/assistants, /v1/threads, and /v1/runs will return an error. OpenAI set this deadline a year ago and has explicitly stated there is no extension. If your application calls openai.beta.assistants or openai.beta.threads anywhere in its codebase, you have 86 days to fix that before your users start seeing errors.
This is not a deprecated-but-still-working situation. August 26 is a hard cutoff. The OpenAI Assistants API was always in beta — it never reached GA — and OpenAI is done maintaining it now that the Responses API has reached feature parity.
What Stops Working on August 26
The full list of endpoints that fail: all /v1/assistants endpoints, /v1/threads, /v1/threads/{id}/messages, /v1/threads/{id}/runs, and all run-step sub-endpoints. Every openai.beta.* namespace call in your code becomes broken.
If you’re using Azure OpenAI, the August 26 deadline is the same, but your migration path differs — you move to Azure AI Foundry Agents, not the Responses API. The same urgency applies; the destination is different. Check with your Azure account team for the specific transition path.
The New Architecture: How Responses API Works
The Responses API is not a renamed Assistants API endpoint. The mental model has changed. Here is the concept mapping you need:
| Assistants API | Responses API | Key Change |
|---|---|---|
| Assistant | Prompt | Dashboard-only creation, not API-creatable |
| Thread | Conversation | Durable ID, server-managed state |
| Run | Response | No polling loop — synchronous or streaming |
| Run Step | Item | Tool calls, outputs, messages unified |
| File Search | File Search | Same tool, works on Responses API |
| Code Interpreter | Code Interpreter | Available via Responses API, same usage |
The single most disruptive change: Prompts are created in the dashboard, not via API. If your code calls openai.beta.assistants.create() to provision a new Assistant per user, per tenant, or per session, that pattern no longer exists. You define Prompts once in the OpenAI platform dashboard and reference them by ID.
For simple use cases, the new API is genuinely cleaner. Compare the Assistants approach — create thread, add message, create run, poll until complete, retrieve messages — with the Responses equivalent:
response = client.responses.create(
model="gpt-5.5",
instructions="You are a helpful assistant.",
input="What’s the fastest way to reverse a string in Python?",
)
print(response.output_text)
No thread lifecycle. No polling. Just a response.
Your 7-Step Migration Checklist
- Audit your codebase. Search for
openai.beta.assistantsandopenai.beta.threads. Every match is a file you need to touch. Run this grep before planning your timeline — you need the scope first. - Recreate Assistants as Prompts. Navigate to Prompts in the OpenAI dashboard, recreate each Assistant’s instructions and tool configuration, and store the prompt ID in your config or environment variables.
- Update your generation calls. The simplest path: update the endpoint from
/v1/chat/completionsor Assistants runs to/v1/responses. For basic single-turn flows with no tools, this may be sufficient. - Decide on state management. Either (a) stateless — pass your conversation history as
inputitems on each call, or (b) stateful — use the Conversations API to create a persistent conversation ID that you reuse across sessions. - Plan your thread data migration. OpenAI will not provide tooling to migrate Threads to Conversations. Start creating Conversations for all new sessions now. Backfill old Thread data as needed — this needs to be in your migration plan, not an afterthought.
- Verify tool integrations. File search still works — vector stores are unchanged, with a slightly different attachment pattern. Function calling format has minor updates worth reviewing in the official Assistants migration guide.
- Shadow-test before full cutover. Run both APIs in parallel for a period. Compare outputs, latency, and costs. Gradual per-tenant rollout reduces risk significantly for multi-tenant applications.
Where It Gets Hard
For simple chatbot integrations, migration is a few hours of work. For production-grade systems, it is considerably more complex.
Multi-tenant applications that provisioned one Assistant per customer account need an architectural rethink. Prompts are global dashboard objects — you cannot create one programmatically per tenant. The standard approach is a single Prompt with tenant-specific instructions injected at runtime, or tenant configuration managed in your own database.
Applications with large Thread histories — support platforms, long-running agent workflows, multi-session user interactions — need a data migration strategy before the deadline. There is no bulk export tool for Threads; migration to Conversations requires manual transformation work. The OpenAI developer community has been direct about this: this is not a 1:1 migration. Production-grade integrations should budget 2–6 weeks of real engineering effort.
If your integration heavily uses run-status polling — checking every few seconds whether a run has completed — that pattern is gone. The Responses API is synchronous or streaming. Existing retry logic, run-status state machines, and run-completion webhooks all need to be removed or replaced.
What You Gain After Migration
The migration is real work, but the destination is genuinely better. New built-in tools on the Responses API include deep research, remote MCP server connections, and computer use — none of which existed on Assistants. File search adds metadata filtering. Latency is improved. Pricing is lower per token.
The mental model is also simpler once you are through the transition. No assistant objects to manage in the API, no thread lifecycle to babysit, no run-status polling loops to maintain. Send items in, get items back.
86 days sounds comfortable. It is not, for teams running complex integrations. Start the audit this week. The official Assistants migration guide is the authoritative reference. Start there, then scope your specific integration before committing to a timeline.













