
Apple just shipped a 3-billion-parameter language model to every Apple Intelligence device — and handed developers the keys. The Foundation Models framework, available now in the iOS 26 beta, gives Swift apps direct access to the same on-device model powering Apple Intelligence. No API key. No cloud roundtrip. No per-token bill. With WWDC 2026 opening June 8, this is the framework to learn before Apple locks in the API.
What This Is — and What It Isn’t
Set expectations early: the Foundation Models framework is not a drop-in replacement for ChatGPT or Claude. The ~3B parameter on-device model specializes in three things: text understanding, structured output generation, and tool calling. It will not reason through math problems, search the web, or match a frontier model’s breadth of knowledge. It will hallucinate. It has a limited context window.
What it does well: classify text, extract structured data from messy input, summarize notes, power autocomplete, and call your app’s Swift functions based on conversational context. The developers who will get the most from this are the ones who treat it as a structured output engine — not a general-purpose assistant.
The Basic API: Ten Lines to Your First Response
The entry point is LanguageModelSession. Create one, call respond(to:), get a result:
import FoundationModels
let session = LanguageModelSession()
let response = try await session.respond(to: "Summarize: \(noteText)")
print(response.content)
Sessions are stateful — each respond call appends to an internal transcript, so multi-turn conversations work without manual history management.
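A minimal sketch of a two-turn exchange, assuming the `instructions:` initializer parameter as it appears in recent SDKs. The helper name `summarizeThenShorten` and the prompts are illustrative, not from the framework.

```swift
import FoundationModels

// Hypothetical helper: summarize a note, then refine the summary in a second turn.
func summarizeThenShorten(_ noteText: String) async throws -> String {
    // Instructions set the session's standing behavior; prompts carry each turn.
    let session = LanguageModelSession(
        instructions: "You summarize personal notes in plain language."
    )

    // Turn 1: the prompt and the reply are both recorded in the session's transcript.
    let first = try await session.respond(to: "Summarize: \(noteText)")
    print(first.content)

    // Turn 2: "that" resolves against the earlier turn because the session keeps history.
    let second = try await session.respond(to: "Shorten that to one sentence.")
    return second.content
}
```

Reusing one session is what makes the follow-up prompt work. A new `LanguageModelSession()` would start with an empty transcript.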
One thing you must handle before shipping: not every device has Apple Intelligence enabled. The check is a single guard:
guard SystemLanguageModel.default.isAvailable else {
    // Gracefully degrade for unsupported devices
    return
}
Skip this and requests will fail on anything older than an iPhone 15 Pro or an M1 Mac, and on newer devices where Apple Intelligence is turned off. Handle it.
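When the app needs to tell the user why the feature is missing, the framework also reports a reason. The sketch below assumes the `availability` property on `SystemLanguageModel` and its unavailable reasons as documented in recent SDKs. The function name and the messages are made up for illustration.

```swift
import FoundationModels

// Returns a user-facing message when the model cannot be used, or nil when it can.
func unavailableMessage() -> String? {
    switch SystemLanguageModel.default.availability {
    case .available:
        return nil
    case .unavailable(.deviceNotEligible):
        return "This device does not support Apple Intelligence."
    case .unavailable(.appleIntelligenceNotEnabled):
        return "Turn on Apple Intelligence in Settings to use this feature."
    case .unavailable(.modelNotReady):
        return "The model is still downloading. Try again later."
    case .unavailable:
        // Catch-all for any other reason the SDK reports.
        return "The on-device model is unavailable."
    }
}
```

A view can call this once on appearance and hide or replace the AI-backed control when the result is non-nil.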
The @Generable Macro: Typed Output, Not String Parsing
The genuinely useful feature is guided generation. Apply the @Generable macro to any Swift struct and the framework uses constrained decoding to force the model’s output into your type — at the token level, not after the fact. You get a typed Swift value back, not a JSON blob you have to parse and hope validates.
Pair @Generable with @Guide to give the model field-level instructions:
@Generable
struct BugReport {
    @Guide(description: "One-line summary of the bug")
    let title: String
    @Guide(description: "Severity: critical, high, medium, or low")
    let severity: String
    @Guide(.count(3))
    let reproductionSteps: [String]
}
let session = LanguageModelSession()
let response = try await session.respond(
    to: "Parse this support ticket: \(rawTicketText)",
    generating: BugReport.self
)
// The typed value is on the response's content property
let report = response.content
// report.title is always a typed String
// report.reproductionSteps always has exactly 3 elements
This eliminates an entire class of “the model returned invalid JSON again” bugs. If you’ve ever written a do { try JSONDecoder().decode(...) } catch block to handle model hallucinations, you know exactly what problem this solves.
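A field described only in prose, such as a severity string, can still come back with an unexpected value. One possible tightening, sketched here with hypothetical `Severity` and `TriagedBug` types, is to model the field as a `@Generable` enum so that constrained decoding can only pick one of the listed cases.

```swift
import FoundationModels

// Hypothetical variant of the severity field: an enum limits the model to these four cases.
@Generable
enum Severity {
    case critical
    case high
    case medium
    case low
}

@Generable
struct TriagedBug {
    @Guide(description: "One-line summary of the bug")
    let title: String
    let severity: Severity
}

func triage(_ rawTicketText: String) async throws -> TriagedBug {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Parse this support ticket: \(rawTicketText)",
        generating: TriagedBug.self
    )
    // The typed value lives on the response's `content` property.
    return response.content
}
```

Downstream code can then `switch` over `severity` exhaustively instead of comparing strings.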
Streaming works the same way. streamResponse(to:generating:) returns an async sequence where each element is a partially populated struct — useful for animating fields as they generate rather than waiting for the full response.
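A rough sketch of consuming such a stream follows. The element shape has changed across SDK versions, so treat the `snapshot.content` access as an assumption to check against your Xcode version. `NoteSummary` and `streamSummary` are illustrative names.

```swift
import FoundationModels

@Generable
struct NoteSummary {
    @Guide(description: "Short title for the note")
    let title: String
    @Guide(description: "Key points from the note")
    let keyPoints: [String]
}

func streamSummary(of noteText: String) async throws {
    let session = LanguageModelSession()
    let stream = session.streamResponse(
        to: "Summarize: \(noteText)",
        generating: NoteSummary.self
    )
    for try await snapshot in stream {
        // Assumption: each element exposes the partial value via `content`.
        // In early betas the element itself was the partially generated value.
        let partial = snapshot.content
        // Properties of the partial value are optional until the model fills them in.
        if let title = partial.title { print("Title so far: \(title)") }
        if let points = partial.keyPoints { print("Points so far: \(points.count)") }
    }
}
```

In a SwiftUI app the partial value would typically be assigned to observable state rather than printed.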
Tool Calling Bridges the Intelligence Gap
The model can call Swift functions you define. Implement the Tool protocol, pass your tools to the session, and the model decides when to invoke them based on context. Give it access to your database, calendar, user preferences, or any app data — and the framework handles sequential and parallel call graphs automatically.
This turns the Foundation Models framework from a neat structured output engine into genuinely useful in-app AI. The model cannot search the web, but it can query your data and reason over it. That distinction matters for what you build.
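A minimal sketch of a tool, assuming the `Tool` protocol shape in recent SDKs, where `call(arguments:)` may return a `String`. Earlier betas required a `ToolOutput` wrapper instead. `OpenTasksTool`, its in-memory data, and `askAboutTasks` are hypothetical stand-ins for real app data.

```swift
import FoundationModels

// Hypothetical tool: looks up open tasks from app data. The in-memory dictionary
// stands in for a real database or calendar query.
struct OpenTasksTool: Tool {
    let name = "openTasks"
    let description = "Returns the user's open tasks for a given project."

    @Generable
    struct Arguments {
        @Guide(description: "The project name to look up")
        var project: String
    }

    let tasksByProject: [String: [String]]

    // Assumption: `call` can return any prompt-representable value such as String.
    func call(arguments: Arguments) async throws -> String {
        let tasks = tasksByProject[arguments.project] ?? []
        return tasks.isEmpty ? "No open tasks." : tasks.joined(separator: "\n")
    }
}

func askAboutTasks() async throws -> String {
    let tool = OpenTasksTool(tasksByProject: ["Website": ["Fix login bug", "Update footer"]])
    let session = LanguageModelSession(
        tools: [tool],
        instructions: "Answer questions about the user's projects. Use tools to look up tasks."
    )
    // The model decides whether to call the tool before answering.
    let response = try await session.respond(to: "What is still open on the Website project?")
    return response.content
}
```

The arguments type is itself `@Generable`, so the model's tool calls go through the same constrained decoding as any other structured output.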
The Case for This vs. Calling OpenAI
The economics of consumer AI apps are brutal. Call a cloud API on every user interaction and you’re running a subsidy program. The Foundation Models framework changes the math:
| | Foundation Models | OpenAI API |
|---|---|---|
| Cost per inference | $0 (on-device) | Per token |
| Latency | No network round trip | 200–2,000ms |
| Privacy | Never leaves device | Sent to cloud |
| Distribution | Ships with iOS 26 | User needs account |
| Best use | Structured extraction | Complex reasoning |
The tradeoff is capability — frontier models win on complex reasoning and broad knowledge. But for structured extraction, classification, and in-app assistance tasks, the ~3B on-device model is sufficient. And the economics become decisive at consumer scale.
Beta Caveats: Build Now, Ship After WWDC
The API changes between betas. LanguageModelSession’s surface has shifted across iOS 26 betas, so code that compiled last week may not compile today. Build familiarity with the patterns now, but wait for the post-WWDC GM release before shipping to production.
Other constraints: text-only input for now, Apple’s Acceptable Use Requirements govern what the model will generate, and nine languages are supported at launch with more coming by year end. The model requires Apple Intelligence — iPhone 15 Pro or newer, or M1+ Mac/iPad.
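Two of these limits surface as thrown errors at runtime: the context window and the content guardrails. The sketch below assumes the `LanguageModelSession.GenerationError` cases named in recent SDKs. The wrapper function and its fallback policy are illustrative choices, not framework behavior.

```swift
import FoundationModels

// Hypothetical wrapper that maps two generation errors to fallback behavior.
func summarize(_ noteText: String, using session: LanguageModelSession) async -> String? {
    do {
        return try await session.respond(to: "Summarize: \(noteText)").content
    } catch LanguageModelSession.GenerationError.exceededContextWindowSize {
        // The transcript no longer fits; retry once in a fresh session with no history.
        let fresh = LanguageModelSession()
        return try? await fresh.respond(to: "Summarize: \(noteText)").content
    } catch LanguageModelSession.GenerationError.guardrailViolation {
        // The prompt or output tripped the safety guardrails; return nothing.
        return nil
    } catch {
        // Any other failure (for example an unsupported language) also degrades to nil.
        return nil
    }
}
```

Returning `nil` keeps the caller's UI logic simple. An app that needs to explain the failure could return a `Result` or rethrow instead.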
Get Started Before June 8
WWDC 2026 kicks off June 8. Apple will finalize the API, add capabilities, and ship session videos that will shape how the community builds with this framework. Getting hands-on now means you’ll ask better questions during the sessions and ship faster afterward.
Install Xcode 26 beta, enable Apple Intelligence on a compatible device, and build something with it. The official Foundation Models documentation is solid. The “Meet the Foundation Models framework” WWDC25 session covers the fundamentals in 25 minutes. And Rudrank Riyam’s example repo on GitHub has working code across multiple use cases, which makes it a better starting point than a blank Xcode project.
The on-device AI era for iOS is here. The question is whether you’re building on it before WWDC or after.













