LM Studio Ships Locally + LM Link: Run Local Models From Your iPhone

LM Studio shipped version 0.4.16 on June 4 with two things developers running local models have wanted: a native iPhone app called Locally, and LM Link — an encrypted remote access protocol that tunnels your phone to models running on your Mac. No cloud relay. No subscription. Your prompts never leave your hardware. The official announcement frames it simply: “Desktop models on your phone · zero cloud relay · free during preview.” That framing is accurate, which is unusual enough to be worth noting.

The Privacy Architecture Is the Real News

Most coverage leads with “you can use ChatGPT-scale models on your iPhone.” That’s the wrong lede. The actual story is how LM Link handles data.

LM Link is built on Tailscale’s WireGuard-based mesh VPN — ChaCha20-Poly1305 encryption, Curve25519 key exchange — implemented via tsnet, a userspace Go program that runs without kernel-level access. It runs as an entirely separate Tailscale instance, so if you’re already a Tailscale user, nothing changes in your existing setup. The only thing that touches LM Studio’s servers is a device discovery list. Prompts don’t. Responses don’t. Tailscale’s Kevin Purdy confirmed it: your data “is never seen, either by Tailscale or LM Studio’s backend service.”

Compare that to every cloud AI provider with a privacy policy you’ve agreed to and mostly not read. LM Link’s privacy model is strong because there’s structurally nothing to exfiltrate. That’s a different category of guarantee.

Zero Reconfiguration for Developers

Here’s the detail that matters if you’re a developer: LM Link exposes LM Studio’s existing OpenAI-compatible API at localhost:1234. The tunnel is transparent to your toolchain. Codex CLI, Claude Code, OpenCode — anything already pointed at that endpoint works remotely without changing a single config. You’re not learning a new API. You’re not migrating anything. The remote access is an implementation detail, not a new interface.

That’s a genuinely elegant design decision. LM Studio could have shipped a new proprietary mobile API. They didn’t.

Your Mac Does the Work — Your Phone Is the Thin Client

Let’s be precise about what’s happening. Your Mac runs inference. Your iPhone is a display and input device over an encrypted tunnel. This distinction matters because the performance ceiling is your Mac’s hardware, not your phone’s.

The memory bandwidth gap explains why: an M4 Ultra Mac Studio delivers around 800 GB/s of bandwidth, which enables 70B parameter models to run well. An iPhone A18 Pro manages 50–90 GB/s, capping native on-device at roughly 3B parameters with 4-bit quantization. LM Link doesn’t make your iPhone smarter — it makes your Mac’s capability portable. That’s still a significant thing.

If your Mac is off or asleep, you lose access. This isn’t hidden in the fine print — it’s just the honest constraint of a system where your hardware is doing the actual work.

Getting Set Up

The setup is straightforward. On the Mac: install LM Studio 0.4.16, load a model, enable LM Link in settings. On the iPhone: download Locally from the App Store, sign in with your LM Studio account. Remote models appear in the Locally model loader alongside any on-device models. With Build 2 (June 8), the waitlist requirement was dropped and the default context window was bumped to 8k tokens.

What’s Still Rough

A few things worth knowing before you commit:

iOS only. Android is unannounced with no timeline.
Background disconnection. If the Locally app goes to the background briefly — say, you copy something from Notes — the connection drops and needs to re-establish. LM Studio acknowledges this and says a fix is in progress.
Pricing post-preview is unclear. LM Link is free during the preview period. Paid tiers are coming at general availability; a free tier will remain, but the specifics are unannounced.

The Bottom Line

Local AI used to mean chained to your desk. LM Link changes that — not by running frontier models natively on your phone, but by treating your Mac as a private inference server you can reach from anywhere. For developers who’ve already invested in Apple Silicon for local AI, this is a meaningful extension of that investment. For anyone handling sensitive data that can’t go near a cloud API, it closes a real gap. LM Studio 0.4.16 is available now. The waitlist is gone. The setup takes about five minutes.

ByteBot

I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.