
Bun’s 750K-Line AI Rewrite: Can Anthropic Pull It Off?

Bun's 750K-line AI-generated migration from Zig to Rust

Anthropic just released a 750,000-line AI-generated port of Bun—the JavaScript runtime powering its $1 billion Claude Code tool—from Zig to Rust. Autonomous AI agents rewrote the entire codebase this week following 300 porting rules. The Hacker News community exploded with 476 points and 451 comments. But here’s the kicker: Bun creator Jarred Sumner admits “there’s a very high chance all this code gets thrown out completely.” That’s not confidence in AI; that’s an experiment, and every developer should be watching to see whether it crashes or flies.

This is the first real test of whether AI can handle large-scale rewrites of production infrastructure. It’s not a toy project: Bun powers Claude Code, which hit $1 billion in run-rate revenue just six months after launch. The outcome will set a precedent for how the industry uses AI-driven development. And if even Anthropic—an AI company with massive resources—isn’t confident enough to commit to shipping AI-generated infrastructure code, what does that tell us about AI’s readiness?

Three Reasons Anthropic Had to Try This

Bun didn’t migrate from Zig to Rust because Rust is objectively “better.” Rather, Anthropic faced three existential problems that made staying with Zig unsustainable.

First, Bun maintains a forked version of the Zig compiler with 4x faster debug builds achieved through parallel code generation. That sounds great until you realize those improvements can’t be upstreamed to the official Zig project. Maintaining a forked compiler isn’t a feature—it’s technical debt that compounds over time. Indeed, every Zig update requires manual merging. Every bug fix has to be ported twice. It’s a maintenance nightmare that only gets worse.

Second, Zig enforces a strict no-AI contribution policy. No LLMs for issues, pull requests, or even bug tracker comments. The policy is categorical—maintainers can reject contributions purely on suspected LLM involvement without debating technical merit. The Zig Software Foundation’s rationale: “The time the Zig team spends reviewing work with LLM assistance does nothing to help them add new, confident, trustworthy contributors.” That’s a defensible position for Zig. However, for Anthropic—an AI company whose entire development culture centers on AI-assisted workflows—it’s an existential conflict. Anthropic can’t contribute Bun’s compiler improvements upstream because they were developed with AI assistance.

Third, Jarred Sumner is tired of fighting memory safety issues. He posted on X: “I am so tired of worrying about and spending lots of time fixing memory leaks and crashes and stability issues. It would be so nice if the language provided more powerful tools for preventing these things.” Zig’s philosophy treats memory safety as a runtime responsibility—you get tools like allocators and defer, but the compiler won’t stop you from returning pointers to stack-allocated memory. Meanwhile, Rust’s borrow checker catches entire classes of errors at compile time. For a production runtime serving millions of developers, that trade-off matters.
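A minimal Rust sketch (illustrative, not from the Bun codebase) of the difference Sumner is pointing at: the exact bug Zig’s compiler will let through, returning a reference to stack memory, is rejected by Rust’s borrow checker at compile time, while transferring ownership out compiles cleanly.

```rust
// Rejected by the borrow checker: `s` is dropped when the function
// returns, so the reference would dangle. In Zig, the equivalent
// code compiles and fails (or silently corrupts memory) at runtime.
//
// fn dangling() -> &String {
//     let s = String::from("bun");
//     &s // error[E0106]/E0515: cannot return reference to local variable
// }

// The safe alternative: move ownership of the value to the caller,
// so there is no reference left behind to dangle.
fn owned() -> String {
    let s = String::from("bun");
    s // ownership transfers out; the compiler guarantees validity
}

fn main() {
    let value = owned();
    println!("{}", value);
}
```

This is the class of error the borrow checker eliminates wholesale: the unsafe version never compiles, so it can never ship.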

Anthropic wasn’t choosing Rust because of language wars. They were trapped: unable to maintain a forked compiler, unable to contribute to Zig upstream, and frustrated by memory safety gaps. Therefore, the AI rewrite is an escape hatch, not a preference.

750K Lines You Can’t Actually Review

Here’s the uncomfortable truth about AI-generated code at scale: research from 2026 shows it’s significantly buggier and riskier than human-written code.

CodeRabbit’s State of AI vs Human Code Generation Report found that AI creates 1.7 times as many bugs as humans overall. Worse, AI creates 1.3 to 1.7 times more critical and major issues. Logic and correctness errors—the kind that cause silent data corruption or race conditions—are 75% more common in AI-generated code. Additionally, security vulnerabilities appear in 48% of AI-generated code. And “code smells”—subtle maintainability problems that compound into technical debt—make up more than 90% of issues found in AI code.

As one analysis put it: “Unless someone (preferably multiple someones) is combing through every single line of code on these commits, you could be creating tech debt at a scale not previously imagined.” That’s the core problem. 750,000 lines is impossible to properly audit manually. You can’t read that much code with the attention it deserves. Consequently, Anthropic is trusting that 99.8% test compatibility means the rewrite is sound. But tests don’t catch architecture problems, maintainability issues, or subtle bugs that only surface under production load.

Stack Overflow asked in January: “Are bugs and incidents inevitable with AI coding agents?” The research increasingly suggests yes, at least without massive human review overhead. We’ve reached a point where we can generate code faster than we can validate it. That’s the AI rewrite crisis in a nutshell.

The Real Debate: Exploration vs Production

The Hacker News community is deeply divided. Pragmatists argue Anthropic had no choice—maintaining a forked compiler while being locked out of upstream contributions isn’t sustainable. They point to the 99.8% test compatibility as proof that AI can handle system-level programming. Similarly, some see Rust’s memory safety as a legitimate technical improvement over Zig’s runtime-responsibility model.

Skeptics counter that AI can’t reliably handle system-level programming at this scale. They ask how anyone can meaningfully review 750,000 lines of code. Some see this as a corporate acquisition killing an open-source project’s identity—Bun was built in Zig as a deliberate choice, and Anthropic is rewriting that decision out of existence. Others note that if the code might get “thrown out completely,” that proves AI isn’t ready for production infrastructure.

Both sides are right. Anthropic genuinely faced unsustainable constraints with Zig. However, the solution—a 750,000-line AI-generated rewrite—is also genuinely risky. Jarred Sumner’s admission that “there’s a very high chance all this code gets thrown out completely” reveals the real strategy. This isn’t production deployment. It’s rapid prototyping at massive scale. Anthropic wanted to see what a Rust version of Bun would look like without spending years manually porting the codebase. They used AI to explore the alternative quickly. Whether they ship it depends on what they discover during that exploration.

That’s valuable even if the code gets discarded. Before AI, exploring a 750,000-line language migration would require months of planning and years of execution. With AI, you can prototype it in weeks and make an informed decision based on real data rather than speculation. The key insight: AI is a tool for rapid exploration, not a replacement for human engineering judgment.

The Precedent Being Set

One analysis framed this perfectly: “The real story is that AI has made large-scale rewrites feel newly reachable, and our review systems are not automatically ready for that world.” That’s the uncomfortable reality. We’ve entered an era where companies can generate infrastructure-scale codebases in weeks. However, we haven’t figured out how to validate them at that velocity.

If Anthropic ships the Rust rewrite to production and it works, expect every major tech company to start experimenting with AI-driven rewrites. Engineering teams will use AI to rapidly prototype migrations they’ve been deferring for years—C to Rust, Python 2 to 3, monoliths to microservices. The velocity will be intoxicating.

If Anthropic discards the 750,000 lines, it validates what skeptics have been saying: AI can generate plausible code at scale, but production-grade infrastructure still requires human architectural judgment. The bugs are too subtle, the edge cases too numerous, and the long-term maintainability risks too high.

Either outcome teaches us something critical about AI’s role in software development. Watch what Anthropic does next. If they ship it, ask how they validated 750,000 lines of AI code. If they scrap it, ask what they learned from the experiment. The answer will shape how the industry approaches AI-driven development for the next decade.

Bun’s Rust experiment matters because it’s testing the boundaries of what AI can do—not in isolation, but in production infrastructure serving millions of developers. That’s where theory meets reality. And reality has a way of exposing whether we’re building on solid ground or just generating code faster than we can think.

ByteBot
I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.
