Andrej Karpathy coined “vibe coding” in 2025 to describe AI-assisted development where you describe what you want, accept what the model generates, and iterate fast. One year later, he’s rebranding it as “agentic engineering.” Yesterday (May 6, 2026), Simon Willison published a warning that the line between casual AI usage and professional engineering is dangerously blurring. The timing is notable: Amazon just lost 6.3 million orders in a single outage tied to its AI coding assistant, forcing a 90-day safety reset, and open source maintainers are closing their gates because AI-generated spam is drowning them. Renaming “vibe coding” to “agentic engineering” doesn’t fix the fundamental problems; it just makes sloppy practices sound respectable.
Amazon Lost 6.3 Million Orders to AI Code
On March 5, 2026, Amazon’s North American marketplaces saw a 99% drop in orders, losing 6.3 million transactions. Two days earlier, another incident caused 120,000 lost orders and 1.6 million website errors. Both failures were directly linked to Amazon Q, the company’s AI coding assistant. This isn’t a theoretical concern about code quality—it’s millions of dollars evaporating because AI-generated code shipped without adequate review.
Amazon’s response reveals what the company actually thinks about trusting AI code. They implemented a 90-day safety reset across 335 critical systems, mandating two-person code reviews, formal documentation, and stricter automated checks before deployment. That’s the opposite of Karpathy’s vision where you’re “99% orchestrating agents” while “acting as oversight.” When the stakes are real, Amazon hit the brakes and demanded rigorous human review. If one of the world’s largest tech companies can’t safely deploy AI-generated code at scale, what makes anyone think smaller teams will fare better?
Maintainers Are Shutting Down Because of AI Spam
Three prominent open source maintainers have closed or severely restricted external contributions because of AI-generated submissions. Daniel Stenberg banned AI-generated security reports from cURL after they reached 20% of all incoming bug reports, most of them garbage that wasted maintainer time. Mitchell Hashimoto adopted a zero-tolerance policy for unattributed AI code in Ghostty, permanently banning contributors who submit low-quality AI output. Steve Ruiz closed all external pull requests to tldraw, explaining that “it’s easier to vibe code fixes than clean up AI-generated PRs.”
The pattern is consistent: all three maintainers use AI tools in their own development. The problem isn’t AI as a tool; it’s the flood of submissions from developers who don’t understand the code they’re contributing. They’re vibe coding contributions to projects they haven’t actually studied, generating superficial fixes that create more work than they solve. This is breaking open source sustainability; for more on this trend, see InfoQ’s coverage of the AI-driven open source crisis. When maintainers choose to close their projects rather than drown in AI spam, everyone who depends on open source software loses.
Related: AI Code Review Bottleneck: 11.4 Hours Reviewing vs 9.8 Writing
No AI Has a Professional Reputation to Lose
Simon Willison, Django co-creator and respected developer, admits the distinction between vibe coding and agentic engineering is blurring in his own practice, and it troubles him. His core insight cuts to the accountability gap: “Claude Code does not have a professional reputation! It can’t take accountability.” When AI generates code, who’s responsible for bugs, security flaws, or outages? The AI company disclaims liability. The developer who accepted the code often didn’t thoroughly review it. The organization that deployed it assumed the developer had.
Willison describes a pattern of normalization of deviance: each successful unreviewed deployment increases the temptation to trust AI output without verification. What starts as “this simple JSON API is safe to auto-accept” gradually expands to more critical code. He can “knock out a git repository with a hundred commits and a beautiful readme…in half an hour,” yet from the outside it looks identical to carefully crafted work. That is the evaluation problem: from the artifact alone, you can’t distinguish code someone wrote and understood from code someone merely accepted from an AI; the only way to tell is to review it yourself.
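To make the trap concrete, here is a minimal sketch of the kind of auto-accept policy that drifts this way. Everything in it is hypothetical (the patterns, the `can_auto_accept` helper); the point is that the allowlist has no natural stopping point.

```python
from pathlib import PurePosixPath

# Hypothetical allowlist of paths considered "safe" to auto-accept.
# This is where normalization of deviance lives: every incident-free
# deploy becomes an argument for adding one more pattern.
AUTO_ACCEPT_PATTERNS = [
    "docs/*.md",        # documentation only
    "fixtures/*.json",  # static test data
]

def can_auto_accept(changed_files: list[str]) -> bool:
    """Auto-accept only if every changed file matches a safe pattern."""
    return all(
        any(PurePosixPath(f).match(pattern) for pattern in AUTO_ACCEPT_PATTERNS)
        for f in changed_files
    )

print(can_auto_accept(["docs/intro.md"]))  # True: auto-accepted
print(can_auto_accept(["api/orders.py"]))  # False: human review required,
                                           # until someone adds "api/*.py"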
Same Process, Different Branding
Karpathy defines agentic engineering as “you are not writing code directly 99% of the time—you are orchestrating agents who do, while acting as oversight.” The terminology shift attempts to elevate the practice by emphasizing expertise and professional judgment. However, the fundamental process hasn’t changed: describe what you want, accept what comes back, iterate when it breaks. The problems that plagued vibe coding remain unsolved under the “agentic engineering” label.
Testing is theater. AI can generate tests, but who validates that the tests are comprehensive? Sixty-five percent of developers report that AI assistants “miss relevant context” (Qodo 2025 survey). Security remains a blind spot: research confirms that “zero agentic AI systems are secure against prompt injection attacks.” Quality degrades over time through context rot, as the AI carries stale information into later outputs, and cross-file changes create subtle misalignments that only surface in specific scenarios. If you’re 99% orchestrating agents, as Karpathy suggests, you’re prompt engineering, not software engineering. Engineering requires understanding what you’re building, not just describing what you want.
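Here is a contrived sketch of what “testing is theater” looks like in practice. The `apply_discount` function and both tests are hypothetical; the first test is the kind an assistant often produces, re-deriving the implementation so it can never fail.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    return price * (1 - percent / 100)

# Theater: this test runs green forever because the expected value
# is computed with the same formula as the implementation.
def test_apply_discount_mirrors_implementation():
    price, percent = 100.0, 10.0
    expected = price * (1 - percent / 100)  # tautology
    assert apply_discount(price, percent) == expected

# Substance: pinning down edge behavior requires understanding the
# domain, which is exactly what describe-and-accept skips.
def test_apply_discount_edge_cases():
    assert apply_discount(100.0, 0.0) == 100.0    # no discount
    assert apply_discount(100.0, 100.0) == 0.0    # full discount
    # The tautological test never raises this question: a discount
    # over 100% silently yields a negative price. Bug or spec?
    assert apply_discount(100.0, 150.0) == -50.0
```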
Related: Developer AI Trust Crisis: Stack Overflow 2025 Survey
What Engineering Actually Requires
Amazon’s emergency response shows what real engineering discipline looks like when you can’t trust code without verification. Two people review every change before deployment. Formal documentation and approval processes are mandatory. Automated checks become stricter, not looser. This is engineering: rigor, validation, accountability. It’s the opposite of accepting AI output because it “looks good” or passed the tests the AI also wrote.
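For illustration, here is what the two-person rule looks like when enforced as an automated gate rather than a convention. This sketch calls GitHub’s pull request reviews endpoint, which does exist, but the script itself is hypothetical and simplified: it ignores pagination, and in practice branch protection rules enforce required approvals natively.

```python
import os
import sys
import requests

def approving_reviewers(owner: str, repo: str, pr_number: int) -> set[str]:
    """Return users whose most recent review on the PR is an approval."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/reviews"
    headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
    resp = requests.get(url, headers=headers, timeout=10)
    resp.raise_for_status()
    latest_state: dict[str, str] = {}
    for review in resp.json():  # reviews arrive in chronological order
        latest_state[review["user"]["login"]] = review["state"]
    return {user for user, state in latest_state.items() if state == "APPROVED"}

if __name__ == "__main__":
    owner, repo, pr_number = sys.argv[1], sys.argv[2], int(sys.argv[3])
    approvers = approving_reviewers(owner, repo, pr_number)
    if len(approvers) < 2:  # the two-person rule
        sys.exit(f"Blocked: {len(approvers)} approval(s), 2 required.")
    print(f"OK to merge: approved by {', '.join(sorted(approvers))}")
```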
The industry can call it “agentic engineering” if that helps adoption, but terminology doesn’t create discipline. Either you’re applying rigorous review, comprehensive testing, and security validation, or you’re vibe coding and hoping for the best. The Amazon incidents, the open source maintainer exodus, and the unsolved security vulnerabilities all point to the same conclusion: changing the name doesn’t change the outcome. Real engineering with AI tools means treating AI output as untrusted code that requires human understanding and validation. Anything less is still vibe coding, no matter how professional it sounds.