
On December 22, 2025, OpenAI published a rare corporate admission: prompt injection attacks on AI browsers “may never be fully solved.” In a blog post about hardening its Atlas AI browser, OpenAI compared prompt injection to scams and social engineering—threats that persist despite ongoing defenses. This marks a significant concession from a major AI company that a fundamental security flaw in AI-powered browsers may be inherently unfixable.
What Prompt Injection Is and Why It Can’t Be Fixed
Prompt injection is a vulnerability where attackers hide malicious instructions in web pages or emails that AI agents process. Unlike traditional software bugs that can be patched, prompt injection exploits how large language models fundamentally work. LLMs interpret natural language as instructions and cannot reliably distinguish between trusted commands from users and malicious commands from external content.
OpenAI demonstrated this in their blog post. Their automated attacker placed a hidden malicious instruction in an email. When the AI agent scanned the inbox, it followed those instructions and sent a resignation message instead of drafting the requested out-of-office reply. The AI did the opposite of what the user asked.
Traditional web security mechanisms like Same-Origin Policy and CORS are completely useless here. The AI agent itself reads content from all domains as part of normal operation. Security researcher Charlie Eriksen from Aikido Security puts it bluntly: “We’re trying to retrofit one of the most security-sensitive pieces of consumer software with a technology that’s still probabilistic, opaque, and easy to steer in subtle ways.”
Autonomy × Access = Dangerous Risk Profile
Rami McCarthy from cybersecurity firm Wiz explains the risk equation: “A useful way to reason about risk in AI systems is autonomy multiplied by access.” AI browsers occupy a dangerous position—moderate autonomy combined with very high access to sensitive data like email, passwords, and payment information.
Real-world incidents in 2025 prove this risk is real. In March, a Fortune 500 financial services firm discovered their customer service AI had been leaking sensitive account data for weeks, costing millions in regulatory fines. In January, an enterprise RAG system was exploited to leak proprietary business intelligence and execute API calls with elevated privileges.
McCarthy concludes: “For most everyday use cases, agentic browsers don’t yet deliver enough value to justify their current risk profile. The risk is high given their access to sensitive data like email and payment information.” This explains why Gartner issued an advisory in December 2025 recommending organizations block AI browsers entirely.
Related: AI IDE Security Crisis: 30+ Flaws Expose Cursor, Copilot
Fighting Fire With Fire: OpenAI’s Automated Attacker
OpenAI’s approach to this unfixable problem is to build an “LLM-based automated attacker” trained with reinforcement learning to discover vulnerabilities before real attackers do. The company describes this bot as capable of steering agents into “sophisticated, long-horizon harmful workflows that unfold over tens (or even hundreds) of steps.”
During reasoning, the automated attacker proposes candidate injections and tests them in a simulator. When successful attacks are found, OpenAI uses them to train more resilient models. The company recently shipped a security update prompted by a new class of attacks uncovered through this internal red teaming.
However, this reveals a fundamental limitation. Since they can’t prevent prompt injection, they’re focusing on “continuous hardening” through faster patch cycles and large-scale testing. This is a perpetual arms race—the automated attacker finds new vulnerabilities, they patch, then attackers find different vulnerabilities. It’s a never-ending cycle.
Expert Consensus: Block AI Browsers Now
OpenAI isn’t alone in this assessment. The U.K.’s National Cyber Security Centre warned that prompt injection attacks “may never be totally mitigated.” Industry analysts at Gartner went further, issuing an advisory titled “Cybersecurity Must Block AI Browsers for Now,” observing that “default AI browser settings prioritize user experience over security.”
OWASP ranked prompt injection as the #1 AI security risk in its 2025 Top 10 for LLMs, noting it appears in over 73% of production AI deployments assessed during security audits. George Chalhoub from UCL Interaction Centre states: “There will always be some residual risks around prompt injections because that’s just the nature of systems that interpret natural language and execute actions.”
The consensus is clear across government agencies, industry analysts, and academic researchers. Prompt injection is a fundamental challenge, not a temporary bug. Organizations and users need to accept this risk or avoid AI browsers entirely.
Key Takeaways
- Prompt injection may never be fully solved—it exploits how LLMs fundamentally work, not a patchable bug
- AI browsers combine moderate autonomy with high access to sensitive data, creating a dangerous risk profile that outweighs current benefits for most use cases
- OpenAI’s defense strategy relies on continuous hardening through automated red teaming, but this is a perpetual arms race with no end in sight
- Industry experts recommend blocking AI browsers for now—Gartner, UK NCSC, and OWASP all warn against deployment in production environments
- Users should avoid giving AI agents access to email, banking, or sensitive systems until fundamental security improvements are made
This admission raises existential questions about AI browsers before they’ve even reached mainstream adoption. If users can’t trust AI to follow their instructions, the core value proposition collapses. The industry faces a choice: accept unfixable vulnerabilities and limit AI browser use to low-stakes tasks, or fundamentally redesign AI agent architectures.










