OpenAI warned on December 11, 2025 that its next-generation AI models are likely to pose a “high” cybersecurity risk, meaning they could develop working zero-day exploits and assist with complex industrial intrusions. This is the first time a major AI company has publicly admitted its models will reach capability levels previously deemed too dangerous for public release. “High” is the second-highest level in OpenAI’s Preparedness Framework, one step below “critical,” where models would be considered unsafe to deploy. And OpenAI isn’t saying this might happen: it is “planning as though each new model could reach high levels of cybersecurity capability.”
From 27% to 76% in 90 Days
The warning comes with concrete evidence: AI cybersecurity capability jumped 181% in three months. GPT-5 scored 27% on capture-the-flag (CTF) challenges in August 2025; GPT-5.1-Codex-Max hit 76% in November. CTF challenges simulate real-world vulnerability discovery and exploitation, and a 76% success rate puts these models on par with experienced penetration testers.
If this trajectory continues, models will surpass expert human performance within months. That democratizes advanced hacking: anyone with a ChatGPT subscription could potentially launch sophisticated attacks, and the barrier to entry for cyberattacks drops to near zero.
AI Doesn’t Sleep, and That’s the Problem
Beyond raw capability, there’s a more insidious threat: sustained autonomous operation. Fouad Matin, an OpenAI researcher, identified “the model’s ability to work for extended periods of time” as a key concern. Human attackers, by contrast, need sleep, make mistakes when fatigued, and work within business hours.
Multi-stage attacks that would take human teams weeks could happen in hours, running 24/7 without breaks. That changes the threat model fundamentally: defenders can’t match autonomous attacks with human SOC teams working in shifts. The implication is clear: AI-powered defense becomes mandatory to counter AI-powered offense. The arms race has officially begun.
Meet Aardvark: The AI Fighting AI
OpenAI’s answer to this threat? More AI, naturally. Announced on October 31, weeks before the cybersecurity warning, Aardvark is OpenAI’s GPT-5-powered security tool that scans codebases for vulnerabilities and proposes patches automatically. OpenAI reports a 92% detection rate on benchmark repositories, and the tool has already discovered 10+ CVEs in open-source projects.
The tool monitors code commits, validates vulnerabilities by attempting exploitation in a sandbox environment, then generates patches through OpenAI Codex integration. Currently, it’s in private beta with select partners.
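To make that workflow concrete, here is a minimal sketch of what a pipeline of this shape could look like: watch a commit, confirm each candidate finding by actually triggering it in isolation, and only then draft a patch for human review. Every name and stubbed-out stage below is a hypothetical placeholder, not Aardvark’s actual API.

```python
# Hypothetical sketch of a scan -> validate -> patch pipeline. These names are
# illustrative only and do not correspond to Aardvark's real interface.
from dataclasses import dataclass


@dataclass
class Finding:
    file: str
    description: str
    proof_of_concept: str  # candidate exploit to replay in a sandbox


def scan_commit(diff: str) -> list[Finding]:
    """Stage 1: flag candidate vulnerabilities in a commit diff.

    A real system would call an LLM or static analyzer here.
    """
    return []  # placeholder


def validate_in_sandbox(finding: Finding) -> bool:
    """Stage 2: replay the proof of concept in an isolated environment.

    Only findings whose exploit actually triggers are kept, filtering out
    false positives before a human reviews anything.
    """
    return False  # placeholder: run the PoC in a container and check the result


def propose_patch(finding: Finding) -> str:
    """Stage 3: draft a candidate patch for human review."""
    return f"# TODO: fix {finding.description} in {finding.file}"  # placeholder


def handle_commit(diff: str) -> list[str]:
    """Run all three stages for one commit and return proposed patches."""
    confirmed = [f for f in scan_commit(diff) if validate_in_sandbox(f)]
    return [propose_patch(f) for f in confirmed]
```

The key design point is the middle stage: requiring an exploit to actually fire in a sandbox before a patch is ever proposed is what separates this approach from a noisy static scanner.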
Here’s the irony: the same company building models capable of developing zero-day exploits is selling you the tool to defend against them. OpenAI is both the arsonist and the firefighter, creating a convenient vendor lock-in where you need its defensive AI to counter the threats enabled by its offensive-capable models. This isn’t a solution; it’s an arms race escalation dressed up as safety.
Self-Regulation Is Safety Theater
OpenAI’s defensive strategy includes the Frontier Risk Council, an advisory group of cyber defenders and security practitioners, plus access controls and tiered permissions for qualifying users. Nevertheless, none of this addresses the fundamental issue.
The council is advisory only: it can recommend, but OpenAI makes the final decisions on model releases. There’s no veto power, which creates a real risk of rubber-stamping rather than genuine oversight. Access controls sound promising until you remember they can be bypassed through prompt injection and jailbreaking techniques. As George Chalhoub from UCL notes: “There will always be some residual risks around prompt injections because that’s just the nature of systems that interpret natural language and execute actions.”
More fundamentally, tech industry self-regulation has failed repeatedly. Social media failed to self-regulate misinformation and its impact on teen mental health. Privacy protections only materialized after GDPR forced compliance. Crypto promised self-regulation and delivered scams, rug pulls, and exchange collapses. Every time, the industry asks for trust and delivers safety theater.
OpenAI’s admission is a step forward; transparency beats secrecy. But self-regulation isn’t enough. We need external oversight with independent safety audits before high-risk model releases, mandatory public disclosure of model capability levels, and an actual regulatory framework with binding rules instead of voluntary pledges. And maybe some AI capabilities shouldn’t be in public models at all, even if that slows “progress.”
What Developers Should Do Now
AI-assisted attacks are either here or imminent, and security teams can’t wait for regulations. First, update your threat models immediately: assume attackers have AI-powered reconnaissance and exploitation tools. Traditional defenses against script kiddies won’t cut it against autonomous AI agents.
Second, don’t rely solely on AI to fix AI vulnerabilities. Tools like Aardvark are helpful, but human security review remains essential, and AI-generated patches need validation; they could contain subtle backdoors. Review your organization’s AI tool usage policies as well: Who has access to which models? Is GitHub Copilot being used for security-critical code? What data is being sent to AI APIs? A simple inventory pass, sketched below, is one way to start answering that last question.
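The sketch below walks a repository and flags files that reference well-known AI API hostnames. The endpoint list and file extensions are illustrative assumptions, not an exhaustive inventory; a real audit would also cover SDK usage, internal proxies, and network egress logs.

```python
# Minimal sketch: find files that reference common AI API endpoints so you can
# see which code talks to which model provider. The endpoint pattern and file
# suffixes are illustrative assumptions; extend both for your environment.
import re
from pathlib import Path

AI_ENDPOINTS = re.compile(
    r"api\.openai\.com|api\.anthropic\.com|generativelanguage\.googleapis\.com"
)
CODE_SUFFIXES = {".py", ".js", ".ts", ".go", ".java", ".rb", ".yaml", ".yml", ".env"}


def find_ai_api_references(repo_root: str) -> dict[str, list[int]]:
    """Return {file path: [line numbers]} for lines mentioning an AI endpoint."""
    hits: dict[str, list[int]] = {}
    for path in Path(repo_root).rglob("*"):
        if not path.is_file() or path.suffix not in CODE_SUFFIXES:
            continue
        try:
            lines = path.read_text(errors="ignore").splitlines()
        except OSError:
            continue  # unreadable file; skip it
        matches = [i for i, line in enumerate(lines, 1) if AI_ENDPOINTS.search(line)]
        if matches:
            hits[str(path)] = matches
    return hits


if __name__ == "__main__":
    for file, line_numbers in find_ai_api_references(".").items():
        print(f"{file}: lines {line_numbers}")
```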
Third, prepare your defenses for AI-generated attacks with increased monitoring for automated patterns, rate limiting, anomaly detection, and behavioral analysis; a minimal timing-based signal is sketched below. Finally, apply zero-trust architecture and least-privilege principles more strictly than ever: segment networks to contain AI-assisted lateral movement and limit AI model access to sensitive codebases.
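As one concrete example of an “automated pattern,” the sketch below flags clients whose request timing is both very fast and unusually regular, which is more typical of a scripted agent than a human. The thresholds are illustrative assumptions, not tuned values, and the check is meant as one input to a broader anomaly detection pipeline, not a standalone control.

```python
# Minimal timing-based signal for spotting scripted clients: a sustained high
# request rate combined with very low inter-arrival jitter. Thresholds below
# are illustrative assumptions only.
from collections import defaultdict, deque
from statistics import pstdev

WINDOW = 50        # number of recent request timestamps kept per client
MAX_RATE = 5.0     # sustained requests/second considered suspicious
MAX_JITTER = 0.05  # seconds; near-constant gaps between requests look scripted

_history: dict[str, deque] = defaultdict(lambda: deque(maxlen=WINDOW))


def record_request(client_id: str, timestamp: float) -> bool:
    """Record a request; return True when the client looks automated."""
    times = _history[client_id]
    times.append(timestamp)
    if len(times) < WINDOW:
        return False  # not enough history to judge yet
    ts = list(times)
    span = ts[-1] - ts[0]
    if span <= 0:
        return True  # a full window in the same instant is clearly scripted
    rate = (len(ts) - 1) / span
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    return rate > MAX_RATE and pstdev(gaps) < MAX_JITTER
```

In practice you would call record_request from your API gateway or WAF hook and treat a positive result as one signal among many, alongside IP reputation, authentication anomalies, and payload analysis.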
This is the new normal. The AI safety debate is no longer about whether models will reach dangerous capability levels; OpenAI just confirmed they will. The question is who decides when we’ve gone too far. Right now, the answer is “the same companies building the models.” That should concern everyone.