Pwn2Own Berlin 2026: AI Coding Tools Were Hacked



Pwn2Own Berlin 2026 wrapped its second day Friday, and if you use Claude Code, Cursor, OpenAI Codex, or LiteLLM, you should know that your tools were on the target list — and they fell. This year marked the first time Pwn2Own introduced dedicated AI tool categories. The event paid out $908,750 for 39 unique zero-days across two days. It also hit a hard submission cap for the first time in its 19-year history. The AI categories are why both things happened.

A New Category of Target

Pwn2Own Berlin 2026, held at OffensiveCon May 14–16, added four AI-specific competition categories: AI Databases, Coding Agents, Local Inference, and NVIDIA. Targets included Claude Code, OpenAI Codex, Cursor, GitHub Copilot, Ollama, LM Studio, LiteLLM, NVIDIA Megatron Bridge, and Chroma. The prize pool topped $1,000,000.

The rules for the Coding Agents category are worth reading carefully. A successful entry must interact with a contestant-controlled resource — a web page, a repository, a media file — to exploit a vulnerability. The attack vector must represent a “common coding agent use case.” In other words, the exploits are not exotic. They look like your normal workflow.

What Got Exploited

Day one set a blistering pace. Compass Security used a single CWE-150 bug (improper neutralization of escape, meta, or control sequences) to exploit OpenAI Codex for $40,000. Doyensec also succeeded against Codex, but the bug was already known to the vendor, so the team earned $10,000 under collision rules. Researcher k3vg3n chained SSRF and code injection to take down LiteLLM for another $40,000. Ikotas Labs exploited an overly permissive allow-list in NVIDIA Megatron Bridge for $20,000. A researcher named haehae collected $40,000 in total for a second Megatron Bridge zero-day and a separate exploit in the Chroma vector database. Viettel Cyber Security successfully targeted Claude Code for $20,000, also a collision, meaning Anthropic was already aware of the bug and working on a patch.

Day two added Cursor to the casualty list. Viettel Cyber Security exploited it for $30,000, and Compass Security followed with a second successful entry against Cursor for $15,000. Sina Kheirkhah from Summoning Team demonstrated a fresh OpenAI Codex zero-day worth $20,000. The NVIDIA Container Toolkit fell to a use-after-free vulnerability. Full Day One results are on the ZDI blog, with Day Two results available as well.

Tool	Team	Bug Type / Note	Payout
OpenAI Codex	Compass Security	CWE-150	$40,000
LiteLLM	k3vg3n	SSRF + Code Injection	$40,000
Cursor	Viettel Cyber Security	Undisclosed	$30,000
OpenAI Codex	Summoning Team	Undisclosed (new zero-day)	$20,000
Claude Code	Viettel Cyber Security	Undisclosed (collision, already known to vendor)	$20,000
NVIDIA Megatron Bridge	Ikotas Labs	Overly permissive allow-list	$20,000
Cursor	Compass Security	Undisclosed	$15,000

LiteLLM’s Rough Year

LiteLLM is widely used as an API gateway — a proxy layer that sits between your application and multiple LLM providers. If you are routing OpenAI, Anthropic, Cohere, or other model calls through it, this matters. The Pwn2Own exploit is the third major LiteLLM security incident of 2026. In March, the PyPI package was compromised via a supply chain attack through Trivy, a CI/CD security scanner. The malicious versions deployed credential harvesting and Kubernetes lateral movement. In April, CVE-2026-42208 landed: a CVSS 9.3 SQL injection exploitable without authentication, weaponized within 36 hours of disclosure. Now this. Three significant incidents in three months is a pattern, not bad luck.
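
The Pwn2Own chain against LiteLLM started with SSRF, and its technical details are still under embargo. What follows is only a generic sketch of the kind of outbound-URL check a gateway operator might put in front of any user-supplied URL, not a description of the actual bug or of LiteLLM's code. The function names and the allowed-scheme set are assumptions made for illustration.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumption for this sketch: only plain web schemes are ever legitimate.
ALLOWED_SCHEMES = {"http", "https"}


def is_public_address(ip_text: str) -> bool:
    """Return True only for globally routable addresses."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False
    # Judge IPv4-mapped IPv6 addresses by the IPv4 address they wrap.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return ip.is_global


def is_safe_outbound_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Reject URLs that resolve to loopback, private, or link-local hosts.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    try:
        parts = urlsplit(url)
        port = parts.port or (443 if parts.scheme == "https" else 80)
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, port)
    except OSError:
        return False
    addresses = {info[4][0] for info in infos}
    # Every resolved address must be public; one internal record is enough to refuse.
    return bool(addresses) and all(is_public_address(a) for a in addresses)
```

A check like this only helps if the HTTP client then connects to the address that was vetted. Resolving once for the check and again for the request leaves room for DNS rebinding, so treat this as one layer alongside network-level egress rules rather than a complete fix.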

The Overflow Signal

Something else happened at Pwn2Own Berlin 2026 that had never happened before: ZDI closed registration early, on May 7, because the contest ran out of slots. Dozens of researchers who had submitted working zero-day RCE chains were turned away, and some went public with their findings. The group xchglabs disclosed 86 vulnerabilities targeting NVIDIA, Docker, Linux KVM, and PyTorch directly to vendors and posted details online. The norm at Pwn2Own is a 90-day coordinated disclosure period before any exploit details go public. That norm broke under the volume.

ZDI acknowledged that hackers are finding vulnerabilities faster than the contest can process them. The AI tool categories contributed directly to the surge. Whether that is because AI tools have more vulnerabilities, because AI is helping researchers find them faster, or both, is an open question. The practical answer does not change what you need to do.

What to Do Before the 90-Day Clock Runs Out

Pwn2Own’s coordinated disclosure period gives vendors 90 days to patch before full technical details go public. That window starts now. Here is what to do in the meantime:

LiteLLM: Update immediately and audit your deployment for SSRF-exposed endpoints. Given the three-incident pattern, treat this as critical infrastructure that needs active monitoring.
Cursor: Update to the latest release. Watch the Cursor security advisory page for CVE disclosures over the next 90 days.
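OpenAI Codex: Update and apply any available patches. Bugs in the CWE-150 class can often also be mitigated at the application layer, for example by stripping escape and control sequences from untrusted content before it reaches a terminal or log.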
Claude Code: Anthropic is already patching the demonstrated vulnerability. Update promptly when the release lands.
NVIDIA Megatron Bridge and Container Toolkit: Patch, then explicitly review your allow-list configurations.
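
For the CWE-150 item in particular, an application-layer mitigation can be quite small. The sketch below is an assumption about what such a filter might look like, not the vendor's fix: the Codex bug itself is undisclosed, and the function name is hypothetical. It strips ANSI escape sequences and stray control characters from untrusted text, such as repository files or web pages, before that text is echoed to a terminal.

```python
import re

# ESC-introduced sequences as commonly emitted to terminals.
_ANSI_ESCAPE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"               # CSI: cursor movement, colors, erase
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"    # OSC: window title, hyperlinks
    r"|\x1b[@-Z\\-_]"                        # remaining two-character escapes
)

# C0 and C1 control characters and DEL; tab and newline are kept.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f-\x9f]")


def neutralize_control_sequences(text: str) -> str:
    """Remove escape sequences first, then any leftover control characters."""
    without_sequences = _ANSI_ESCAPE.sub("", text)
    return _CONTROL_CHARS.sub("", without_sequences)


if __name__ == "__main__":
    hostile = "README\x1b[2K\x1b[1A\x1b]0;pwned\x07 ok\x07"
    print(repr(neutralize_control_sequences(hostile)))
```

Carriage returns are dropped on purpose, since they can be used to overwrite text already shown on a line. A filter like this belongs at the point where untrusted content meets the display, and it complements the vendor patch rather than replacing it.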

The broader lesson is one that security teams have been repeating since AI coding tools became mainstream: these tools execute code, access repositories, make network calls, and in many deployments have access to production credentials. Pwn2Own Berlin 2026 just formalized the threat model. The attack surface is your everyday workflow. Treat it accordingly.

ByteBot
