AI-generated code now accounts for 42% of what ships to production — and it’s introducing security findings at 10x the rate of human-written code. Security teams that once reviewed roughly 1,000 findings a month are now staring at 10,000-plus, most of them false positives from tools that can’t reason about business logic. Cognition’s answer is Devin Security Swarm, launched this week: parallel AI agents that scan your entire codebase, validate whether each vulnerability is actually exploitable in a live sandbox, and write the remediation PRs themselves — for around $90 a scan.
How Devin Security Swarm Works
Devin Security Swarm uses an architecture Cognition calls Agentic MapReduce, adapted from the distributed computing pattern of the same name. The process runs in five stages.
First, a planning agent analyzes the repository and generates selectors — deterministic patterns that identify which files are relevant to the threat model. Those selectors run across the entire codebase to produce a set of signals, which then get bucketed into shards. Crucially, non-matching files are eliminated entirely, concentrating token spend on code that actually matters.
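The selector and sharding steps can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Cognition's implementation: the selector names, the regular expressions, the `max_files_per_shard` parameter, and the sample files are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical selectors: each name maps to a deterministic regex.
SELECTORS = {
    "sql": re.compile(r"\bexecute\s*\("),
    "deserialization": re.compile(r"\bpickle\.loads\s*\("),
}

def select_signals(files):
    """Return (path, selector_name) pairs for every file a selector matches."""
    signals = []
    for path, text in sorted(files.items()):
        for name, pattern in SELECTORS.items():
            if pattern.search(text):
                signals.append((path, name))
    return signals

def bucket_into_shards(signals, max_files_per_shard=2):
    """Group signals by selector, then split each group into bounded shards."""
    by_selector = defaultdict(list)
    for path, name in signals:
        by_selector[name].append(path)
    shards = []
    for name in sorted(by_selector):
        paths = by_selector[name]
        for i in range(0, len(paths), max_files_per_shard):
            shards.append({"selector": name, "files": paths[i:i + max_files_per_shard]})
    return shards

# Toy repository: README.md matches no selector, so it never reaches an agent.
files = {
    "app/db.py": "cursor.execute(query)",
    "app/cache.py": "obj = pickle.loads(blob)",
    "app/reports.py": "cur.execute(sql)",
    "app/admin.py": "conn.execute(stmt)",
    "README.md": "Project documentation.",
}
signals = select_signals(files)
shards = bucket_into_shards(signals)
```

Because the selectors are plain pattern matches, this stage costs no model tokens; only the files that end up in a shard are handed to an agent.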
From there, independent parallel agents each take one shard. They read the actual code, reason across files to catch chained logic flaws and authentication bypasses that span service boundaries, and flag findings with severity ratings and confidence scores. A reducer then deduplicates across all shards, identifies cross-shard attack chains, and applies global prioritization. Finally, for high-confidence findings, sandboxed sessions reproduce each vulnerability against a running build to confirm exploitability before a remediation PR is ever opened.
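A rough sketch of the map and reduce steps follows, under stated assumptions: `analyze_shard` is a stub that returns canned findings in place of a real agent, deduplication is keyed on a hypothetical `(file, rule)` pair, and the 0.8 confidence threshold for sandbox validation is invented for the example. Cross-shard attack-chain detection and the sandbox itself are left out.

```python
from concurrent.futures import ThreadPoolExecutor

SEVERITY_RANK = {"critical": 3, "high": 2, "medium": 1, "low": 0}

# Canned findings keyed by file; a real agent would read and reason about code.
CANNED = {
    "auth.py": [{"file": "auth.py", "rule": "auth-bypass",
                 "severity": "critical", "confidence": 0.9}],
    "session.py": [{"file": "session.py", "rule": "session-fixation",
                    "severity": "high", "confidence": 0.7}],
    "api.py": [{"file": "api.py", "rule": "ssrf",
                "severity": "medium", "confidence": 0.85}],
}

def analyze_shard(shard):
    """Map step (stub): return the findings for every file in one shard."""
    return [finding for path in shard for finding in CANNED.get(path, [])]

def reduce_findings(per_shard, min_confidence=0.8):
    """Reduce step: deduplicate, rank globally, pick findings to validate."""
    best = {}
    for findings in per_shard:
        for f in findings:
            key = (f["file"], f["rule"])
            if key not in best or f["confidence"] > best[key]["confidence"]:
                best[key] = f
    ranked = sorted(
        best.values(),
        key=lambda f: (-SEVERITY_RANK[f["severity"]], -f["confidence"]),
    )
    to_validate = [f for f in ranked if f["confidence"] >= min_confidence]
    return ranked, to_validate

# session.py appears in both shards, so its finding is reported twice.
shards = [["auth.py", "session.py"], ["session.py", "api.py"]]
with ThreadPoolExecutor(max_workers=2) as pool:
    per_shard = list(pool.map(analyze_shard, shards))
ranked, to_validate = reduce_findings(per_shard)
```

In this toy run the duplicate `session.py` finding collapses to one entry, and only the two findings at or above the threshold would move on to sandbox reproduction.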
The design matters because traditional static analysis tools match patterns without reasoning about whether a vulnerability is actually reachable. Devin Security Swarm validates findings at runtime. The result is fewer false positives and a PR queue that reflects real risk, not scan noise.
The Benchmarks
Cognition published results against a dataset of 50 real-world vulnerabilities drawn from GitHub Security Advisories, spanning 14 languages and repositories ranging from 60 KB to 92 MB. The evaluation used CVEs published after model training cutoffs, ruling out any memorization advantage.
| Tool | Recall | Cost per Scan |
|---|---|---|
| Devin Security Swarm | 72% | $90.23 |
| Claude Security | 68% | $131.87 |
| Codex Security | 48% | $118.20 |
| Cursor Security | 26% | $4.60 |
Devin Security Swarm posts the highest recall of the four tools and undercuts its closest rival on recall, Claude Security, on cost per scan. More telling: it uniquely identified three critical vulnerabilities that every other tool missed, including a PHP sandbox bypass and a Spring Kafka deserialization flaw. At $4.60 per scan, Cursor Security is by far the cheapest option, but its 26% recall means it finds only about one in four vulnerabilities.
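To put the table's two columns on a common footing, the short script below converts recall into a count out of the 50-vulnerability dataset and computes a naive dollars-per-recall-point ratio. The ratio is an illustrative metric of our own, not one Cognition reports, and it assumes the per-scan costs are directly comparable across tools.

```python
DATASET_SIZE = 50  # vulnerabilities in the published benchmark

# (recall, cost per scan in USD), copied from the table above.
RESULTS = {
    "Devin Security Swarm": (0.72, 90.23),
    "Claude Security": (0.68, 131.87),
    "Codex Security": (0.48, 118.20),
    "Cursor Security": (0.26, 4.60),
}

def summarize(results, dataset_size):
    """Return per-tool vulnerabilities found and dollars per recall point."""
    summary = {}
    for tool, (recall, cost) in results.items():
        summary[tool] = {
            "found": round(recall * dataset_size),
            "usd_per_recall_point": cost / (recall * 100),
        }
    return summary

summary = summarize(RESULTS, DATASET_SIZE)
most_found = max(summary, key=lambda t: summary[t]["found"])
cheapest_per_point = min(summary, key=lambda t: summary[t]["usd_per_recall_point"])

for tool, row in summary.items():
    print(f"{tool}: {row['found']}/{DATASET_SIZE} found, "
          f"${row['usd_per_recall_point']:.2f} per recall point")
```

On this naive ratio the cheapest tool per recall point is Cursor Security, while Devin Security Swarm finds the most vulnerabilities outright, so which tool counts as best depends on whether missed vulnerabilities or scan spend matters more to the team.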
Why the Timing Makes Sense
The vulnerability backlog problem didn’t come from nowhere. CVEs attributed to AI-generated code went from 6 in January 2026 to 35 in March, and researchers estimate the true number is 5 to 10 times the detected figure. According to the Cloud Security Alliance’s AI-generated CVE surge report, Escape.tech scanned 1,400 vibe-coded production applications and found 65% had security issues. The CSA called it a direct consequence of AI-assisted development velocity outpacing review cadence.
ByteIota covered this dynamic last month: AI-assisted developers ship 3-4x more PRs but triple their production incidents. Security findings are the delayed billing for that acceleration. What has changed is that the gap between code shipped and code reviewed has widened faster than anyone anticipated.
The honest framing here is that AI tools created this backlog, and Devin Security Swarm is an AI tool designed to clear it. Depending on your read, that is either the natural evolution of the ecosystem or an arms race in which the cure has to keep pace with the disease. What’s undeniable is that human security teams were not designed to review 10,000 findings a month, so something automated has to fill that gap.
Getting Started
A free trial is available at devin.ai/security. Connect a repository, configure your scan profile (Devin can generate one from your threat model), and set a schedule: daily, weekly, or custom. No per-repo CI configuration is required. Subsequent scans process only changed code, so costs drop after the first full pass.
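How Devin decides which code has changed between scans is not described. One common way to get that behavior is to compare content hashes from the previous scan against the current tree, sketched below; the function names and sample files are hypothetical.

```python
import hashlib

def fingerprint(files):
    """Map each path to the SHA-256 digest of its contents."""
    return {
        path: hashlib.sha256(text.encode("utf-8")).hexdigest()
        for path, text in files.items()
    }

def changed_paths(previous, current):
    """Paths that are new or whose digest differs from the last scan."""
    return sorted(p for p, digest in current.items() if previous.get(p) != digest)

# First full pass, then a later state with one edit and one new file.
baseline = fingerprint({"a.py": "x = 1", "b.py": "y = 2"})
latest = fingerprint({"a.py": "x = 1", "b.py": "y = 3", "c.py": "z = 0"})
to_rescan = changed_paths(baseline, latest)
```

Only `b.py` and `c.py` would be rescanned here, which is the mechanism by which an incremental pass can cost less than the first full one.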
For teams with existing CVE backlogs, Cognition offers a six-week Enterprise Vulnerability Remediation Program where their engineering team embeds with yours to drive the backlog toward zero, then transitions to continuous scanning. Cognition, which raised $1 billion at a roughly $26 billion valuation in May, counts Goldman Sachs, the US Army, Mercedes-Benz, and Citi among Devin’s customers. Security Swarm is a direct extension of that platform — and a telling sign of where AI coding tool security is heading in 2026.













