
The AI that writes your code can now audit it in real time. Anthropic shipped its free security-guidance plugin for Claude Code on May 27, 2026, and it does exactly what it says: watches every edit, diff, and commit in your session and sends security findings back to Claude before you see the output. One command installs it. It works on every plan. There is no separate billing line.
This matters because the risk profile of AI-assisted development has a specific shape. Claude Code writes fast, which is the point, but speed means more code ships before anyone reviews it. Anthropic says its own testing surfaced over 500 vulnerabilities in production open-source codebases that had gone undetected for years. The plugin is a direct response to that reality: shift security left, into the session, before a commit lands.
How the Plugin Works: Three Layers
The security-guidance plugin does not run a single all-or-nothing scan. It operates in three escalating layers, each catching what the one before it cannot.
Layer 1: Pattern matching, per edit. This layer is instant and free: no model call, no token cost. On every file edit Claude makes, it runs a regex sweep against approximately 25 high-risk patterns: eval(), pickle.load() on untrusted data, os.system(), child_process.exec(), raw innerHTML= assignments, and hardcoded secrets matching patterns like sk_live_ or AKIA. If anything matches, the finding is reported immediately.
Layer 2: LLM diff review, end of turn. After each Claude turn, the plugin computes a full git diff of everything changed and sends it to a separate Claude review pass, which uses Opus 4.7 by default. This runs in the background; Claude’s response is not delayed. The model-backed pass catches what regex cannot: authorization bypasses, server-side request forgery (SSRF), insecure direct object references (IDOR), and weak cryptography. When it finds something, Claude is re-prompted with the findings and fixes them in the same session before you see the reply.
Layer 3: Agentic commit review. When Claude executes a git commit or git push through its Bash tool, a deeper agentic review launches. It uses Read, Grep, and Glob to trace data flow across related files, not just the diff. This is how multi-file vulnerabilities get caught: cross-file IDOR, authorization bypass chains, SSRF patterns that span multiple modules. Because the reviewer reads the surrounding code rather than an isolated diff, false positives stay low.
Each layer costs more than the one before it but catches more. Together they form a development-time security pass that does not require leaving the terminal.
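To make the Layer 1 idea concrete, here is a minimal Python sketch of a per-edit regex sweep. It is not the plugin's code: the rule names, the exact regular expressions, and the scan_edit function are assumptions written for illustration, covering only a handful of the pattern families named above.

```python
import re

# Illustrative subset only. These names and expressions are assumptions,
# not the plugin's actual rule list.
HIGH_RISK_PATTERNS = {
    "eval-call": re.compile(r"\beval\s*\("),
    "pickle-load": re.compile(r"\bpickle\.loads?\s*\("),
    "os-system": re.compile(r"\bos\.system\s*\("),
    "child-process-exec": re.compile(r"\bchild_process\.exec\s*\("),
    "inner-html-assignment": re.compile(r"\.innerHTML\s*=(?!=)"),
    "stripe-live-key": re.compile(r"\bsk_live_[0-9A-Za-z]{8,}"),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def scan_edit(text):
    """Return (line_number, rule_name) for every pattern hit in the edited text."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in HIGH_RISK_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings
```

A sweep like this is cheap because it never leaves the process, but it is also blind to context: it would flag a harmless model.eval() call just as readily as a dangerous one, which is the gap the model-backed layers are meant to cover.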
Install in One Command
Inside any active Claude Code session, run:
/plugin install security-guidance@claude-plugins-official
You need Claude Code CLI version 2.1.144 or later and Python 3.8 or newer on your system path. On first run, the plugin creates a virtual environment at ~/.claude/security/ and installs the Claude Agent SDK for commit-level reviews.
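If you want to check those minimums in a setup script, a small Python sketch like the following would do it. The function names are hypothetical, and the sketch assumes you supply the CLI version as a plain dotted string such as "2.1.144"; how you obtain that string is left to you.

```python
import sys

MIN_CLI_VERSION = (2, 1, 144)  # minimum Claude Code CLI version stated above
MIN_PYTHON_VERSION = (3, 8)    # minimum Python version stated above


def parse_version(text):
    """Turn a dotted version such as '2.1.144' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def cli_version_ok(version_text):
    """Compare numerically, so '2.1.99' is correctly older than '2.1.144'."""
    return parse_version(version_text) >= MIN_CLI_VERSION


def python_version_ok(version_info=sys.version_info):
    """Check the major.minor pair of the given (or running) interpreter."""
    return tuple(version_info[:2]) >= MIN_PYTHON_VERSION
```

Comparing integer tuples rather than raw strings matters here: a string comparison would rank "2.1.99" above "2.1.144".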
When prompted for scope, choose user to load the plugin in every future session on this machine, or project to restrict it to the current directory. For teams, project scope enforced via .claude/settings.json is the cleaner option:
{
  "plugins": ["security-guidance@claude-plugins-official"]
}
Commit that file and the plugin becomes active for every team member working on the project. Organization admins can also push it through managed settings without requiring individual installs.
Configuring It for Your Codebase
The built-in patterns cover common vulnerability classes well. Two project-level config files let you go further.
Threat model context: Create .claude/claude-security-guidance.md and write plain-language instructions. For example: “This service processes payments. Flag any code that stores raw card numbers, passes user IDs from query parameters to database lookups without authorization checks, or uses symmetric encryption for PII.” The model-backed layers load this file as additional context on every review.
Custom pattern rules: Create .claude/security-patterns.yaml with your own regex or substring rules. The plugin supports up to 50 custom rules per project. Use security-patterns.json if PyYAML is not installed — the schema is identical.
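The article does not show the rule schema, so the following Python sketch invents one purely for illustration: a top-level "rules" list whose entries carry hypothetical "id", "type", and "pattern" fields. It shows how regex and substring rules could be loaded from the JSON variant, capped at the 50-rule limit, and applied to a line of code. Check the plugin's documentation for the real field names before writing a config file.

```python
import json
import re

MAX_CUSTOM_RULES = 50  # per-project limit stated above


def load_rules(json_text):
    """Parse and validate rules. The field names used here are hypothetical."""
    rules = json.loads(json_text)["rules"]
    if len(rules) > MAX_CUSTOM_RULES:
        raise ValueError(f"{len(rules)} rules exceeds the limit of {MAX_CUSTOM_RULES}")
    compiled = []
    for rule in rules:
        if rule["type"] == "regex":
            matcher = re.compile(rule["pattern"]).search
        elif rule["type"] == "substring":
            # Bind the pattern as a default argument so each lambda keeps its own.
            matcher = lambda text, needle=rule["pattern"]: needle in text
        else:
            raise ValueError(f"unknown rule type: {rule['type']!r}")
        compiled.append((rule["id"], matcher))
    return compiled


def match_rules(compiled, line):
    """Return the ids of all rules that match the given line."""
    return [rule_id for rule_id, matcher in compiled if matcher(line)]


# Example rule set in the assumed schema.
EXAMPLE_RULES = json.dumps({"rules": [
    {"id": "no-md5", "type": "regex", "pattern": r"hashlib\.md5\s*\("},
    {"id": "no-debug-flag", "type": "substring", "pattern": "DEBUG = True"},
]})
```

Calling match_rules(load_rules(EXAMPLE_RULES), "digest = hashlib.md5(data)") would report the no-md5 rule. Substring rules are the safer choice when the text to match contains characters that are special in regular expressions.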
Two environment variables control which models run the deeper reviews: SECURITY_REVIEW_MODEL for Layer 2 and SG_AGENTIC_MODEL for Layer 3. Both default to Opus 4.7. Pointing them at a lighter model reduces cost at the expense of review depth.
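As a rough picture of how those two variables interact with the default, consider this Python sketch. The resolve_review_models function and the dictionary keys are hypothetical, and the default string is a placeholder label, not a verified model identifier.

```python
import os

# Placeholder label only; the real default model identifier is not given above.
DEFAULT_MODEL = "opus-4.7"


def resolve_review_models(environ=None):
    """Pick the model for each model-backed layer, falling back to the default."""
    environ = os.environ if environ is None else environ
    return {
        "layer2_diff_review": environ.get("SECURITY_REVIEW_MODEL") or DEFAULT_MODEL,
        "layer3_commit_review": environ.get("SG_AGENTIC_MODEL") or DEFAULT_MODEL,
    }
```

Because the variables are independent, one plausible setup is a lighter model for the frequent end-of-turn review and the default for the rarer, deeper commit review.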
What the Numbers Say — and What the Plugin Does Not Do
Anthropic reports a 30–40% reduction in security-related PR comments after internal rollout. That figure comes from deployment in real production codebases, not from a benchmark, which makes it a meaningful improvement.
Still, the plugin is not a substitute for a security audit. It catches known patterns and logic-level issues discoverable from code context. It does not assess threat models, architectural decisions, infrastructure configuration, or novel attack classes. Teams shipping security-sensitive products should treat it as a first filter, not a final gate. For a deeper automated pass at the CI level, Anthropic’s GitHub Action runs a complementary review on every push.
Layers 2 and 3 consume tokens from your existing Claude usage budget; Layer 1 makes no model calls and costs nothing. Layer 3 adds latency to commit operations, which is expected and generally worth the trade.
The security-guidance plugin is the most practical thing Anthropic has shipped for developer security since Claude Code launched. The gap between “AI writes your code” and “AI checks your code for vulnerabilities” has been obvious. It is now closed, at least for the patterns that matter most.













