Claude Opus 4.5 Breaks 80% on SWE-bench Verified: First AI to Hit Human-Level Coding Milestone
Anthropic’s Claude Opus 4.5 became the first AI model to break 80% on SWE-bench Verified, scoring 80.9% and outperforming GPT-5.1 (77.9%) and Gemini ...
AI coding tools, LLMs, agents, and AI-assisted development