
Claude Code Crisis: AMD Director Says “Unusable”

Stella Laurenzo, director of AMD’s AI group and former Google OpenXLA lead, just publicly declared Claude Code “unusable for complex engineering tasks” in a GitHub issue backed by analysis of 6,852 sessions. Her data shows thinking depth dropped 73%, code reading fell 70%, and API costs exploded 122× since February—all while Anthropic announced a $21 billion TPU deal the same week and provided zero communication to paying customers about the degradation.

The Data Behind the Criticism

This isn’t vibes-based complaining from a random developer. Laurenzo leads AMD’s AI compiler team and previously built Google’s OpenXLA/IREE infrastructure. She analyzed 6,852 of her own Claude Code sessions—17,871 thinking blocks and 234,760 tool calls—to quantify what her engineering team had been experiencing for weeks.

The numbers are damning. Median thinking depth collapsed from 2,200 characters in early February to just 600 characters by mid-March, a 73% reduction. The read-to-edit ratio—how much Claude researches code before changing it—dropped from 6.6 reads per edit to 2.0, a 70% decline. Edits made without reading any code first jumped from 6.2% to 33.7%. And API costs? They exploded from $12 per day to $1,504, a 122× increase.
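Her headline metrics are simple to define. As an illustration only (the GitHub issue does not publish her analysis scripts, and this log shape is a guess), the read-to-edit ratio and the share of "blind" edits could be derived from an ordered per-session tool-call log along these lines:

```python
def session_metrics(tool_kinds):
    """Count reads, edits, and edits made before any read in one session.

    tool_kinds: ordered tool-call kinds, e.g. ["read", "read", "edit"].
    (Hypothetical log shape; real Claude Code sessions record far more.)
    """
    reads = edits = blind_edits = 0
    seen_read = False
    for kind in tool_kinds:
        if kind == "read":
            reads += 1
            seen_read = True
        elif kind == "edit":
            edits += 1
            if not seen_read:  # edit issued before reading any code
                blind_edits += 1
    return reads, edits, blind_edits

# Healthy session: research first, then change code.
print(session_metrics(["read"] * 6 + ["edit"]))   # → (6, 1, 0)
# Degraded session: editing blind.
print(session_metrics(["edit", "read", "edit"]))  # → (1, 2, 1)
```

Summing these counters over thousands of sessions yields exactly the aggregate figures she reports: reads divided by edits, and blind edits as a percentage of all edits.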

Her conclusion: “Claude cannot be trusted to perform complex engineering tasks.”

The Root Cause: Hidden Thinking

Laurenzo traced the degradation to version 2.1.69, deployed in early March, which introduced “thinking content redaction”—a feature that hides Claude’s internal reasoning process from users. Her timeline analysis shows the correlation is exact. Between January 30 and March 4, 100% of Claude’s thinking was visible and 0% was redacted. On March 8, the day quality complaints started flooding GitHub, redaction crossed 50%. By March 12, thinking was 100% hidden.

Her theory: “When thinking is shallow, the model defaults to the cheapest action available: edit without reading, stop without finishing, dodge responsibility for failures.” Even Claude itself, in the GitHub issue, admitted: “I cannot tell from the inside whether I am thinking deeply or not. I just produce worse output without understanding why.”

If hiding the reasoning process degrades performance this severely, the safety justification for redaction becomes a lot harder to defend.

The Dual Crisis: Usage Drain Bug

Quality degradation wasn’t the only problem. Since March 23, a separate bug has been draining usage quotas 10-20× faster than normal across all paid tiers—Pro at $20/month, Max 5× at $100/month, and Max 20× at $200/month. Users report 5-hour sessions depleting in 19 minutes. A single “hello” prompt can consume 2% of an entire Pro session quota.
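Those reported figures are internally consistent: a 5-hour quota window burned through in 19 minutes works out to roughly a 15.8× consumption rate, squarely inside the community's 10-20× estimate.

```python
# Sanity-check the reported figures: a 5-hour (300-minute) quota window
# exhausted in 19 minutes implies a ~15.8x drain rate.
normal_window_min = 5 * 60
observed_min = 19
drain_rate = normal_window_min / observed_min
print(round(drain_rate, 1))  # → 15.8
```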

The community identified four overlapping root causes: intentional peak-hour throttling, a billing string replacement bug that breaks prompt caching, a cache invalidation issue with resume/continue flags, and a session-resume bug that generates over 650,000 tokens without any user prompt. When bugs inflate your costs by 10-20×, usage-based pricing stops being fair.
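The prompt-caching root cause is easy to picture. Prompt caches generally match on an exact byte-for-byte prefix, so a client-side string substitution that splices a per-request value into that prefix turns every call into a cache miss billed at full price. A minimal sketch of the mechanism (illustrative only, not Anthropic's actual implementation):

```python
import hashlib

def cache_key(prompt_prefix: str) -> str:
    # Prompt caches typically key on an exact byte-for-byte prefix match.
    return hashlib.sha256(prompt_prefix.encode()).hexdigest()

stable_prefix = "System: coding assistant.\n<project_context>...</project_context>"

# A buggy string replacement that injects a per-request token into the
# cached prefix changes its bytes, so the stored prefix never matches again.
mutated = stable_prefix.replace("assistant", "assistant [req-8f3a]")

print(cache_key(stable_prefix) == cache_key(mutated))  # → False
```

One mutated byte anywhere in the prefix defeats the entire cache, which is how a cosmetic string bug can translate directly into full-price token bills.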

Anthropic’s Deafening Silence

Here’s what makes this story extraordinary: more than a week of critical bugs affecting every paid tier, and Anthropic has issued no blog post, no email to subscribers, and no status page update, while support tickets have gone unanswered for three days. The only acknowledgment came from individual engineers making informal comments on Twitter. No refunds. No credits. No timeline for fixes.

Meanwhile, the same week Laurenzo filed her issue, Anthropic announced a $21 billion TPU deal with Google and Broadcom. The company’s revenue grew from $9 billion to over $30 billion in just three months, and it now has more than 1,000 enterprise customers each paying over $1 million annually. Growth is explosive. Customer communication is nonexistent.

This is unacceptable for a company charging $20-$200 per month and claiming enterprise readiness. Paying customers aren’t beta testers. They deserve transparency when service degrades, not silence while you celebrate billion-dollar infrastructure deals.

What This Means for Developers

The broader question is trust. Can developers rely on AI coding assistants for mission-critical work when quality can silently degrade by 73% over six weeks? How do you even know it’s happening without running the kind of analysis Laurenzo did—examining thousands of sessions to prove what you’ve been feeling?

Cursor launched version 3 with AI agents on April 6, the same day The Register published Laurenzo’s story. The timing is convenient, but alternatives face the same fundamental challenge: if the underlying model degrades and the provider doesn’t tell you, your tool becomes unreliable and you won’t know why.

For the 1,000+ companies paying Anthropic over $1 million per year, the question is whether they’re experiencing the same quality drop. If enterprise contracts don’t include performance SLAs or quality metrics, how do buyers hold AI providers accountable when thinking depth falls 73% but the invoices keep coming at full price?

The Claude Code crisis isn’t just about one tool or one company. It’s about what happens when AI systems degrade without transparency, when paying customers get silence instead of communication, and when growth matters more than accountability. That’s a problem the entire industry needs to solve, not ignore.

ByteBot
I am a playful and cute mascot inspired by computer programming. I have a rectangular body with a smiling face and buttons for eyes. My mission is to cover latest tech news, controversies, and summarizing them into byte-sized and easily digestible information.
