4 Sources
[1]
Claude is getting worse, according to Claude
Once the AI darling of programmers everywhere, Anthropic's Claude has been stumbling mightily, both in terms of cost and perceived quality. The service was down briefly on Monday with "a major outage," service trouble that only amplifies growing discontent from customers that even a bot can see. The outage, which involved elevated error rates, affected Claude.ai and Claude Code from 15:31 to 16:19 UTC.

That's not all. In the past few months, Claude's answers have been getting less satisfactory, according to social media posts as well as issues filed on GitHub. This has occurred as Anthropic has had to take steps to reduce usage during peak hours to balance capacity and demand.

To get a more objective measurement, we pointed Claude itself at the Claude Code GitHub repo, filtered for open issues that mention quality, and gave it the following prompt: "Analyze and graph complaints about Claude Code quality in this repo since January 2026. Use open issues that mention quality concerns. Have these concerns increased lately?" Anthropic's AI model concluded, "Yes, quality complaints have escalated sharply -- and the data tells a pretty clear story."

We asked Claude to rerun its self-analysis on Monday and the results were similar, with the model emitting "The velocity is notable: April is already at 20+ quality issues in 13 days, putting it on pace to exceed March's 18 -- which was itself a 3.5× jump over the January-February baseline."

Claude itself is not a reliable narrator, and just because someone (or some bot) has reported a concern to the Claude Code repo does not mean that the report is accurate or valid. It appears many issues are now AI-generated -- a widely reported concern among open source developers -- which may be contributing to report volume. Anthropic's GitHub Actions script also appears to automatically close issues after a period of inactivity, which may serve to mask unresolved problems.
The Register has covered some of the issues Claude flagged in its analysis, like the caching issue and claims by AMD's AI director Stella Laurenzo that Claude's responses have been getting worse. Others have yet to be substantiated, such as one claim that "Claude autonomously deleted 35,254 production customer message records and 35,874 billing transactions belonging to a real paying customer (JIXEN)." The individual or bot account behind this post has made no other posts. The Register has attempted to contact Jixen Enterprises Private Limited, which appears to be a private company registered in India, to check on that claim but we've not heard back. Developers have reported data loss from using Claude Code and other models. But even if the deletion occurred as described, user error has not been ruled out.

In any event, Claude is capable of citing actual GitHub issue posts to justify its "reasoning," so the general trend -- a growing number of reports about quality -- is evident. The model points to issues like "Claude Code's prediction-first behavior is dangerous on capital-at-risk projects" #46212, "Claude Code is unusable for complex engineering tasks with the Feb updates" #42796 (addressed by Claude Code head Boris Cherny), "Artificial degradation, Acquisition Bias, and unacceptable compute throttling for paid users" #46949, and "Opus 4.6: Severe quality degradation on iterative coding tasks" #46099 to justify its conclusion.

Data from Margin Lab, however, suggests that Claude Opus 4.6 has at least maintained its score on the SWE-Bench-Pro test. Assessments conducted since February show some variation but no substantive change. Anthropic did not immediately respond to a request to comment on Claude quality concerns. ®
[2]
Anthropic: Claude quota drain not caused by cache tweaks
Dev reports suggest long sessions now burn through usage much faster

Anthropic last month reduced the TTL (time to live) for the Claude Code prompt cache from one hour to five minutes for many requests, but said this should not increase costs, despite users reporting faster-depleting quotas. User Sean Swanson posted a bug report showing that Anthropic introduced a one-hour cache for Claude Code context around February 1, then changed it back to a five-minute cache around March 7. "The 5m TTL is disproportionately punishing for the long-session, high-context use case that defines Claude Code usage," said Swanson.

When using AI coding assistants or agents, the context is additional data sent along with the user's prompts, such as existing code or background instructions. Context improves the accuracy of the AI but also requires more processing. Claude prompt caching avoids re-processing previously used prompts, including context and background information. The cache can have either a five-minute or one-hour TTL. Writing to the five-minute cache costs 25 percent more in tokens, and writing to the one-hour cache 100 percent more, but reading from the cache costs around 10 percent of the base price.

Jarred Sumner, the creator of the Bun JavaScript runtime who now works for Anthropic, agreed that the analysis was "good detective work" but claimed that the change back to the five-minute cache made Claude Code cheaper because "a meaningful share of Claude Code's requests are one-shot calls where the cached context is used once and not revisited." Sumner said that the Claude Code client determines the cache TTL automatically and there are no plans for a global setting. Swanson revised his analysis in response, agreeing that sessions using subagents do benefit from the lower write cost of the five-minute cache since they interact quickly and "their caches almost never expire."
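The trade-off both sides are arguing about can be sketched with simple arithmetic. The multipliers below come from the pricing described above (5-minute cache writes cost 25 percent extra, one-hour writes 100 percent extra, cache reads roughly 10 percent of base); the base price, context size, and session shapes are hypothetical round numbers chosen only to illustrate the shape of the argument.

```python
# Illustrative sketch of the prompt-cache trade-off. Only the cost
# multipliers are taken from reported pricing; everything else
# (base price, context size, session shapes) is a made-up example.

BASE = 1.0          # cost per token at the base input price (arbitrary unit)
WRITE_5M = 1.25     # writing to the 5-minute cache: +25%
WRITE_1H = 2.00     # writing to the 1-hour cache: +100%
READ = 0.10         # reading previously cached tokens: ~10% of base

def session_cost(context_tokens, turns, rewrites, write_mult):
    """Cost of one session: the context is written to cache `rewrites`
    times (the initial write plus one write per expiry) and read back
    on the remaining turns."""
    writes = rewrites * context_tokens * write_mult * BASE
    reads = (turns - rewrites) * context_tokens * READ * BASE
    return writes + reads

ctx, turns = 200_000, 40

# Rapid-fire agent/subagent usage: the cache never expires between
# turns, so the cheaper 5-minute write premium wins (Sumner's case).
fast = session_cost(ctx, turns, rewrites=1, write_mult=WRITE_5M)

# Long interactive session with pauses: suppose the 5-minute cache
# expires 10 times, while a 1-hour cache would have survived the
# whole session (Swanson's case).
slow_5m = session_cost(ctx, turns, rewrites=10, write_mult=WRITE_5M)
slow_1h = session_cost(ctx, turns, rewrites=1, write_mult=WRITE_1H)

print(f"rapid-fire, 5m TTL: {fast:>12,.0f}")
print(f"pauses,     5m TTL: {slow_5m:>12,.0f}")
print(f"pauses,     1h TTL: {slow_1h:>12,.0f}")
```

Under these toy numbers, the five-minute TTL is clearly cheaper for one-shot and subagent traffic, but roughly triples the cost of a long session with repeated expiries, which is consistent with both Sumner's and Swanson's observations being true for different workloads.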
However, he said he has been a $200 per month subscriber for over six months and had never hit a quota limit until March. The "extra burn rate" is "making a once great service unusable," he said.

Another factor is that the large one-million-token context window available on paid plans with the Claude Opus 4.6 or Sonnet 4.6 models increases costs, especially with cache misses. Claude Code creator Boris Cherny said that "prompt cache misses when using 1M token context window are expensive... if you leave your computer for over an hour then continue a stale session, it's often a full cache miss." He said that Anthropic is investigating a 400,000-token context window by default, with an option for one million tokens if preferred. There is already a configuration setting for this. Cherny said that larger contexts are now common because users are "pulling in a large number of skills, or running many agents or background automations."

Some developers are convinced that cache rebuilding and cache misses are major factors in Claude Code quota exhaustion, which has reached the point where Pro users ($20 per month) may get as few as two prompts in five hours. A number of bugs in the caching code have been reported, leading one user to say: "Before those are fixed likely any 5 minutes vs 1 h discussion is entirely moot since numbers are totally flawed." The focus on cache optimization may also be evidence that, under the covers, Anthropic's quotas are simply buying less processing time than they once did.

Swanson is not alone in reporting that Claude's performance has dropped. For example, a user on the enterprise team plan said: "In March I could use Opus all day and it was getting great results. Since the last week of March and into April, I've had sessions where I maxed out session usage under 2 hours and it got stuck in overthinking loops, multiple turns of realising the same thing, dozens of paragraphs of 'but wait, actually I need to do x' with slight variations."
That chimes with similar comments from an AI director at AMD. Cache optimization may be important, but it seems unlikely to account for all these reported issues. ®
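The stale-session economics Cherny describes are easy to quantify from the cache pricing reported earlier: resuming after the cache has expired forces the full context to be re-processed at the (premium) write price instead of the roughly 10 percent cached-read price. The base price here is an arbitrary unit; only the ratios come from the reported pricing.

```python
# Back-of-the-envelope for a stale session with the 1M-token context
# window: a cache hit reads the context at ~10% of base, while a full
# cache miss re-writes it at base plus the 25% five-minute premium.
# Prices are illustrative units, not actual dollar figures.

BASE = 1.0
READ = 0.10      # cached read: ~10% of base price
WRITE_5M = 1.25  # rewrite after expiry: base + 25% premium

context = 1_000_000  # the 1M-token context window on paid plans

warm = context * READ * BASE       # resume within the TTL: cache hit
cold = context * WRITE_5M * BASE   # resume a stale session: full miss

print(f"warm resume: {warm:,.0f} token-units")
print(f"cold resume: {cold:,.0f} token-units ({cold / warm:.1f}x)")
```

On these ratios, one stale resume of a full 1M-token context costs about 12.5 times a warm resume, which helps explain why a single coffee break past the TTL can visibly dent a session quota.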
[3]
Is Anthropic 'nerfing' Claude? Users increasingly report performance degradation as leaders push back
A growing number of developers and AI power users are taking to social media to accuse Anthropic of degrading the performance of Claude Opus 4.6 and Claude Code -- intentionally or as an outcome of compute limits -- arguing that the company's flagship coding model feels less capable, less reliable and more wasteful with tokens than it did just weeks ago. The complaints have spread quickly on GitHub, X and Reddit over the past several weeks, with several high-reach posts alleging that Claude has become worse at sustained reasoning, more likely to abandon tasks midway through, and more prone to hallucinations or contradictions. Some users have framed the issue as "AI shrinkflation" -- the idea that customers are paying the same price for a weaker product. Others have gone further, suggesting Anthropic may be throttling or otherwise tuning Claude downward during periods of heavy demand. Those claims remain unproven, and Anthropic employees have publicly denied that the company degrades models to manage capacity. At the same time, Anthropic has acknowledged real changes to usage limits and reasoning defaults in recent weeks, which has made the broader debate more combustible.

VentureBeat has reached out to Anthropic for further clarification on the recent accusations, including whether any recent changes to reasoning defaults, context handling, throttling behavior, inference parameters or benchmark methodology could help explain the spike in complaints. We have also asked how Anthropic explains the recent benchmark-related claims and whether it plans to publish additional data that could reassure customers. As of publication time, we are awaiting a response.

Viral user complaints, including from an AMD Senior Director, argue Claude has become less capable

One of the most detailed public complaints originated as a GitHub issue filed on April 2, 2026 by Stella Laurenzo, whose LinkedIn profile identifies her as Senior Director in AMD's AI group.
In that post, Laurenzo wrote that Claude Code had regressed to the point that it could not be trusted for complex engineering work, then backed that claim with a sprawling analysis of 6,852 Claude Code session files, 17,871 thinking blocks and 234,760 tool calls. The complaint argued that, starting in February, Claude's estimated reasoning depth fell sharply while signs of poorer performance rose alongside it, including more premature stopping, more "simplest fix" behavior, more reasoning loops, and a measurable shift from research-first behavior to edit-first behavior. The post's broader point was that for advanced engineering workflows, extended reasoning is not a luxury but part of what makes the model usable in the first place.

That GitHub thread then escaped into the broader social media conversation, with X users including @Hesamation, who posted screenshots of Laurenzo's GitHub post to X on April 11, turning it into an even more viral talking point. That amplification mattered because it gave the wider "Claude is getting worse" narrative something more concrete than anecdotal frustration: a long, data-heavy post from a senior AI leader at a major chip company arguing that the regression was visible in logs, tool-use patterns and user corrections, not just gut feeling.

Anthropic's public response focused on separating perceived changes from actual model degradation. In a pinned follow-up on the same GitHub issue posted a week ago, Claude Code lead Boris Cherny thanked Laurenzo for the care and depth of the analysis but disputed its main conclusion. Cherny said the "redact-thinking-2026-02-12" header cited in the complaint is a UI-only change that hides thinking from the interface and reduces latency, but "does not impact thinking itself," "thinking budgets," or how extended reasoning works under the hood. He also said two other product changes likely affected what users were seeing: Opus 4.6's move to adaptive thinking by default on Feb. 9, and a March 3 shift to medium effort, or effort level 85, as the default for Opus 4.6, which he said Anthropic viewed as the best balance across intelligence, latency and cost for most users. Cherny added that users who want more extended reasoning can manually set effort higher in Claude Code terminal sessions.

That exchange gets at the core of the controversy. Critics like Laurenzo argue that Claude's behavior in demanding coding workflows has plainly worsened and point to logs and usage patterns as evidence. Anthropic, by contrast, is not saying nothing changed. It is saying the biggest recent changes were product and interface choices that affect what users see and how much effort the system expends by default, not a secret downgrade of the underlying model. That distinction may be technically important, but for power users who feel the product is delivering worse results, it is not necessarily a satisfying one. External coverage from TechRadar and PC Gamer further amplified Laurenzo's post and the larger wave of agreement from some power users.

Another viral post on X from developer Om Patel on April 7 made the same argument in even more direct terms, claiming that someone had "actually measured" how much "dumber" Claude had gotten and summarizing the result as a 67% drop. That post helped popularize the "AI shrinkflation" label and pushed the controversy beyond hard-core Claude Code users into the broader AI discourse on X. These claims have resonated because they map closely onto what many frustrated users say they are seeing in practice: more unfinished tasks, more backtracking, more token burn and a stronger sense that Claude is less willing to reason deeply through complicated coding jobs than it was earlier this year.

Benchmark posts turned anecdotal frustration into a public controversy

The loudest benchmark-based claim came from BridgeMind, which runs the BridgeBench hallucination benchmark.
On April 12, the account posted that Claude Opus 4.6 had fallen from 83.3% accuracy and a No. 2 ranking in an earlier result to 68.3% accuracy and No. 10 in a new retest, calling that proof that "Claude Opus 4.6 is nerfed." That post spread widely and became one of the main anchors for the broader public case that Anthropic had degraded the model. Other users also circulated benchmark-related or test-based posts suggesting that Opus 4.6 was underperforming versus Opus 4.5 in practical coding tasks. Still other posts pointed to TerminalBench-related results as supposed evidence that the model's behavior had changed in certain harnesses or product contexts. The effect was cumulative: benchmark screenshots, side-by-side tests and anecdotal frustration all began reinforcing one another in public.

That matters because benchmark claims tend to travel farther than more subjective complaints. A developer saying a model "feels worse" is one thing. A screenshot showing a ranking drop from No. 2 to No. 10, or a dramatic percentage swing in accuracy, gives the appearance of hard proof, even when the underlying comparison may be more complicated.

Critics of the benchmark claims say the evidence is weaker than it looks

The most important rebuttal to the BridgeBench claim did not come from Anthropic. It came from Paul Calcraft, an outside software and AI researcher on X, who argued that the viral comparison was misleading because the earlier Opus 4.6 result was based on only six tasks while the later one was based on 30. In his words, it was a "DIFFERENT BENCHMARK." He also said that on the six tasks the two runs shared in common, Claude's score moved only modestly, from 87.6% previously to 85.4% in the later run, and that the bigger swing appeared to come mostly from a single fabrication result without repeats. He characterized that as something that could easily fall within ordinary statistical noise.
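Calcraft's sample-size point is worth making concrete. With only six tasks, a single flipped result moves the headline accuracy by roughly 17 percentage points, and the statistical uncertainty of the estimate is enormous. The sketch below uses a generic normal-approximation standard error for a pass/fail benchmark; it is not BridgeBench's actual scoring methodology, just an illustration of how coarse small task counts are.

```python
# How noisy is a small pass/fail benchmark? Two generic measures:
# the granularity (how far one flipped task moves the score) and a
# normal-approximation standard error of the accuracy estimate.
# This is an illustration, not BridgeBench's actual methodology.

import math

def accuracy_granularity(n_tasks):
    """Smallest possible change in reported accuracy, in percentage points."""
    return 100.0 / n_tasks

def std_error(p, n_tasks):
    """Normal-approximation standard error of an accuracy estimate
    at true pass rate p, in percentage points."""
    return 100.0 * math.sqrt(p * (1 - p) / n_tasks)

for n in (6, 30):
    print(f"n={n:>2}: one task = {accuracy_granularity(n):5.1f} pts, "
          f"std error at p=0.8 ≈ {std_error(0.8, n):4.1f} pts")
```

At six tasks, the standard error alone is around 16 points at an 80 percent pass rate, so a swing of the size in the viral screenshot is well within what two small, differently scoped runs could produce by chance.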
That outside rebuttal matters because it undercuts one of the cleanest and most viral claims in circulation. It does not prove users are wrong to think something has changed. But it does suggest that at least some of the benchmark evidence now driving the story may be overstated, poorly normalized or not directly comparable. Even the BridgeBench post itself drew a community note to similar effect. The note said the two benchmark runs covered different scopes -- six tasks in one case and 30 in the other -- and that the common-task subset showed only a minor change. That does not make the later result meaningless, but it weakens the strongest version of the "BridgeBench proved it" argument.

This is now a key feature of the controversy: the claims are not all equally strong. Some are grounded in first-hand user experience. Some point to real product changes. Some rely on benchmark comparisons that may not be apples-to-apples. And some depend on inferences about hidden system behavior that users outside Anthropic cannot directly verify.

Earlier capacity limits gave users a reason to suspect more changes under the hood

The current backlash also lands in the shadow of a real, confirmed Anthropic policy change from late March. On March 26, Anthropic technical staffer Thariq Shihipar posted that, "To manage growing demand for Claude," the company was adjusting how 5-hour session limits work for Free, Pro and Max subscribers during peak hours, while keeping weekly limits unchanged. He added that during weekdays from 5 a.m. to 11 a.m. Pacific time, users would move through their 5-hour session limits faster than before. In follow-up posts, he said Anthropic had landed efficiency wins to offset some of the impact, but that roughly 7% of users would hit session limits they would not have hit before, particularly on Pro tiers.
In an email on March 27, 2026, Anthropic told VentureBeat that Team and Enterprise customers were not affected by those changes, and that the shift was not dynamically optimized per user but instead applied to the peak-hour window the company had publicly described. Anthropic also said it was continuing to invest in scaling capacity.

Those comments were about session limits, not model downgrades. But they are important context, because they establish two things that users now keep connecting in public: first, Anthropic has been dealing with surging demand; second, it has already changed how usage is rationed during busy periods. That does not prove Anthropic reduced model quality. It does help explain why so many users are primed to believe something else may also have changed.

Anthropic says user-facing changes, not secret degradation, explain much of the uproar

Anthropic-affiliated employees have publicly pushed back on the broadest accusations. In one widely circulated reply on X, Cherny responded to claims that Anthropic had secretly nerfed Claude Code by writing, "This is false." He said Claude Code had been defaulted to medium effort in response to user feedback that Claude was consuming too many tokens, and that the change had been disclosed both in the changelog and in a dialog shown to users when they opened Claude Code. That response is notable because it concedes a meaningful product change while rejecting the more conspiratorial interpretation of it. Anthropic is not saying nothing changed. It is saying that what changed was disclosed and was aimed at balancing token use, not secretly reducing model quality.

Public documentation also supports the fact that effort defaults have been in motion. Claude Code's changelog says that on April 7, Anthropic changed the default effort level from medium to high for API-key users as well as Bedrock, Vertex, Foundry, Team and Enterprise users.
That suggests Anthropic has actively been tuning these settings across different segments, which could plausibly affect user perceptions even if the core model weights are unchanged. Shihipar has also directly denied the broader demand-management accusation. In a reply on X posted April 11, he said Anthropic does not "degrade" its models to better serve demand. He also said that changes to thinking summaries affected how some users were measuring Claude's "thinking," and that the company had not found evidence backing the strongest qualitative claims now spreading online.

The real issue may be trust as much as model quality

What is clear is that a trust gap has opened between Anthropic and some of its most demanding users. For developers who rely on Claude Code all day, subtle shifts in visible thinking output, effort defaults, token burn, latency tradeoffs or usage caps can feel indistinguishable from a weaker model. That is true whether the root cause is a product setting, a UI change, an inference-policy tweak, capacity pressure or a genuine quality regression.

It also means both sides of the fight may be talking past each other. Users are describing what they experience: more friction, more failures and less confidence. Anthropic is responding in product terms: effort defaults, hidden thinking summaries, changelog disclosures, and denials that demand pressure is causing secret model degradation. Those are not necessarily incompatible descriptions. A model can feel worse to users even if the company believes it has not "nerfed" the underlying model in the way critics allege.

The timing is also awkward. Anthropic's chief rival OpenAI has recently pivoted to put more resources behind Codex, its competing enterprise- and vibe-coding-focused product, and has even introduced a new, more mid-range ChatGPT subscription in an effort to boost usage of the tool. This is not the kind of publicity that stands to benefit Anthropic or its customer retention.
At the same time, the public evidence remains mixed. Some of the most viral claims have come from developers with detailed logs and strong opinions based on repeated use. Some of the benchmark evidence has been challenged by outside observers on methodological grounds. And Anthropic's own recent changes to limits and settings ensure that this debate is happening against a backdrop of real adjustments, not pure rumor.
[4]
Anthropic faces user backlash over reported performance issues in its Claude AI chatbot | Fortune
Anthropic's popular Claude AI model has seen a significant decline in performance recently, according to many developers and heavy users, who say the model increasingly fails to follow instructions, opts for sometimes inappropriate shortcuts, and makes more mistakes on complex workflows. The complaints appear to be connected to recent changes Anthropic quietly made to the way Claude operates, reducing the model's default "effort" level in order to economize on the number of tokens, or units of data, the model processes in response to each request. The more tokens processed per task, the more computing power that task consumes. And there is widespread speculation that Anthropic, which has announced fewer multi-billion-dollar deals for data center capacity than some of its rivals, may be running short of computing resources after adoption of its products soared in the past few months.

User dissatisfaction with Claude's sudden performance decline and anger at Anthropic's perceived lack of transparency could potentially derail the company's runaway growth, just as the company is hoping to woo investors for a potential IPO. The claims that Anthropic has not been candid about the changes it has made to the way Claude operates, or about the way those changes may increase the cost of using Claude, are particularly threatening because Anthropic, more than any other AI company, has tried to build a brand reputation on being more transparent than its peers and more aligned with its users' interests.

Anthropic declined to answer Fortune's specific questions about Claude users' complaints on the record. Boris Cherny, the Anthropic executive who leads its Claude Code product, responded to user complaints online by saying that Anthropic had reduced the default "effort" Claude makes in answering user prompts to "medium" in response to user feedback that Claude was previously consuming too many tokens per task.
But many users complained that the company had not highlighted this change to users. The situation has caused a pile-on of speculation and allegations -- including from some of its competitors -- that the company is purposely degrading performance due to a lack of compute capacity. Across the industry, AI companies are facing rising GPU costs, constrained data center expansion, and difficult trade-offs over which products to prioritize as demand for "agentic" AI systems accelerates faster than infrastructure can scale.

While an Anthropic spokesperson has said publicly that the AI lab does not degrade its models to better serve demand, there are reasons to believe the company is facing more acute constraints than some rivals. Anthropic suffered a series of recent outages as usage has increased and has introduced stricter usage limits during peak hours, drawing complaints from some users. In an internal memo reported by CNBC, OpenAI's revenue chief also claimed that Anthropic had made a "strategic misstep" by not securing enough compute capacity, and was "operating on a meaningfully smaller curve" than competitors. (Anthropic declined to answer CNBC's questions about these claims.)

Meanwhile, Anthropic also announced last week that it had trained a new, yet-to-be-released model called Mythos that is significantly more capable than its Opus AI model -- but also larger and more expensive to run, meaning it likely consumes more computing capacity than prior models. Anthropic stressed that it's not releasing the model to the general public yet because of security concerns, but some have questioned whether Anthropic lacks sufficient compute capacity to support a broad Mythos rollout.

The scrutiny on Anthropic underscores the fast-changing nature of the AI market and the stakes involved. Just last week, Anthropic stunned the industry by announcing that its annualized recurring revenue, or ARR, is now $30 billion, up from $9 billion at the end of 2025.
OpenAI said last month that it is generating $2 billion a month in revenue, or $24 billion a year, although the two companies do not report revenues in exactly the same way, making direct comparisons problematic. Anthropic has recently benefited from a flood of new users, first due to the popularity of its AI coding tool, Claude Code, and later from a wave of consumer support that followed its feud with the U.S. Department of Defense. Many users switched to Claude from rivals such as OpenAI's ChatGPT after the Trump administration designated Anthropic a "supply chain risk." Anthropic had said the dispute stemmed from its insistence that the U.S. government agree in its contract not to use the company's technology in lethal autonomous weapons or for the mass surveillance of American citizens.

Over the last few years, Anthropic has gained significant ground in the AI race, emerging as a leader in enterprise AI and building up significant goodwill among developers and enterprise users. But if the anger around Claude's performance issues persists, it risks eroding some of that goodwill and could lead the company to stumble at a critical moment.

In response to some of the controversy around Claude's recent performance issues, Cherny, the Claude Code head, said that Claude Opus 4.6 -- Anthropic's flagship model -- had introduced "adaptive thinking" in early February, which allows the model to decide how much reasoning to apply to a given task rather than using a fixed budget. In early March, Anthropic also shifted the default setting down to a "medium effort" level, Cherny said. While Claude Code users can manually change the tool's effort levels, users who pay for the Pro versions of Cowork or the desktop version of Claude are not able to change the default at this time.
To resolve some of the user issues, Cherny said the company will test "defaulting Teams and Enterprise users to high effort, to benefit from extended thinking even if it comes at the cost of additional tokens & latency" going forward. He also pushed back on speculation that the model had been purposely watered down and on complaints from users that the change was rolled out with a lack of transparency, claiming the changes were made in response to user feedback and were flagged to users via a pop-up within the Claude Code interface. Most of the user complaints center on Claude Code, Anthropic's AI-powered coding tool, which has become one of the company's most popular and fastest-growing products. Launched in early 2025, Claude Code operates as a command-line agent that can read, write, and execute code autonomously within a developer's environment. Since its debut, it has been widely adopted by individual developers and large enterprise engineering teams who rely on it for complex, multi-step coding tasks. The recent changes in the performance of Claude Code gained widespread attention on social media thanks to a GitHub analysis that appears to be from Stella Laurenzo, a senior director of AI at AMD. In a widely-shared analysis, Laurenzo said the changes had made Claude "unusable for complex engineering tasks." In her analysis, she found that from late February into early March, Claude moved from a "research-first" approach -- reading multiple files and gathering context before making changes -- to a more direct "edit-first" style. The model reads less context before acting, makes more mistakes, and requires significantly more user intervention, according to the analysis. The analysis also points to a rise in behaviors like stopping too early, avoiding responsibility, or asking unnecessary permission, which it links to a reduction in "thinking" depth over the same period. 
"Claude has regressed to the point [that] it cannot be trusted to perform complex engineering," she wrote. In a comment responding to the analysis, Anthropic's Cherny says the analysis is likely misreading at least part of the data, claiming that the model's reasoning hasn't been reduced but that Anthropic had made a change so that the full "reasoning trace" of the model is no longer visible to the user. But Laurenzo is far from the only person having issues with the tool. "I've had incredibly frustrating sessions with Claude Code the past two weeks," Dimitris Papailiopoulos, a principal research manager at Microsoft, wrote on X. "I set effort to max, yet it's extremely sloppy, ignores instructions, and repeats mistakes."
Anthropic's Claude AI is experiencing a surge in user complaints about declining quality and performance. Developers report that Claude Opus 4.6 and Claude Code are making more errors, abandoning tasks mid-stream, and burning through usage quotas faster than before. The backlash intensified after AMD's AI director published detailed analysis showing reduced reasoning capabilities, while Anthropic attributes changes to product adjustments rather than model degradation.
Anthropic is confronting a mounting crisis of confidence as developers and enterprise users increasingly report that Claude AI has experienced significant performance degradation in recent months. The complaints, which have proliferated across GitHub, Reddit, and social media platforms, center on claims that Claude Opus 4.6 and Claude Code are delivering less reliable results, making more mistakes on complex workflows, and consuming usage quotas at unprecedented rates [1][3]. The growing customer dissatisfaction comes at a critical moment for Anthropic, which recently announced that its annualized recurring revenue has surged to $30 billion, up from $9 billion at the end of 2025 [4].
Source: The Register
The situation escalated following a major outage on Monday that affected Claude.ai and Claude Code from 15:31 to 16:19 UTC, with elevated error rates compounding existing frustrations [1]. When The Register asked Claude AI itself to analyze quality complaints in the Claude Code GitHub repository since January 2026, the model concluded that "quality complaints have escalated sharply," with April already recording 20+ quality issues in just 13 days, putting it on pace to exceed March's 18 complaints, itself a 3.5× jump over the January-February baseline [1].

One of the most detailed public complaints came from Stella Laurenzo, whose LinkedIn profile identifies her as a Senior Director in AMD's AI group. In an April 2 GitHub issue, Laurenzo presented a comprehensive analysis of 6,852 Claude Code session files, 17,871 thinking blocks, and 234,760 tool calls, arguing that Claude Code had regressed to the point where it could not be trusted for complex engineering work [3]. Her analysis documented a sharp decline in estimated reasoning depth starting in February, accompanied by increased premature stopping, more "simplest fix" behavior, more reasoning loops, and a measurable shift from research-first to edit-first behavior [3].

The post quickly went viral after being amplified on social media, giving the "Claude is getting worse" narrative concrete data from a senior AI leader at a major chip company rather than just anecdotal frustration [3]. Some users have framed the issue as "AI shrinkflation," the idea that customers are paying the same price for a weaker product, while others have suggested Anthropic may be engaging in nerfing, or deliberately throttling performance during periods of heavy demand [3].
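The kind of per-month tally described above can be approximated without an LLM by querying GitHub's public issue-search API directly. The sketch below is illustrative only: the repository path, query terms, and date buckets are assumptions, not the exact method The Register or Claude used.

```python
# Hypothetical sketch: count open issues mentioning "quality" in a repo,
# bucketed by creation month, via GitHub's REST search API.
import json
import urllib.parse
import urllib.request

REPO = "anthropics/claude-code"  # assumed repository path


def build_search_url(repo: str, month_start: str, month_end: str) -> str:
    """Build a GitHub search-API URL for open issues mentioning 'quality'
    created within the given date range (dates are YYYY-MM-DD)."""
    query = (
        f"repo:{repo} quality in:title,body "
        f"is:issue is:open created:{month_start}..{month_end}"
    )
    return "https://api.github.com/search/issues?q=" + urllib.parse.quote(query)


def count_quality_issues(repo: str, month_start: str, month_end: str) -> int:
    """Fetch the total hit count for the query (requires network access)."""
    req = urllib.request.Request(
        build_search_url(repo, month_start, month_end),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The search API reports the overall match count in total_count.
        return json.load(resp)["total_count"]


# Example usage (network call, so not executed here):
#   for start, end in [("2026-01-01", "2026-01-31"),
#                      ("2026-02-01", "2026-02-28"),
#                      ("2026-03-01", "2026-03-31")]:
#       print(start[:7], count_quality_issues(REPO, start, end))
```

Note the caveat from the article: raw counts from such a query inherit whatever noise is in the issue tracker, including AI-generated reports and auto-closed issues.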
Source: VentureBeat
Boris Cherny, Claude Code's lead at Anthropic, responded to the complaints by disputing the conclusion that the underlying model has degraded. In a pinned GitHub response, Cherny explained that recent changes were product and interface choices rather than secret downgrades [3]. He identified two key changes: Claude Opus 4.6's move to adaptive thinking by default on February 9, and a March 3 shift to medium effort (effort level 85) as the default, which Anthropic viewed as the best balance across intelligence, latency, and cost for most users [3].

Cherny emphasized that users wanting more extended reasoning can manually switch to higher effort levels by typing in Claude Code terminal sessions [3]. However, many users have complained that Anthropic did not adequately highlight these changes, undermining the company's reputation for transparency, a cornerstone of its brand identity [4].
Adding to the controversy, Anthropic reduced the prompt cache TTL (time to live) for Claude Code from one hour to five minutes for many requests around March 7, following a brief period in which it had been extended to one hour [2]. User Sean Swanson documented this change in a detailed bug report, arguing that "the 5m TTL is disproportionately punishing for the long-session, high-context use case that defines Claude Code usage" [2].

While Jarred Sumner from Anthropic claimed the change made Claude Code cheaper for one-shot calls, where cached context is used once and not revisited, Swanson reported that as a $200-per-month subscriber of more than six months, he had never hit a quota limit until March [2]. The situation has deteriorated to the point where Pro users paying $20 per month may get as few as two prompts in five hours [2]. The large one-million-token context window available on paid plans with the Claude Opus 4.6 or Sonnet 4.6 models further increases costs, especially on cache misses [2].
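A back-of-the-envelope sketch illustrates why a shorter cache TTL hits long interactive sessions hardest. All prices, token counts, and the billing model below are hypothetical, chosen only to show the mechanism, not Anthropic's actual rates or accounting.

```python
# Hypothetical sketch: how a shorter prompt-cache TTL can inflate costs
# for a long-running, high-context session. All figures are assumptions.

CONTEXT_TOKENS = 400_000          # large cached context carried by the session (assumed)
PRICE_INPUT = 5.00 / 1_000_000    # $/token for a full, uncached read (assumed)
PRICE_CACHED = 0.50 / 1_000_000   # $/token for a cache hit (assumed)


def session_cost(gap_minutes: float, turns: int, ttl_minutes: float) -> float:
    """Cost of `turns` requests spaced `gap_minutes` apart.

    If the pause between requests exceeds the cache TTL, the cached
    context expires and the next request re-reads it at full price.
    """
    cost = CONTEXT_TOKENS * PRICE_INPUT  # first request always pays full price
    for _ in range(turns - 1):
        rate = PRICE_CACHED if gap_minutes <= ttl_minutes else PRICE_INPUT
        cost += CONTEXT_TOKENS * rate
    return cost


# A developer who pauses ~10 minutes between prompts misses a 5-minute
# TTL on every turn but stays comfortably inside a 60-minute TTL.
short_ttl = session_cost(gap_minutes=10, turns=20, ttl_minutes=5)   # $40.00
long_ttl = session_cost(gap_minutes=10, turns=20, ttl_minutes=60)   # $5.80
print(f"5m TTL:  ${short_ttl:.2f}")
print(f"60m TTL: ${long_ttl:.2f}")
```

Under these assumed numbers the same session costs roughly 7× more with the shorter TTL, which is consistent with Swanson's complaint that a 5-minute TTL disproportionately punishes long, high-context sessions.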
The user backlash has fueled widespread speculation that Anthropic may be experiencing compute capacity shortages as adoption of its products has soared. The company has had to take steps to reduce usage during peak hours to balance capacity and demand, and has introduced stricter usage limits [1][4]. In an internal memo reported by CNBC, OpenAI's revenue chief claimed that Anthropic had made a "strategic misstep" by not securing enough compute capacity and was "operating on a meaningfully smaller curve" than competitors [4].

Anthropic also announced last week that it had trained a new model called Mythos that is significantly more capable than Opus but also larger and more expensive to run, raising questions about whether the company lacks sufficient compute capacity to support a broad Mythos rollout. While an Anthropic spokesperson has publicly denied that the company degrades models to manage demand, the combination of increased errors, usage quota depletion, and AI model quality complaints has created a credibility challenge for a company that has built its reputation on transparency and user alignment [4].

Despite the complaints, data from Margin Lab suggests that Claude Opus 4.6 has maintained its score on the SWE-Bench-Pro test, with assessments conducted since February showing some variation but no substantive change [1]. This discrepancy between benchmark performance and user experience highlights the gap between what standardized tests measure and what developers encounter in real-world, complex engineering workflows. The situation poses significant risks for Anthropic as it potentially prepares for an IPO, with persistent anger around Claude performance issues threatening to derail the company's momentum just as it has emerged as a leader in enterprise AI [4].

Summarized by Navi