Debate over Claude Opus 4.7: Concerns over Performance and ‘AI Shrinkflation’

Anthropic recently released Claude Opus 4.7, promising that it would outperform its predecessor at complex reasoning and nuanced analysis. The company highlighted the model’s ability to manage “complex, long-running tasks with rigor and consistency,” including self-verification mechanisms.

However, user feedback has raised significant concerns about the model’s actual performance. In one prominent example, a Reddit user asked Claude Opus 4.7 to work through a detailed proof and reported that the model contradicted its own answers five times, with remarks such as “oh wait, that doesn’t work, let me try again.”

Users expressed frustration that this kind of second-guessing should not be necessary in a top-tier model for which they pay a monthly fee, and suggested that the model failed to deliver the coherence they had come to expect from earlier versions. Another user, Upali R., who used the model to develop a personal MicroSaaS productivity application, noted that it was less effective at complex, prolonged development tasks. He described the experience as a “lesson in the ceiling,” indicating that the model quickly went off course.

The general sentiment among users is that Anthropic has pared back the model’s functionality. Many believe the model has become more cautious and less intelligent, possibly because of safeguards implemented in the name of safety or alignment. This has led some commentators to label the trend “AI shrinkflation,” suggesting that later iterations deliver less than their predecessors.

Analysts suggest that the perceived decline may stem from Anthropic limiting the amount of “effort,” or reasoning tokens, available for each query, a restriction that users have begun to notice. Experts also point to high user expectations and to context rot, the tendency of output quality to degrade as the context grows longer, as contributing factors.

These developments have prompted comparisons with competitors. While the Claude family, including Opus, maintains a lead in real-world usage, some have voiced distrust of Anthropic’s future models, such as Mythos. Meanwhile, OpenAI has been capitalizing on the situation by releasing updates to its Codex app that enhance developer workflows with features such as reviewing pull requests and connecting to remote devboxes via SSH.