4 Sources
[1]
Anthropic Says Its Newest AI Model Is Getting Pretty Good at Using a Computer
The best Claude AI model you can get without paying for a subscription is getting a significant upgrade, Anthropic said Tuesday. The company released Claude Sonnet 4.6, a new version of its midrange model that it said can code about as well as a previous version of the bigger Opus. One particular improvement Anthropic highlighted about Sonnet 4.6 is its ability to use a computer the way you might, filling out forms and switching between browser tabs. In the OSWorld benchmark, which evaluates how well an AI can use an operating system, Sonnet 4.6 has shown it can operate a computer at a human baseline level, Anthropic said. That means it doesn't necessarily need specific software connectors or tools to do things like follow a spreadsheet or browse the internet. As AI models become more capable of doing things on our behalf rather than just giving us answers, the security risks increase. A big hazard is called prompt injection: Think of it as a website hiding a command somewhere that humans won't notice, but an AI will. (It's one of the major risks dogging the viral AI agent OpenClaw.) Anthropic said in its tests, Sonnet 4.6 showed significant improvement compared to Sonnet 4.5 in resisting prompt injection attacks. It was similar to Opus 4.6, released two weeks ago and only available for paid subscribers. As a coding model, Sonnet 4.6 can better follow detailed instructions, Anthropic said. The company is beta testing a context window of 1 million tokens for the model, which means you can give the AI massive amounts of information in a single request. Claude has seen a surge in popularity in recent months, with the Claude Code app experiencing a viral moment over the holidays as people discovered its vibe coding capabilities.
Anthropic launched a Super Bowl ad campaign attacking rival OpenAI for its decision to put ads in its free and low-cost ChatGPT plans. At the same time, OpenAI's own Codex tool and latest model, GPT-5.3-codex, has emerged in recent weeks as a capable rival of Claude Code.
[2]
Claude Sonnet 4.6 delivers frontier-level AI for free and cheap-seat users
This new Sonnet 4.6 model, available now, shows improved coding performance, better computer use skills, upgraded long-context reasoning, better agent planning, and improvements to knowledge work and design. As with Opus 4.6, Sonnet 4.6 now includes a 1 million-token context window (in beta). This allows for much longer and more complex work sessions without requiring a session reset or compaction. Sonnet 4.6 is now the default model for free and Pro tier users across the various Claude interfaces. Pricing for those plans (as well as for Sonnet API use) has not increased. Anthropic provides two branded AI models at different price points, Sonnet and Opus. Opus has always been the Cadillac of AI models, available at higher tiers and increased per-token API call pricing. Sonnet has been more of an entry-level model, still quite capable, but with substantially lower resource usage, enabling Anthropic to deploy it to free users and keep its token price down. According to the company's blog post announcing the release of Sonnet 4.6, "It approaches Opus-level intelligence at a price point that makes it more practical for far more tasks." According to the company's testing, performance that previously would have only been seen in an Opus-class model is now available for users of Sonnet 4.6. This new model also shows major improvements in AI-based desktop computer interaction. There are some practical limits, however. The company says, "The model certainly still lags behind the most skilled humans at using computers. But the rate of progress is remarkable nonetheless. It means that computer use is much more useful for a range of work tasks, and that substantially more capable models are within reach."
In early user testing, Anthropic found that developers preferred Sonnet 4.6 over Sonnet 4.5 about 70% of the time. The company says, "Users reported that it more effectively read the context before modifying code and consolidated shared logic rather than duplicating it. This made it less frustrating to use over long sessions than earlier models." I am curious about that remaining 30% though. You'd think with an apples-to-apples upgrade like Sonnet 4.5 to 4.6 that nearly all users would prefer the newer model. I've asked Anthropic why the remaining 30% presumably didn't favor the new release. Stay tuned. If I learn anything, I'll share it here. When comparing Sonnet 4.6 to Opus 4.5 (the older frontier model released in November), developers preferred Sonnet 4.6 roughly 60% of the time. The company reported that early users, "Rated Sonnet 4.6 as significantly less prone to overengineering and laziness, and meaningfully better at instruction following. [Early users] reported fewer false claims of success, fewer hallucinations, and more consistent follow-through on multi-step tasks." Given that the current general-availability version of Opus is 4.6, this result isn't a harbinger of a mass migration off of the Opus model by higher-tier users. But what it does say is that the "cheap seats" model has improved enough to be up to tasks previously reserved for higher-performing models. Let's not underestimate the benefits of the higher performance, yet lower resource usage, that Sonnet 4.6 shows. When using the free and Pro tiers, Anthropic will throttle usage based on token use and resource usage. Sonnet 4.6's improvements are akin to a car getting more miles per gallon when using a new gasoline, especially if the "pickup 'n go" is still as good or better.
The four-times-larger 1-million-token window also provides a practical benefit. It can hold entire codebases, lengthy contracts, or dozens of research papers. Anthropic says, "More importantly, Sonnet 4.6 reasons effectively across all that context. This can make it much better at long-horizon planning." Don't give up on Opus, however. Opus 4.6 is still Anthropic's frontier model champion. The company says, "We find that Opus 4.6 remains the strongest option for tasks that demand the deepest reasoning, such as codebase refactoring, coordinating multiple agents in a workflow, and problems where getting it just right is paramount." Anthropic is positioning Sonnet 4.6 as a practical daily driver. In many cases, it's considerably faster than Opus 4.6. In that way, there are clear competitive parallels between OpenAI's GPT-5.3-Codex-Spark and its GPT-5.3-Codex, with Spark the faster and less accurate version and the full Codex the frontier model leading development. One big difference is that while Anthropic says Sonnet 4.6 is faster, it's not making anything like the 15x performance claim that OpenAI made of its Spark model. (Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.) For most coding and knowledge work, Sonnet 4.6 offers strong performance, particularly for those on the lower pricing tiers. It also offers a solid price/performance profile for users working with API calls who want to get as much bang for the buck as possible. Meanwhile, Opus 4.6 remains a viable escalation path for more complex problems needing deeper reasoning. What about you?
Have you tried Claude Sonnet 4.6 yet? If so, how does it compare to Opus in your real-world workflows? Does the 1-million-token context window change how you approach coding, research, or long planning sessions? Are you comfortable relying on the "cheap seats" model for serious work, or do you still escalate to Opus for high-stakes tasks? And if you're on the free or Pro tier, do these improvements make you more likely to stick with Sonnet as your daily driver? Let us know in the comments below.
[3]
Claude just upgraded its AI -- and it can now process entire projects at once
Claude Sonnet 4.6 brings near-flagship AI power to everyday work Just as we're all discovering the advanced performance of Claude Opus 4.6, Anthropic is unveiling Claude Sonnet 4.6 -- and it's a bigger upgrade than the version number suggests. The company says this is its most capable Sonnet model yet, bringing major improvements across coding, reasoning, computer use and design -- all while keeping the same pricing as Sonnet 4.5. More importantly, Sonnet 4.6 closes the gap between mid-tier and flagship AI models. Tasks that previously required an Opus-class model can now run on Sonnet -- at a fraction of the cost. Here's what you need to know. Claude Sonnet 4.6 is now the default model for Free and Pro users in Claude.ai and Claude Cowork. Early developer testing shows strong preference for the new model, with users favoring Sonnet 4.6 over its predecessor roughly 70% of the time. In some workflows, testers even preferred it to Claude Opus 4.5 thanks to better instruction-following and fewer hallucinations. One of the biggest upgrades is a 1M-token context window (currently in beta). That's enough to analyze: But size isn't the whole story. Sonnet 4.6 is designed to reason across that context, enabling better long-horizon planning and multi-step problem solving all while prioritizing safety. In one evaluation simulating business operations, the model invested aggressively early, then pivoted toward profitability -- a strategic shift that helped it outperform competitors. Anthropic continues pushing toward AI that can operate software the way humans do. Instead of relying on APIs, Claude can: Benchmarks from OSWorld -- which tests AI using real software like Chrome, LibreOffice and VS Code -- show steady improvement, with early users reporting human-level performance on complex workflows. This matters because most business software wasn't built for automation. A model that can use tools the way people do could dramatically expand what AI can actually accomplish. 
That said, the company acknowledges the technology still trails expert human users, but progress is accelerating. As AI gains the ability to operate computers, security risks rise. One of the biggest threats is prompt injection, where malicious instructions are hidden inside websites or documents. Anthropic says Sonnet 4.6 shows a major improvement in resisting these attacks compared to Sonnet 4.5, performing similarly to its latest Opus model in safety evaluations. Sonnet 4.6 introduces several platform improvements: For most real-world productivity work, Sonnet 4.6 now offers near-flagship performance at a significantly lower cost. Anthropic says Opus 4.6 remains the best option for complex codebase refactoring, multi-agent coordination and high-precision reasoning tasks. Claude Sonnet 4.6 is available now across Claude.ai, Claude Cowork, Claude Code, the Claude API and major cloud platforms. The free tier has also been upgraded and now includes file creation, connectors and context compaction. With massive context handling and improved computer-use capabilities, this release pushes AI closer to being a true digital coworker rather than a typical chatbot. Check back here for real-world tests to see what this new model can do.
[4]
Anthropic debuts Sonnet 4.6, a highly capable creative and coding AI model - SiliconANGLE
Anthropic debuts Sonnet 4.6, a highly capable creative and coding AI model Anthropic PBC upgraded the next update to its Claude Sonnet model today to version 4.6, designed for business and creative professional capabilities, bringing increased skills in computer use, long-context reasoning, agent planning, knowledge work and design. The company said it brings the coveted 1 million token context window, recently introduced in the company's flagship Opus 4.6 model, to Sonnet in beta test mode. For users on Free and Pro plans, Sonnet 4.6 is now the default, and pricing remains the same at $3/$15 per million input and output tokens, respectively. According to Anthropic, the performance of the new model now approaches previous Opus-level capabilities, showing major improvement in computer use skills compared to prior Sonnet models. Sonnet sits in the mid-level of performance-to-cost optimization. Anthropic is particularly focused on the model's capability to automate computer user interfaces. In October 2024, it introduced a general-purpose, computer-user model. Since then, the company has developed functional capabilities that have become a built-in model capable of taking control of Chrome, working with LibreOffice, VS Code and more. The company said that since the first computer-use demonstration, Sonnet models have made steady gains. Users have seen human-level capabilities in tasks such as navigating complex spreadsheets or filling out multistep web forms, before pulling information together across multiple browser tabs. However, Anthropic said the model still lags behind most skilled human reasoning at using computers. Still, given the rate of progress, it is remarkable to note that computer use is completing a range of work that puts a significant number of human-capable tasks within reach. Anthropic isn't the only company reaching for this particular constellation of capabilities. Google LLC is also baking in computer and browser use into its Gemini model. 
The company introduced computer use in Gemini 2.5 and OpenAI Group PBC developed a similar paradigm with advanced multistep browser agents, although not quite general computer-level use. In the meantime, the company released Claude Cowork: a MacOS desktop app (with a Windows version coming soon) that allows its AI to read and interact with files on users' computers. It can act as a proactive teammate, capable of controlling the mouse, keyboard and browser to execute multi-step activities such as organizing files, editing documents and browsing the web. Anthropic noted that with full control of a computer, safety can become a major concern. Risks of hijack, prompt injection and other concerns become paramount. The company said that it has been working to improve resistance to hallucination and external manipulation. According to internal safety evaluations, Sonnet 4.6 saw major improvements compared to its predecessor and performed similarly to Opus 4.6. Developers are getting a huge boost from the larger 1 million token context window. Early testers of Claude Code reported that Sonnet 4.6 is capable of reading context before modifying code, consolidates logic instead of duplicating it and avoids overengineering and "laziness" that earlier models suffer from. With 1 million tokens, Sonnet 4.6 is capable of ingesting entire codebases, even extremely large ones, by seeing the entirety of extremely large horizons at once in order to understand full scopes of dependencies at once. This allows it to follow flow paths at longer depths at once. For business use, this has equally useful implications because it can hold lengthy contracts or dozens of research papers in memory at once and reference them as it does work and reasons through them. Within the application programming interface, Sonnet 4.6 now supports both adaptive and extended thinking, as well as compaction in beta. 
That allows users to quickly select optimized features for cost-to-performance and continual execution, even when the context fills up. Context compaction happens when the context window gets too full and the model needs to summarize the conversation to save space so that it can continue to converse without dropping off the oldest information (therefore "forgetting" the oldest knowledge). Also in the API, Claude's web search and fetch now automatically writes and executes code to filter search results. Code execution, web fetch, memory and programmatic tool calling are also now generally available. That makes it much more useful for application programming in production. Model Context Protocol support for Claude in Excel is available for all users with Pro subscriptions and above, providing support for spreadsheet users.
Anthropic launched Claude Sonnet 4.6, bringing significant upgrades to its free and mid-tier AI model. The upgraded model now features a 1 million token context window in beta, coding performance that rivals a previous Opus version, and computer use at a human baseline level on the OSWorld benchmark. Developers preferred Sonnet 4.6 over its predecessor about 70% of the time in early testing.
Anthropic released Claude Sonnet 4.6 on Tuesday, delivering what the company calls its most capable Sonnet model yet [1][3]. The upgraded model narrows the performance gap between mid-tier and flagship models while keeping the same pricing as its predecessor: $3 per million input tokens and $15 per million output tokens [4]. Claude Sonnet 4.6 is now the default model for Free and Pro tier users in Claude.ai and Claude Cowork [2][3].
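To make the quoted rates concrete, the following is a minimal sketch of the arithmetic, assuming a flat $3 per million input tokens and $15 per million output tokens as reported above. The function name and the example token counts are hypothetical, and the sketch ignores any tiered, cached, or batch pricing, none of which the sources describe.

```python
# Rates as quoted in the article (USD per one million tokens).
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the flat quoted rates.

    ASSUMPTION: a single flat rate applies to every token; real billing
    may differ (for example, discounts or surcharges not covered here).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 200,000 input tokens and 4,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 4_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, a request with a long prompt and a short answer is dominated by the input side of the bill.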
The release comes just two weeks after Anthropic unveiled Claude Opus 4.6, its premium flagship model available only to paid subscribers [1]. Tasks that previously required an Opus-class model can now run on Sonnet at a fraction of the cost, making advanced AI capabilities accessible to a broader user base [3].
One of the most significant upgrades is the 1 million token context window, currently in beta [1]. The expanded window lets Claude Sonnet 4.6 hold entire codebases, lengthy contracts, or dozens of research papers in a single session, allowing longer work sessions without a reset or compaction [2][4]. With a whole codebase in view, the model can follow dependencies and execution paths to greater depth [4].
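Whether a given set of documents fits in a 1 million token window can be roughed out ahead of time. The sketch below assumes about four characters per token, a common rule of thumb for English text and not the model's actual tokenizer; the function names and the `reserve_for_output` parameter are hypothetical.

```python
from collections.abc import Iterable

CONTEXT_WINDOW_TOKENS = 1_000_000  # beta window size reported in the article
CHARS_PER_TOKEN = 4  # ASSUMPTION: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(texts: Iterable[str],
                    reserve_for_output: int = 0) -> tuple[int, bool]:
    """Return (estimated input tokens, whether they fit in the window).

    reserve_for_output is a hypothetical budget held back for the reply.
    """
    total = sum(estimate_tokens(t) for t in texts)
    return total, total + reserve_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Hypothetical inputs: three documents of 400,000 characters each.
    documents = ["x" * 400_000] * 3
    tokens, ok = fits_in_context(documents, reserve_for_output=8_000)
    print(tokens, ok)
```

An estimate like this is only a first filter; source code and non-English text often tokenize less efficiently than the heuristic assumes, so a margin is advisable.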
For developers working on complex projects, this means the model can keep track of far more information at once while delivering improved coding performance. Early testers reported that Sonnet 4.6 more effectively reads context before modifying code and consolidates shared logic rather than duplicating it, making it less frustrating during extended coding sessions [2].
Anthropic highlighted major improvements in computer use, saying Claude Sonnet 4.6 operates a computer at a human baseline level on the OSWorld benchmark [1]. The model can navigate complex spreadsheets, fill out multistep web forms, and pull information together across multiple browser tabs without requiring specific software connectors or tools [1][4].
This capability represents a shift toward AI that operates software the way humans do, working with applications like Chrome, LibreOffice, and VS Code through their standard interfaces [3]. Anthropic acknowledged that the model still lags behind the most skilled human users, but said the rate of progress makes computer use much more useful for a range of work tasks [2]. Google and OpenAI are also building computer and browser use capabilities into their models [4].
In early user testing, developers preferred Claude Sonnet 4.6 over Sonnet 4.5 approximately 70% of the time [2]. Compared with Claude Opus 4.5, the older flagship model released in November, developers chose Sonnet 4.6 roughly 60% of the time [2]. Users reported fewer hallucinations, fewer false claims of success, less overengineering, and more consistent follow-through on multi-step tasks [2].
Anthropic said Sonnet 4.6 can better follow detailed instructions when coding [1] and described it as approaching Opus-level intelligence at a price point that makes it practical for far more tasks [2].
As AI models gain the ability to operate computers on a user's behalf, security risks escalate. A major hazard is prompt injection, in which malicious instructions are hidden in websites or documents where humans won't notice them but an AI will [1][3]. Anthropic reported that Claude Sonnet 4.6 showed significant improvement in resisting prompt injection attacks compared to Sonnet 4.5, performing similarly to Claude Opus 4.6 in internal safety evaluations [1][4].
The company said it has been working to improve resistance to hallucination and external manipulation as computer control capabilities expand [4]. This security enhancement matters particularly for business users deploying AI in production environments where reliability and safety are paramount.
The API for Claude Sonnet 4.6 now supports adaptive and extended thinking, along with context compaction in beta [4]. Context compaction summarizes the conversation when the context window fills up, so the model can keep working without dropping the oldest information [4]. Web search and fetch now automatically write and execute code to filter search results, while code execution, web fetch, memory, and programmatic tool calling have reached general availability [4].
Model Context Protocol support for Claude in Excel is now available for Pro subscribers and above [4]. The free tier has also been upgraded to include file creation, connectors, and context compaction [3].
The release comes as OpenAI's Codex tool and its latest model, GPT-5.3-codex, have emerged as capable rivals to Claude Code [1]. Anthropic has also run a Super Bowl ad campaign attacking OpenAI's decision to put ads in its free and low-cost ChatGPT plans [1]. Claude has surged in popularity in recent months, with the Claude Code app going viral over the holidays as users discovered its vibe coding capabilities [1].
Claude Opus 4.6 remains the strongest option for tasks demanding the deepest reasoning, such as codebase refactoring and multi-agent coordination, while Sonnet 4.6 serves as a practical daily driver that is in many cases considerably faster [2]. With large-context handling and improved computer use, the release pushes AI closer to being a true digital coworker rather than a typical chatbot [3].
Summarized by Navi