20 Sources
[1]
Anthropic launches Claude Sonnet 4.5, its best AI model for coding | TechCrunch
On Monday, Anthropic launched a new frontier model called Claude Sonnet 4.5, which it claims delivers state-of-the-art performance on coding benchmarks. The company says Claude Sonnet 4.5 is capable of building "production-ready" applications, a leap in reliability from previous AI models. Claude Sonnet 4.5 will be available via the Claude API and in the Claude.ai chatbot. The pricing for developers is the same as Claude Sonnet 4: $3 per million input tokens (roughly 750,000 words, or more than the entire Lord of the Rings series) and $15 per million output tokens. In the last year, Anthropic's AI models have emerged as a favorite among developers and enterprises, in large part due to their strong performance on software engineering tasks. Apple and Meta reportedly use Claude AI models internally, and Anthropic has made a significant business selling API access to AI coding applications such as Cursor, Windsurf, and Replit. Recently, OpenAI's GPT-5 has challenged Anthropic's dominance in the space, outperforming Claude models on a variety of coding benchmarks. Anthropic says Claude Sonnet 4.5 offers industry-leading performance on several coding benchmarks, including SWE-Bench Verified. However, Anthropic AI researcher David Hershey tells TechCrunch that it is hard to capture Claude Sonnet 4.5's performance on benchmarks alone. Hershey says he's seen Claude Sonnet 4.5 code autonomously for up to 30 hours during early trials with some enterprise customers. In that time, he watched the AI model not only build an application, but stand up database services, purchase domain names, and perform a SOC 2 audit to make sure the product was secure. In a statement shared with TechCrunch, Cursor CEO Michael Truell said Claude Sonnet 4.5 represents state-of-the-art coding performance, specifically on longer horizon tasks. Windsurf CEO Jeff Wang said in a statement that Claude Sonnet 4.5 represents a "new generation of coding models."
Anthropic also claims that Claude Sonnet 4.5 is its most aligned frontier AI model yet, with lower rates of sycophancy and deception than previous models. The company says it has also reduced Claude's susceptibility to prompt injection attacks. Alongside the launch of Claude Sonnet 4.5, Anthropic is also launching the Claude Agent SDK. The company says this is the same infrastructure that powers Claude Code, and can be used to help developers build their own agents. Anthropic is also releasing a temporary research preview called "Imagine with Claude" for Max subscribers, which shows the AI model generating software on the fly. The company says the model will respond to user requests in real time, with no predetermined functionality or prewritten code.
[2]
Anthropic's New Claude Sonnet 4.5 AI Model Promises to Be a Coding Beast
Imad is a senior reporter covering Google and internet culture. Hailing from Texas, Imad started his journalism career in 2013 and has amassed bylines with The New York Times, The Washington Post, ESPN, Tom's Guide and Wired, among others. Claude Sonnet 4.5 is out today and brings major coding improvements, including checkpoints, code execution, file creation and a refreshed terminal to the AI model, Anthropic said in a press release on Monday. Claude Code gains a much-requested feature with the addition of checkpoints, allowing coders to save their progress or roll back to a previous state. Claude can now execute code and create files, such as spreadsheets, slides and documents. On the agent side, the Claude API lets agents run longer and handle more complex tasks. And with the Claude Agent SDK, developers can make their own AI agents that can better manage memory, handle permissions and work with subagents to solve tasks. Claude Sonnet 4.5 is the "most aligned frontier model we've ever released," according to Anthropic. This means that Sonnet 4.5 has seen major improvements in "sycophancy, deception, power-seeking and the tendency to encourage delusional thinking." Anthropic says it's also made "considerable progress" in defending against prompt injection attacks, when bad actors use specially crafted language to trick a model into doing things it wasn't meant to do. "Claude Sonnet 4.5 resets our expectations -- it handles 30+ hours of autonomous coding, freeing our engineers to tackle months of complex architectural work in dramatically less time while maintaining coherence across massive codebases," Sean Ward, CEO of iGent AI, said in a press release. Claude Sonnet 4.5 comes as the AI race is heating up. While much attention has been given to OpenAI's ChatGPT and Google's Gemini, players like Anthropic, too, have been pushing AI technology forward.
Fans appreciate Claude for its coding ability and the chatbot's conversational nature. In GDPval, a benchmarking tool made by OpenAI, Claude Opus 4.1 was the most performant model, beating GPT-5. It could be why OpenAI was caught using Claude Code and subsequently had its access removed for violating Anthropic's terms of service. OpenAI responded by saying it's standard in the industry to evaluate competing models for accuracy and safety and that its API would still be made available to Anthropic. In August, the two companies announced the results of a joint exercise in which each company evaluated the other's models. (Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.) As Anthropic continues to excel in certain areas, it's raising billions in capital. Recently, Anthropic completed its series F fundraising round of $13 billion and is now valued at $183 billion. Anthropic also settled a $1.5 billion lawsuit with authors for illegally pirating their work earlier this month. OSWorld, a tool that tests how AI models perform in real-world computer tasks, benchmarked Sonnet 4.5 at 61.4%, whereas Sonnet 4 was 42.2% four months prior. The Claude for Chrome extension, which is currently available to those who signed up for the waitlist last month, takes advantage of Sonnet 4.5's agentic improvements.
[3]
Anthropic releases Claude Sonnet 4.5 in latest bid for AI agents and coding supremacy
Anthropic's latest AI model spent 30 hours running by itself to code a chat app akin to Slack or Teams. It spat out about 11,000 lines of code, according to Anthropic, and it only stopped running when it had completed the task. The model, Claude Sonnet 4.5, was announced today, and its ability to operate autonomously for 30 hours straight is a huge jump forward. Before, the company's Opus 4 model made headlines in May for its ability to operate for seven hours. It's all a significant step in Anthropic's battle to corner the market on both AI agents and AI coding. The company called Claude Sonnet 4.5 "the best model in the world for real-world agents, coding, and computer use" and said it "leads the market at using computers," referencing the Computer Use feature Anthropic debuted nearly a year ago. The new model is particularly adept in fields like cybersecurity, financial services, and research, according to Anthropic. One of its beta-testers, Canva, said the new model helped with "complex, long-context tasks -- from engineering in our codebase to in-product features and research." Anthropic, OpenAI, Google, and other companies have been continuously releasing incremental updates and features that allow their technology to act as an assistant both for consumers (researching topics, scheduling meet-ups, and looking up flights) and for enterprise and developer use (creating slide decks, helping with coding tasks, and analyzing spreadsheets). The battle for attention and reliance heats up nearly every month, if not every week. Days ago, OpenAI announced Pulse, its newest ChatGPT feature designed to be part of users' morning routines and research topics relevant to their days. Anthropic also said the new model would be paired with other updates to help developers code their own AI agents. "We're combining the launch of the model with access to virtual machines, memory, context management, and multi-agent support," the company wrote in a release. 
"This essentially packages the same building blocks that power Claude Code - enabling developers to build their own cutting-edge agents." Dianne Penn, a head of product management at Anthropic, told The Verge in an interview that the model's improvements in its computer use capabilities surprised even her. Claude Sonnet 4.5 is more than three times as skilled at navigating a browser and using a computer compared to Anthropic's tech from last October. Penn said the team had received feedback from early-access customers -- "the GitHubs and Cursors of the world" -- and spent the past month working intensively on the model. Scott White, product lead for Claude.ai, told The Verge that the new model operates at "chief-of-staff level" and can find availability between multiple peoples' calendars and schedule a meeting, look at a data dashboard and pull together insights, write status updates based on one-on-one meetings with his direct reports, and more. Neither White nor Penn had yet tried vibe-coding with the new model when The Verge spoke to them. But Penn said she uses Claude Sonnet 4.5 for hiring potential new team members at Anthropic. "It's been actually really helpful to have a continuous running prompt that I use of, 'Do a deep web search, come up with like these parameters for profiles to source for certain types of roles on my team,'" Penn said. "That's been really, really helpful. And I've seen the Sonnet 4.5 just do even better than in the past, on the quality and the depth of the searches and actually generating a spreadsheet with LinkedIn profiles so then I can email them."
[4]
Claude Sonnet 4.5 could be your next breakthrough coding tool - how to access it today
Anthropic's latest model, Claude Sonnet 4.5, is here. It scored very highly on coding benchmarks. Claude Code also got long-awaited upgrades. Anthropic's coding tools have become well-regarded amongst developers, with its Claude 4 Sonnet model, released in May, serving as a free and reliable coding assistant for many. Just months later, Anthropic has released its next-generation model, featuring upgrades to its performance across the board. Claude Sonnet 4.5 is now available, the company said Wednesday, claiming it is the "best coding model in the world" as well as the best model for building complex agents, utilizing computers, and gaining reasoning and mathematical capabilities. Anthropic also launched updates across its Claude Code offering, Claude for Chrome extension, and more. If you have been closely tracking Anthropic's releases, you may recall that in May, Claude Opus 4 and Sonnet 4 scored highest amongst frontier models on the industry-standard software engineering benchmark test (SWE-bench), which evaluates LLMs' abilities to solve real-world software engineering tasks sourced from GitHub. Claude Opus 4.1, released in August, surpassed it. Now, Claude Sonnet 4.5 has lapped that last model, outperforming it on the SWE-bench Verified evaluation, a human-filtered subset of the SWE-bench. Claude Sonnet 4.5 also outperformed leading models from competitors, including GPT-5 Codex, GPT-5, and Gemini 2.5 Pro. Anthropic said that on the SWE-bench Verified, Sonnet 4.5 held its focus for more than 30 hours on complex, multi-step tasks. This capability is specifically useful for agentic tasks, which oftentimes require solo work in the background for extended periods of time.
Other improvements include its performance on computers, as indicated by its score on the OSWorld benchmark, which tests the performance of AI models on real-world computer tasks: "Sonnet 4.5 now leads at 61.4%. Just four months ago, Sonnet 4 held the lead at 42.2%," Anthropic noted in the release. The Claude for Chrome extension, now rolled out to everyone who joined the waitlist last month, uses these capabilities. Anthropic also saw improvements across math and reasoning. Claude Sonnet 4.5 is also Anthropic's "most aligned" frontier model yet, according to the company. This means it's the model that adheres most closely to humans' instructions and intended use cases, and that has reduced instances of behaviors such as sycophancy and deception. The model is also better at resisting prompt injection attacks and has AI Safety Level 3 (ASL-3) protections on Anthropic's model framework. Claude Sonnet 4.5 is available everywhere, including in the Claude.ai chatbot. Of course, developers and professionals can access the new model in the API and Claude Code, and for the same price as Sonnet 4. Anthropic also upgraded its other coding offerings, starting with Claude Code, which now has checkpoints that allow users to save progress and revisit a previous state. It also has what Anthropic is calling a "refreshed" terminal interface and a native VS Code extension. Anthropic also launched the Claude Agent SDK, which is the same infrastructure that powers Claude Code, allowing developers to build their own agents with it.
The Claude API has introduced a new context editing feature and a memory tool that enable agents to work more efficiently and tackle more complex problems, according to the company. The company also upgraded Claude apps so they can execute code and create files in chat.
[5]
Anthropic Debuts Claude Sonnet 4.5 With More Coding, Less 'Deception'
Emily is an experienced reporter who covers cutting-edge tech, from AI and EVs to brain implants. She stays grounded by hiking and playing guitar. Anthropic released a new AI model, Claude Sonnet 4.5, with stronger coding abilities and new features to streamline the process. It's "the best coding model in the world," Anthropic says. The company also referred to its last model as "the world's best" for coding. But, unsurprisingly, it says the newest one is a cut above. A major new feature is "checkpoints," which allows programmers to save their progress and revert to a previous version. Other additions include a new terminal interface, context adjusting capabilities, and file creation (spreadsheets, slides, and documents) without leaving the chat window, which is now available with all paid plans. Developers can create their own AI agents using the Claude Agent SDK. "The infrastructure that powers our frontier products -- and allows them to reach their full potential -- is now yours to build with," says Anthropic. In the company's internal evaluations, Sonnet 4.5 earned a score of 77.2% on the "agentic coding," or self-directed coding, test. That's compared to a slightly lower 74.5% for its predecessor Opus 4.1 and Codex, the programming tool within OpenAI's GPT-5. But remember, these are internal benchmarks. It's always worth trying multiple tools to find the one that's best for you. Happily, Anthropic did not increase the price of Sonnet 4.5 through the developer API, which remains the same as Sonnet 4, at $3/$15 per million tokens. Beyond coding, Anthropic is positioning Claude as the go-to chatbot for all workplace tasks. That differentiates it from ChatGPT, which is used for non-work-related conversations over 70% of the time, according to an OpenAI study released this month. Anthropic lists financial services, cybersecurity, and law as other fields its chatbot excels in.
Claude Sonnet 4.5 "creates presentations, spreadsheets, and PDFs you'd actually be proud to share with your boss or clients and sharpens your thinking on complex problems," Anthropic says. The AI can even use your computer for you, performing simple tasks like navigating websites and filling out spreadsheets. It works in Google Chrome with an extension, which is now available for those with a Max plan ($100 to $200 per month) to sign up for through a waitlist. However, these capabilities are still nascent and flawed, just like agentic coding. Anthropic says it bolstered Claude's defenses against prompt injection attacks when in computer use mode, a type of cyberattack that represents "one of the most serious risks." Claude, like all chatbots, can also be kind of a jerk. It's prone to "sycophancy, deception, power-seeking, and the tendency to encourage delusional thinking," in certain scenarios. It's known to do things like "praise obviously-terrible business ideas" and confirm to users that they are indeed the Matrix, according to the system card for Sonnet 4.5. Anthropic says Sonnet 4.5 is its least likely model to engage in these behaviors, and that it expects it to be "much more direct and much less likely to mislead users than any recent popular large language model (LLM)." When OpenAI and Anthropic evaluated each other's models over the summer, OpenAI reported Claude was less likely to engage in sycophantic and harmful behaviors than ChatGPT. Over the past year, Claude has emerged as a favorite LLM among individuals and businesses (not book authors). Apple and Meta reportedly use Claude internally, TechCrunch reports. You may start to see Anthropic advertisements on streaming platforms like Netflix and Hulu, and at live sporting events as well, since the company launched its first major advertising campaign this fall, AdWeek reports.
[6]
Anthropic Says New Model Can Code On Its Own for 30 Hours Straight
Anthropic is releasing a new artificial intelligence model that is designed to code longer and more effectively than prior versions, its latest attempt to stay ahead of rivals like OpenAI in offering tools for software developers. The new model, called Claude Sonnet 4.5, is better at following instructions and can code on its own for up to 30 hours straight, the company said on Monday. By comparison, a previous model called Claude Opus 4 is said to be able to field coding tasks for up to seven hours by itself. The updated version of Sonnet is also intended to excel at using a person's computer to take actions for them, improving on a feature Anthropic introduced a year ago. Anthropic has been an early leader in building so-called AI agents that field complex tasks on a user's behalf, particularly for streamlining the process of writing and debugging code. The company, now valued at $183 billion, reached $5 billion in run-rate revenue in August, fueled in part by traction for its coding software. But other companies, including OpenAI and Alphabet Inc.'s Google, are also vying to win over programmers with similar capabilities. Anthropic's latest release comes a week before OpenAI is set to hold its annual developer event. Jared Kaplan, Anthropic's co-founder and chief science officer, said Sonnet 4.5 is "stronger in almost every way" than its most recent high-end Opus model. Anthropic is also working to build a better version of the Opus model, which he expects will likely come out later this year. "We get benefits from having usage at both model sizes."
[7]
Anthropic launches Claude 4.5, touts better abilities, targets business customers
Sept 29 (Reuters) - Anthropic unveiled the Claude 4.5 AI model on Monday, saying the newest version can code for longer uninterrupted stretches and handle finance and scientific tasks better, as the startup pushes deeper into enterprise AI. The Alphabet (GOOGL.O) and Amazon.com-backed (AMZN.O) AI startup is racing rivals to build models that can reliably operate software and complete multi-step work, key for AI agents, which can perform tasks on behalf of humans. The Sonnet 4.5 model created a web app from scratch in internal tests, and one customer had the AI chatbot code autonomously for 30 hours, up from a seven-hour run achieved by Anthropic's earlier Claude Opus 4 for a different client, Chief Product Officer Mike Krieger said. Anthropic is targeting power users and business customers rather than chasing a viral consumer moment, he said. Claude 4.5 is stronger at finance and scientific reasoning and better at using computers, scoring about 60% on a benchmark that tests operating-system dexterity versus roughly 40% for prior models, the company said. "It's a lot more visceral when you just see the model using a computer the way a person does if you're not a coder," said Chief Science Officer Jared Kaplan. Separately on Monday, Microsoft said it would add new Microsoft 365 Copilot features powered by Anthropic models, including "Agent Mode" in Excel and Word and an "Office Agent" in Copilot chat, with PowerPoint to follow. Microsoft last week said it would bring Anthropic's models to Microsoft 365 Copilot to diversify beyond longtime partner OpenAI. Anthropic, founded by former OpenAI executives, has positioned Claude for workplace use with guardrails it says reduce risky outputs. The company has been marketing Claude's coding and data-analysis skills to regulated industries and teams that want models to work across multiple software tools.
Krieger said the company's focus is on sustained, reliable performance over long tasks rather than short demos. Reporting by Jeffrey Dastin, Deepa Seetharaman in San Francisco and Akash Sriram in Bengaluru; Editing by Anil D'Silva
[8]
Claude Sonnet 4.5 is Anthropic's safest AI model yet
In May, Anthropic announced two new AI systems, Opus 4 and Sonnet 4. Now, less than six months later, the company is introducing Sonnet 4.5, and calling it the best coding model in the world to date. Anthropic's basis for that claim is a selection of benchmarks where the new AI outperforms not only its predecessor but also the more expensive Opus 4.1 and competing systems, including Google's Gemini 2.5 Pro and GPT-5 from OpenAI. For instance, in OSWorld, a suite that tests AI models on real-world computer tasks, Sonnet 4.5 set a record score of 61.4 percent, putting it 17 percentage points above Opus 4.1. At the same time, the new model is capable of autonomously working on multi-step projects for more than 30 hours, a significant improvement from the seven or so hours Opus 4 could maintain at launch. That's an important milestone for the type of agentic systems Anthropic wants to build. Perhaps more importantly, the company claims Sonnet 4.5 is its safest AI system to date, with the model having undergone "extensive" safety training. That training translates to a chatbot Anthropic says is "substantially" less prone to "sycophancy, deception, power-seeking and the tendency to encourage delusional thinking" -- all potential model traits that have landed OpenAI in hot water in recent months. At the same time, Anthropic has strengthened Sonnet 4.5's protections against prompt injection attacks. Due to the sophistication of the new model, Anthropic is releasing Sonnet 4.5 under its AI Safety Level 3 framework, meaning it comes with filters designed to prevent potentially dangerous outputs related to prompts around chemical, biological and nuclear weapons. With today's announcement, Anthropic is also rolling out quality of life improvements across the Claude product stack. To start, Claude Code, the company's popular coding agent, has a refreshed terminal interface, with a new feature called checkpoints included. 
As you can probably guess from the name, they allow you to save your progress and roll back to a previous state if Claude writes some funky code that isn't quite working like you imagined it would. File creation, which Anthropic began rolling out at the start of the month, is now available directly in conversations with the chatbot, and if you joined the waitlist for Claude for Chrome, you can start using the extension today.
[9]
Anthropic launches Claude Sonnet 4.5, its latest AI model that's 'more of a colleague'
Anthropic on Monday announced its latest artificial intelligence model: Claude Sonnet 4.5. The model is better at coding, using computers and meeting practical business needs, and it excels in specialized fields like cybersecurity, finance and research, Anthropic said. The Amazon-backed startup, which is valued at $183 billion, is making Claude Sonnet 4.5 available to all users. Anthropic said Claude Sonnet 4.5 is the "best coding model in the world" according to industry benchmarks like SWE-bench Verified, a test set that measures an AI system's software coding abilities. "People are just noticing with this model, because it's just smarter and more of a colleague, that it's kind of fun to work with it when encountering problems and fixing them," Jared Kaplan, Anthropic's co-founder and chief science officer, told CNBC in an interview. The model generates higher-quality code, is better at identifying code improvements and can follow instructions more reliably, the company said.
[10]
Claude Sonnet 4.5 can code for 30 hours straight -- and it could change the future of work forever
Anthropic has just announced Claude Sonnet 4.5 and is calling it the "best in the world" for coding, real-world agents and complex computer use. In internal testing, Sonnet 4.5 ran autonomously for more than 30 hours straight while maintaining performance and focus. This is a giant leap from the seven hours possible with Claude Opus 4 just months ago. With nearly a full work week of nonstop AI effort, this new model underscores the possibilities of where the future of work and personal productivity might be headed. With vibe coding so easy that anyone can do it, Claude Sonnet 4.5 hints at how AI could soon handle everyday tasks more reliably. Instead of just spitting out snippets of code or short answers, this model can stay focused for hours, which means it's finally practical for real projects or just about anything else it is prompted to do. From apps and websites to so much more, here's what is now possible: Claude Sonnet 4.5 is here just four months after Sonnet 4, highlighting just how quickly AI is evolving. Instead of being limited to short bursts, it now sustains output across multi-day projects. Anthropic positions it as both more powerful and safer, with a balance of speed and cost that could attract not just businesses, but everyday users who want a reliable assistant for demanding projects. Anthropic could potentially win over Gemini and ChatGPT users, especially those who want AI to tackle their biggest and most complex workloads with speed and efficiency. Because Claude Sonnet 4.5 can sustain effort, remember context, and interact with the tools you already use - all with human-like conversations - the dream of a true digital sidekick is quickly becoming a reality.
[11]
Anthropic's Claude Sonnet 4.5 can work autonomously for 30 hours
Why it matters: To act as an agent, AI models must sustain work on a single task for hours -- something many earlier models couldn't do.
Driving the news: The new version of Claude can work for 30 hours or more on its own, a big step up from the seven hours of autonomous work with Claude Opus 4.
* Beyond math and coding, where Claude has previously excelled, Sonnet 4.5 is strong on tasks requiring research and diligence, Scott White, a product lead at Anthropic, told Axios.
* The company offered a variety of benchmarks and customer comments touting the power and performance of the new model, which is priced the same as its predecessor, Claude Sonnet 4.0.
* Anthropic will give developers access to Claude Code's building blocks -- virtual machines, memory and context management -- to make it easier to create Claude-powered agents.
The big picture: Anthropic said the rapid progress, marked by major Sonnet updates in February and May, shows a pattern where every six months its new model can handle tasks that are twice as complex.
* "This is a continued evolution on Claude, going from an assistant to more of a collaborator to a full, autonomous agent that's capable of working for extended time horizons," White said.
Between the lines: Anthropic also claims that the new Sonnet is the company's "most-aligned" model yet.
[12]
Anthropic's new Claude can code for 30 hours. Think of it as your AI coworker
Anthropic launched Claude Sonnet 4.5 on Monday, positioning the artificial intelligence model as "the best coding model in the world" in a direct challenge to OpenAI's recently released GPT-5, as the two AI giants battle for dominance in the lucrative enterprise software development market. The San Francisco-based startup claims its newest model achieves state-of-the-art performance on critical coding benchmarks, scoring 77.2% on SWE-bench Verified -- a rigorous software engineering evaluation -- compared to GPT-5's performance. More remarkably, Anthropic says Claude Sonnet 4.5 can maintain focus on complex, multi-step tasks for more than 30 hours, a dramatic leap in AI's ability to handle sustained work. "Sonnet 4.5 achieves 77.2% on SWE-bench Verified (82% with parallel test-time compute). It is SOTA," an Anthropic spokesperson told this reporter, using industry shorthand for "state of the art." The company also highlighted the model's 50% score on Terminal-bench, another coding benchmark where it claims leadership. The announcement follows mounting pressure from OpenAI's recent advances and pointed criticism from high-profile figures like Elon Musk, who recently posted on X.com that "winning was never in the set of possible outcomes for Anthropic." When asked about Musk's statement, Anthropic declined to comment. The release arrives just seven weeks after OpenAI's GPT-5 launch in August, underscoring the breakneck pace of competition in artificial intelligence as companies race to capture enterprise customers increasingly relying on AI for software development. The timing is particularly noteworthy as Anthropic grapples with questions about its heavy dependence on just two major customers.
Anthropic dominates coding market despite customer concentration risks
The competition centers on a market that has emerged as AI's first major profitable use case beyond chatbots.
Anthropic commands 42% of the code generation market -- more than double OpenAI's 21% share -- according to a Menlo Ventures survey of 150 enterprise technical leaders. That dominance has translated into remarkable financial performance, with the company reaching a $5 billion revenue run rate earlier this year. However, industry analysis reveals that coding applications Cursor and GitHub Copilot drive approximately $1.4 billion of Anthropic's revenue, creating a potentially dangerous customer concentration that could leave the company vulnerable if either relationship falters. "Our run-rate revenue has grown significantly, even when you exclude these two customers," the Anthropic spokesperson said, pushing back on concerns about customer concentration. The company provided supportive quotes from both Cursor CEO Michael Truell and GitHub Chief Product Officer Mario Rodriguez praising Claude Sonnet 4.5's performance. The new model achieves significant advances in computer use capabilities, scoring 61.4% on OSWorld, a benchmark that tests AI models on real-world computer tasks. Just four months ago, Claude Sonnet 4 held the lead at 42.2%, demonstrating rapid improvement in AI's ability to interact with software interfaces.
OpenAI's aggressive pricing strategy threatens Anthropic's premium positioning
Anthropic's announcement comes as the company grapples with competitive pressure from GPT-5's aggressive pricing strategy. Early analysis shows Claude Opus 4 costing roughly seven times more per million tokens than GPT-5 for certain tasks, creating immediate pressure on Anthropic's premium positioning. The pricing disparity signals a fundamental shift in competitive dynamics that could force enterprise procurement teams to reconsider vendor relationships previously built on performance rather than price. Companies managing exponentially growing AI budgets now face comparable capability at a fraction of the cost.
Yet Anthropic is maintaining its pricing strategy with Claude Sonnet 4.5. "Sonnet 4.5's cost remains the same as Sonnet 4," the spokesperson confirmed, keeping prices at $3 per million input tokens and $15 per million output tokens.
Claude Sonnet 4.5 delivers 30-hour autonomous work sessions and enhanced security
Beyond performance improvements, Anthropic positions Claude Sonnet 4.5 as its "most aligned frontier model yet," showing significant reductions in concerning behaviors like sycophancy, deception, and power-seeking tendencies. The company has made "considerable progress on defending against prompt injection attacks," a critical security concern for enterprise deployments. The model is being released under Anthropic's AI Safety Level 3 (ASL-3) protections, which include classifiers designed to detect potentially dangerous inputs and outputs related to chemical, biological, radiological, and nuclear weapons. While these safeguards sometimes flag normal content, Anthropic says it has reduced false positives by a factor of ten since initially describing them. Perhaps most significantly for developers, Anthropic is releasing the Claude Agent SDK -- the same infrastructure that powers its Claude Code product. "We built Claude Code because the tool we needed didn't exist yet," the company said in its announcement. "The Agent SDK gives you the same foundation to build something just as capable for whatever problem you're solving."
International expansion accelerates as $1.5 billion copyright settlement finalizes
The model launch coincides with Anthropic's aggressive international expansion, as the company seeks to diversify beyond its U.S.-concentrated customer base. The startup recently announced plans to triple its international workforce and expand its applied AI team fivefold in 2025, driven by data showing that nearly 80% of Claude usage now comes from outside the United States. However, the expansion comes amid significant legal costs.
Anthropic recently agreed to pay $1.5 billion in a copyright settlement with authors and publishers over allegations the company illegally used their books to train AI models without permission. The settlement, approved by a federal judge last week, requires payments of $3,000 for each publication listed in the case.
Enterprise AI spending doubles as companies prioritize performance over cost
The rapid-fire model releases from both companies reflect the high stakes in enterprise AI adoption. Model API spending has more than doubled to $8.4 billion in just six months, according to Menlo Ventures, as enterprises shift from experimental projects to production deployments. Customer behavior patterns suggest enterprises consistently prioritize performance over price, upgrading to the newest models within weeks of release regardless of cost. This behavior could work in Anthropic's favor if Claude Sonnet 4.5's performance advantages prove compelling enough to overcome GPT-5's pricing advantage. However, the dramatic price differential introduced by GPT-5 could overcome typical switching inertia, especially for cost-conscious enterprises facing budget pressures. Industry observers note that model switching costs remain relatively low, with 66% of enterprises upgrading within existing providers rather than switching vendors. For enterprises, the intensifying competition delivers better performance and lower costs through continuously improving capabilities. The rapid pace of model improvements -- with new versions launching monthly rather than annually -- provides organizations with expanding AI capabilities while vendors compete aggressively for their business. While the corporate rivalry between Anthropic and OpenAI dominates industry headlines, the real economic impact extends far beyond Silicon Valley boardrooms.
The development of AI systems capable of sustained coding work for 30 hours represents a fundamental shift in how software gets built, with implications that extend across every industry relying on technology infrastructure. These advancing capabilities signal broader workplace transformation ahead. As AI systems demonstrate increasing proficiency at complex, sustained intellectual work, the technology industry's competition for coding supremacy foreshadows similar disruptions across fields requiring analytical thinking, problem-solving, and technical expertise.
[13]
Anthropic launches Claude Sonnet 4.5 -- 'best coding model in the world'
Anthropic has formally announced Claude Sonnet 4.5, a new AI model specifically made for coding. Anthropic didn't mince any words during its announcement, calling Claude Sonnet 4.5 the "best coding model in the world." Starting today, it'll be powering Claude Code, a popular choice for vibe coders and professionals alike. The new model is a step up from the old models and seems to be able to do quite a lot. Per Cognition co-founder and CEO Scott Wu, the new model features "the biggest jump we've seen since the release of Claude Sonnet 3.6" and can "run longer, handle harder tasks, and deliver production-ready code." Anthropic shows this through a variety of charts demonstrating how effective the model can be. For example, Claude Sonnet 4.5 has a lower instance of misaligned behaviors than its direct competitors, including older models from Anthropic. Anthropic AI researcher David Hershey told TechCrunch that he's seen the model code for 30 hours without interruption in early trials, so overall performance may be difficult to show in benchmarks. In addition to better smarts, Anthropic also announced several new features for Claude Code to coincide with the release. That includes checkpoints, a feature that Anthropic says has been requested quite a lot. Checkpoints will save snapshots of the code the user is working on, and then grant the ability to roll back to a prior checkpoint if things go off the rails. There is also a new context editing feature and memory tool that allows AI agents to run longer and handle more complex instructions. Anthropic has been on a roll in 2025, as have most AI companies. New models seem to drop every couple of months like clockwork these days. Anthropic's prior big model release, Claude Opus 4, which was also designed for advanced coding, launched in May 2025. OpenAI launched its latest GPT-5 in early August, and Google joined the fray with Gemini 2.5 over the summer.
Thus, AI fans have a lot of new stuff to check out if they haven't done so in a while.
[14]
Anthropic's Claude Sonnet 4.5 is available now - 'the best AI model in the world for real-world agents, coding, and computer use'
Claude Code now has checkpoints and you can create files directly from within Claude's chatbot
Anthropic has released Claude Sonnet 4.5, the next generation of its incredibly popular AI model. The company is calling Sonnet 4.5 "the best coding model in the world" and claims it's the "strongest model for building complex agents" and for "using computers." The new upgrade to Claude launches alongside other upgrades to Anthropic's most popular products, including checkpoints for Claude Code so you can save your progress and roll back to previous states, as well as code execution and file creation for spreadsheets, slides, and documents from within your conversation. These capabilities will be available on all paid plans. Alongside these big announcements, Anthropic also confirmed that Claude for Google Chrome will become available to everyone who had previously joined the waitlist. Anthropic says Sonnet 4.5 is the company's "most aligned model yet" and claims the new upgrade will "substantially improve the model's behavior, reducing concerning behaviors like sycophancy, deception, power-seeking, and the tendency to encourage delusional thinking." Claude Sonnet 4.5 is available today, and prices remain the same per million tokens as the previous model Sonnet 4. OpenAI's latest study shows Claude Opus 4.1 beats GPT-5, Gemini, and Grok in real-world job tasks. Now Anthropic says Sonnet 4.5 shows "clear progress" over its predecessor, and not only does the new model achieve higher performance in benchmarks, but it is also "better at meeting practical business needs than its predecessor, allowing our customers to do more, solve harder problems, and be more creative." Sonnet 4.5 can now help users build custom agents using natural language, and Anthropic even claims that the model can work autonomously for 30 hours, a massive improvement over Claude Opus 4's 7-hour capability.
Anthropic shared some quotes from its clients during the reveal of Sonnet 4.5, and early adopters have met the model with glowing praise. Danny Wu, Head of AI Products at Canva, said, "Claude Sonnet 4.5 delivers impressive gains on our most complex, long-context tasks -- from engineers in our codebase to in-product features and research. It's noticeably more intelligent, helping us push what 240M+ users can design with Canva." Sean Ward, CEO of iGent AI, said the new model "resets our expectations -- it handles 30+ hours of autonomous coding, freeing our engineers to tackle months of complex architectural work in dramatically less time while maintaining coherence across massive codebases." Claude Sonnet 4.5 looks set to be one of the biggest AI releases of the year, and while we're yet to try it ourselves, it sure does sound incredibly promising.
[15]
Anthropic releases Claude 4.5, a model it says can build software and accomplish business tasks autonomously | Fortune
Anthropic has launched Claude Sonnet 4.5, its newest AI model, claiming significant advancements in autonomous work and coding. The company said that the model was able to run autonomously for 30 hours, maintaining sustained focus with minimal oversight while building an entire software application. It's a significant improvement over the company's previous Opus 4 model, released four months ago, which could operate autonomously for only seven hours. Anthropic said Claude Sonnet 4.5 also outperformed Opus on key benchmarks and was more effective in meeting customers' practical business needs. The company said the model was even better at coding than previous frontier models, and state-of-the-art on SWE-Bench Verified, a key benchmark that tests how models perform at software development tasks. Anthropic said that Claude Sonnet 4.5 was better than its predecessors at following instructions, identifying code improvements, and generating more production-ready code. When tested on tasks from the financial services industry, the company said the new model outperformed earlier Claude models in tasks such as researching, building financial models, and forecasting. Anthropic appears to be pushing further ahead of its competitors in coding assistance and autonomous task completion, positioning its models toward corporate and workplace use. The company's previous Claude 4.1 Opus model already bested competitors on OpenAI's new benchmark of professional task completion, GDPval, which tested how models performed compared to human professionals across a range of industries and jobs. Last week, OpenAI said its GPT-5 model and Anthropic's Claude Opus 4.1 were "already approaching the quality of work produced by industry experts." Dueling usage studies released earlier this month also suggested that Anthropic's Claude models were emerging as more professionally-oriented AI models, especially in comparison to OpenAI's ChatGPT, which is increasingly being used as a consumer product. 
According to the study, most Claude users were turning to the models for workplace or productivity tasks, with mathematical tasks and coding cited as the dominant activities globally for Claude.ai, and making up 36% of all use cases. Business use of Claude leaned heavily toward task automation. According to the study, approximately 77% of prompts that the model receives through its API -- the application programming interface that is primarily used by enterprise customers -- involve users requesting the system to perform tasks on their behalf, rather than just providing advice or suggestions. These business-focused interactions are also concentrated in coding, which accounts for 44% of API use. A further 5% of API usage was dedicated to developing or evaluating AI systems. The tasks that business users automate also tend to be the most expensive ones to run. The findings indicate a shift in how businesses approach these tools. Rather than using them mainly for decision support or research, many teams are relying on them to take work off their plates entirely. If models like Claude are able to become more capable of autonomous work, especially in complex, time-intensive domains like software engineering, the implications for businesses and employees could be significant. Autonomous agents can reduce the need for constant human oversight and lower costs on repetitive workflows, speeding up a company's operations and potentially reducing the need for headcount.
[16]
Anthropic Launches Claude Sonnet 4.5, Touts It as 'Best Coding Model in the World' | AIM
Claude Sonnet 4.5 achieved top scores on the SWE-bench Verified evaluation, which tests real-world software coding skills. Anthropic on Monday announced the release of Claude Sonnet 4.5, its latest AI model for coding and agent-based tasks. The company said the model demonstrates improvements in reasoning, math, and long-duration task management. "Claude Sonnet 4.5 is the best coding model in the world. It's the strongest model for building complex agents," the company said in its blog post. "It's also the best model at using computers and shows substantial gains in reasoning and math." The model is available via the Claude API at the same pricing as Sonnet 4: $3 per million input tokens and $15 per million output tokens. Anthropic said that the Claude API has added context editing and memory tools to support longer tasks, and the Claude apps now allow code execution and file creation directly within conversations. Anthropic also released the Claude for Chrome extension for Max users on the waitlist. Claude Sonnet 4.5 is also integrated into Claude Code, which now includes checkpoints to save progress and roll back to previous states, a refreshed terminal interface, and a native VS Code extension. Developers can access the Claude Agent SDK, which provides the infrastructure used internally to build Claude Code. "The Agent SDK gives you the same foundation to build something just as capable for whatever problem you're solving," the spokesperson said. Claude Sonnet 4.5 achieved top scores on the SWE-bench Verified evaluation, which tests real-world software coding skills. On OSWorld, a benchmark for real-world computer tasks, the model scored 61.4%, up from 42.2% for Claude Sonnet 4. Early users reported improved performance across finance, law, medicine, and STEM domains. The company emphasised safety and alignment improvements, noting reductions in misaligned behaviour such as sycophancy, deception, and power-seeking.
The model is released under Anthropic's AI Safety Level 3 framework, which includes classifiers to flag potentially dangerous content. Anthropic also introduced a temporary research preview, "Imagine with Claude," which allows users to see the model generate software in real time. It is available to Max subscribers for five days at claude.ai/imagine.
[17]
Anthropic Says Its Latest Claude AI Is 'the Best Coding Model in the World'
Anthropic has announced Claude Sonnet 4.5, the latest version of its default model. The company says the model isn't just "the best coding model in the world," it's also "the strongest model for building complex agents." In the context of AI, an agent is an AI model that uses tools that allow it to take actions, like running code and taking over an internet browser. Anthropic said that when it comes to coding, Sonnet 4.5 is better at both identifying small improvements and considering larger changes to code, and follows instructions more directly when coding on users' behalf. In data shared with Inc., Anthropic claimed that the new model exhibited state-of-the-art performance across a wide variety of benchmarks. For example, on SWE-Bench Verified, a widely-used benchmark that measures an AI model's ability to solve real-world software engineering tasks, Sonnet 4.5 was able to successfully solve 77.2 percent of tasks, up from the 74.5 percent solved by Claude Opus 4.1, a larger and much more expensive model released in August. AI agents built using Sonnet 4.5 will also be a step up thanks to a new software development kit (SDK) called Claude Agent SDK. The SDK gives developers access to the same agentic tools used by the company's popular coding agent, Claude Code. These tools enable developers to easily build Sonnet 4.5-based agents that can read and write files, manage context while working on long-running tasks, run code, search the web, pass on context from one agent to another, and coordinate multiple sub-agents to work on tasks simultaneously.
[18]
Anthropic launches Claude 4.5, touts better abilities, targets business customers - The Economic Times
Anthropic unveiled the Claude 4.5 AI model on Monday, saying the newest version can code for longer uninterrupted stretches and handle finance and scientific tasks better, as the startup pushes deeper into enterprise AI. The Alphabet and Amazon.com-backed AI startup is racing rivals to build models that can reliably operate software and complete multi-step work, key for AI agents, which can perform tasks on behalf of humans. The Sonnet 4.5 model created a web app from scratch in internal tests, and one customer had the AI chatbot code autonomously for 30 hours, up from a seven-hour run achieved by Anthropic's earlier Claude Opus 4 for a different client, Chief Product Officer Mike Krieger said. Anthropic is targeting power users and business customers rather than chasing a viral consumer moment, he said. Claude 4.5 is stronger at finance and scientific reasoning and better at using computers, scoring about 60% on a benchmark that tests operating-system dexterity versus roughly 40% for prior models, the company said. "It's a lot more visceral when you just see the model using a computer the way a person does if you're not a coder," said Chief Science Officer Jared Kaplan. Separately on Monday, Microsoft said it would add new Microsoft 365 Copilot features powered by Anthropic models, including "Agent Mode" in Excel and Word and an "Office Agent" in Copilot chat, with PowerPoint to follow. Microsoft last week said it would bring Anthropic's models to Microsoft 365 Copilot to diversify beyond longtime partner OpenAI. Anthropic, founded by former OpenAI executives, has positioned Claude for workplace use with guardrails it says reduce risky outputs.
The company has been marketing Claude's coding and data-analysis skills to regulated industries and teams that want models to work across multiple software tools. Krieger said the company's focus is on sustained, reliable performance over long tasks rather than short demos.
[19]
Anthropic Launches Claude Sonnet 4.5 and Introduces Claude Agent SDK | PYMNTS.com
"Claude Sonnet 4.5 is state-of-the-art on the SWE-bench Verified evaluation, which measures complex real-world software coding abilities," the company said Monday (Sept. 29) in an announcement. "Practically speaking, we've observed it maintaining focus for more than 30 hours on complex, multi-step tasks." Anthropic added in the post that Sonnet 4.5 leads a benchmark that tests AI models on real-world computer tasks, OSWorld, at 61.4%. Together with the release of Sonnet 4.5, Anthropic has released upgrades to its products, according to the post. These include the addition of checkpoints to Claude Code, enabling users to save their progress and roll back to a previous state; the addition of a new context editing feature and memory tool to the Claude API, letting agents run longer and handle greater complexity; and the addition of code execution and file creation directly into the conversation in Claude apps, per the post. Anthropic also introduced Claude Agent SDK, which gives developers the ability to build AI agents with the same infrastructure that powers its frontier products, the post said. In addition, the Claude for Chrome extension is now available to Max users who joined the waitlist last month, according to the post. "We recommend upgrading to Claude Sonnet 4.5 for all uses," Anthropic said in the post. "Whether you're using Claude through our apps, our API, or Claude Code, Sonnet 4.5 is a drop-in replacement that provides much improved performance for the same price." When Anthropic launched a feature preview on Sept. 9 that allows users of Claude to create and edit files directly within Claude.ai and the desktop app, PYMNTS reported that the move positions the company to more directly compete against OpenAI's ChatGPT Enterprise, Microsoft Copilot and Google Gemini, which are all pitching AI-driven productivity tools to corporate users.
[20]
Anthropic launches Claude Sonnet 4.5, claims world's best coding model By Investing.com
Investing.com -- Anthropic has released Claude Sonnet 4.5, which the company describes as "the best coding model in the world" with enhanced capabilities for building complex agents and using computers. The new model shows substantial improvements in reasoning and math compared to previous versions, according to Anthropic. The company reports that Claude Sonnet 4.5 leads on the SWE-bench Verified evaluation, which measures real-world software coding abilities, and has achieved a 61.4% score on OSWorld, a benchmark for AI models performing real-world computer tasks. Alongside the model release, Anthropic has introduced several product upgrades. Claude Code now features checkpoints that allow users to save progress and roll back to previous states. The terminal interface has been refreshed, and a new VS Code extension brings Claude directly to the integrated development environment. For developers using the Claude API, Anthropic has added context editing to automatically clear stale context and a memory tool that stores information outside the context window, helping to manage long-running tasks without hitting context limits. The company has also made the Claude for Chrome extension available to users who joined the waitlist last month, and introduced the Claude Agent SDK, which provides developers with the infrastructure used to build Claude Code. Claude Sonnet 4.5 is available on the Claude Developer Platform, Amazon Bedrock, and Google Cloud's Vertex AI, with pricing remaining the same as Sonnet 4 at $3/$15 per million tokens. Anthropic claims this is their "most aligned frontier model yet," with improvements in reducing behaviors like sycophancy, deception, and power-seeking. The model is being released under Anthropic's AI Safety Level 3 protections, which include filters to detect potentially dangerous inputs and outputs. 
As a temporary feature, Anthropic is offering "Imagine with Claude," a five-day research preview available to Max subscribers that demonstrates the model generating software in real time.
Anthropic launches Claude Sonnet 4.5, claiming it to be the world's best model for coding and building complex AI agents. The new model boasts significant improvements in performance, alignment, and real-world applications.
Anthropic has unveiled its latest AI model, Claude Sonnet 4.5, positioning it as a game-changer in the realm of artificial intelligence, particularly for coding and autonomous agents. The company boldly claims that Sonnet 4.5 is "the best coding model in the world" and leads the market in computer use capabilities [1][2].
Claude Sonnet 4.5 demonstrates remarkable improvements in coding performance. During early trials with enterprise customers, the model showcased its ability to code autonomously for up to 30 hours, not only building applications but also setting up database services, purchasing domain names, and even performing security audits [1]. This level of sustained, complex task execution represents a significant leap forward in AI capabilities.
The model's prowess is further evidenced by its performance on industry-standard benchmarks. Sonnet 4.5 outperformed its predecessors and competing models on the SWE-bench Verified evaluation, a human-filtered subset of the software engineering benchmark [4]. Its score of 61.4% on the OSWorld benchmark, which tests AI models on real-world computer tasks, marks a substantial improvement from the 42.2% achieved by Sonnet 4 just four months prior [2].
Anthropic has introduced several new features to streamline the coding process [1][4][5]. These enhancements are designed to make Claude Sonnet 4.5 a more versatile and powerful tool for developers across various industries.
Anthropic emphasizes that Claude Sonnet 4.5 is their "most aligned frontier model" to date [1][2]. This alignment focuses on reducing undesirable behaviors such as sycophancy, deception, and power-seeking tendencies. The company claims to have made significant progress in defending against prompt injection attacks, enhancing the model's safety and reliability [2][5].
Claude Sonnet 4.5 is now available via the Claude API and the Claude.ai chatbot. Pricing for developers remains unchanged from the previous version: $3 per million input tokens and $15 per million output tokens [1][5]. This pricing strategy aims to maintain accessibility for developers while offering enhanced capabilities.
The release of Claude Sonnet 4.5 intensifies the competition in the AI industry, particularly in the coding and enterprise application space. With companies like Apple and Meta reportedly using Claude AI models internally, Anthropic is positioning itself as a strong contender against rivals such as OpenAI and Google [1][5].
As AI models continue to evolve rapidly, Claude Sonnet 4.5 represents a significant step forward in autonomous coding and agent capabilities. Its ability to handle complex, long-duration tasks could potentially revolutionize software development processes and accelerate AI adoption across various sectors.
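For readers who want to translate the developer pricing quoted above ($3 per million input tokens, $15 per million output tokens) into a budget figure, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the helper name `estimate_cost_usd` and the example token counts are assumptions made here, not part of any Anthropic tooling, and actual bills may differ from a simple per-token estimate.

```python
from decimal import Decimal

# Prices as quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MTOK = Decimal("3")
OUTPUT_PRICE_PER_MTOK = Decimal("15")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload (hypothetical helper, not an official API)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_PRICE_PER_MTOK * input_tokens / TOKENS_PER_MTOK
    output_cost = OUTPUT_PRICE_PER_MTOK * output_tokens / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative workload only: 2M input tokens and 500k output tokens.
    print(estimate_cost_usd(2_000_000, 500_000))
```

With the illustrative numbers above, the estimate comes to 2 x $3 + 0.5 x $15 = $13.50. Output tokens cost five times as much as input tokens at these rates, so workloads that generate long responses will skew toward the higher price.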
Summarized by Navi