Sources
[1]
Anthropic says its new Opus 4.6 model found over 500 previously unknown high-severity security flaws in open-source libraries during testing
Focused on securing open-source software, with patches already landing; company urges rapid action while AI can still deliver at scale

Anthropic has released Claude Opus 4.6, the latest version of its most powerful large language model (LLM), and claims it is "notably better" at finding high-severity vulnerabilities than previous models. So far, Opus 4.6 has found more than 500 such flaws.

Anthropic said Opus 4.6 stood out by finding vulnerabilities "out of the box without task-specific tooling, custom scaffolding, or specialized prompting." It added that, unlike fuzzing, a standard vulnerability-hunting technique, Opus works by reasoning about the code "the way a human researcher would": looking at past fixes to find similar bugs that weren't addressed, spotting patterns that tend to cause problems, and understanding logic "well enough to know exactly what input would break it."

Even on some of the most well-tested codebases, projects that have had fuzzers running against them for years, Opus still found high-severity flaws that had flown under the radar for "decades".

Anthropic said it started using Claude to help fix flaws in open-source software because it runs "everywhere, from enterprise systems to critical infrastructure", and because vulnerabilities in this software are felt across the internet. Furthermore, much of the popular open-source software in use today is maintained by small teams of volunteers, meaning resources are severely limited.

The results have been notable, the company says: "So far, we've found and validated more than 500 high-severity vulnerabilities. We've begun reporting them and are seeing our initial patches land, and we're continuing to work with maintainers to patch the others."

It concludes that AI models can now find high-severity vulnerabilities at scale, but stresses that this advantage may not last: "This is a moment to move quickly to empower defenders and secure as much code as possible while the window exists."
[2]
Anthropic's newest model excels at finding security vulnerabilities, but raises cybersecurity risks | Fortune
Frontier AI models are no longer merely helping engineers write code faster or automate routine tasks. They are increasingly capable of spotting their mistakes. Anthropic says its newest model, Claude Opus 4.6, excels at discovering the kinds of software weaknesses that underpin major cyberattacks.

According to a report from the company's Frontier Red Team, during testing, Opus 4.6 identified over 500 previously unknown zero-day vulnerabilities (flaws unknown to the people who wrote the software, or to the party responsible for patching them) across open-source software libraries. Notably, the model was not explicitly told to search for the security flaws; rather, it detected and flagged the issues on its own.

Anthropic says the "results show that language models can add real value on top of existing discovery tools," but acknowledged that the capabilities are also inherently "dual use." The same capabilities that help companies find and fix security flaws can just as easily be weaponized by attackers to discover and exploit the vulnerabilities before defenders can find them. An AI model that can autonomously identify zero-day exploits in widely used software could accelerate both sides of the cybersecurity arms race, potentially tipping the advantage toward whoever acts fastest.

Representatives for Anthropic did not immediately respond to a request for comment on the cybersecurity risks. However, Logan Graham, head of Anthropic's Frontier Red Team, told Axios that the company views cybersecurity as a competition between offense and defense, and wants to ensure defenders get access to these tools first.

To manage some of the risk, Anthropic is deploying new detection systems that monitor Claude's internal activity as it generates responses, using what the company calls "probes" to flag potential misuse in real time. The company says it is also expanding its enforcement capabilities, including the ability to block traffic identified as malicious. Anthropic acknowledges this approach will create friction for legitimate security researchers and defensive work, and has committed to collaborating with the security community to address those challenges. The safeguards, the company says, represent "a meaningful step forward" in detecting and responding to misuse quickly, though the work is ongoing.

OpenAI, in contrast, has taken a more cautious approach with its new coding model, GPT-5.3-Codex, also released on Thursday. The company has emphasized that while the model is a step up in coding performance, serious cybersecurity risks come with those gains. OpenAI CEO Sam Altman said in a post on X that GPT-5.3-Codex is the first model to be rated "high" for cybersecurity risk under the company's internal preparedness framework.

As a result, OpenAI is rolling out GPT-5.3-Codex with tighter controls. While the model is available to paid ChatGPT users for everyday development tasks, the company is delaying full API access and restricting high-risk use cases that could enable automation at scale. More sensitive applications are being gated behind additional safeguards, including a trusted-access program for vetted security professionals.
OpenAI said in a blog post accompanying the launch that it does not yet have "definitive evidence" the model can fully automate cyberattacks but is taking a precautionary approach, deploying what it described as its most comprehensive cybersecurity safety stack to date, including enhanced monitoring, safety training, and enforcement mechanisms informed by threat intelligence.
Anthropic released Claude Opus 4.6, which discovered over 500 previously unknown high-severity security flaws in open-source libraries during testing. The large language model identified zero-day vulnerabilities without specialized prompting, using human-like reasoning to spot patterns in codebases. While the capability strengthens defenders, it also raises cybersecurity risks as the same technology could be weaponized by attackers.
Anthropic has launched Claude Opus 4.6, its most advanced large language model to date, with capabilities that extend far beyond routine coding assistance. During testing, the model identified over 500 previously unknown zero-day vulnerabilities across open-source software libraries, marking a significant leap in AI-powered security analysis [1][2]. What sets Claude Opus 4.6 apart is its ability to find high-severity security flaws "out of the box without task-specific tooling, custom scaffolding, or specialized prompting," according to Anthropic [1]. The model wasn't explicitly instructed to hunt for vulnerabilities but detected and flagged issues autonomously during routine operations [2].
Unlike traditional fuzzing, Claude Opus 4.6 employs human-like reasoning to analyze code. The model examines past fixes to identify similar bugs that weren't addressed, spots patterns that typically cause problems, and understands logic deeply enough to determine exactly what input would break it [1]. Even when applied to some of the most well-tested codebases, projects that have had fuzzers running against them for years, the model still managed to uncover high-severity flaws that had been flying under the radar for "decades" [1]. Anthropic's Frontier Red Team report confirms that "language models can add real value on top of existing discovery tools," demonstrating how frontier AI models are evolving beyond simple automation [2].
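The distinction matters in practice: a fuzzer mutates inputs more or less blindly, so a bug guarded by a narrow condition can survive years of fuzzing, while someone who reads the logic can derive the triggering input directly. The toy parser below is a hypothetical sketch invented purely for illustration (it is not drawn from any codebase Anthropic tested, and the bug is deliberately simple):

```python
import random

random.seed(0)

def parse_chunk(data: bytes) -> bytes:
    """Toy parser: a 2-byte big-endian length header, then a payload.

    Hypothetical bug: when the header says length == 0, the truncation
    guard is skipped entirely, so any input starting with \\x00\\x00
    parses "successfully" as an empty record no matter what trailing
    bytes follow.
    """
    if len(data) < 2:
        raise ValueError("short input")
    length = int.from_bytes(data[:2], "big")
    payload = data[2:2 + length]
    if length != 0 and len(payload) != length:
        raise ValueError("truncated payload")  # rejects almost all random inputs
    return payload

# Blind fuzzing: random bytes almost never satisfy the header/payload
# relationship, so the buggy state is hit only by luck (about 1 in
# 65,536 inputs here); a subtler guard could hide it for years.
hits = 0
for _ in range(100_000):
    blob = bytes(random.randrange(256) for _ in range(random.randrange(1, 16)))
    try:
        if len(blob) > 2 and parse_chunk(blob) == b"":
            hits += 1
    except ValueError:
        pass
print("fuzzer hits in 100k tries:", hits)

# Reasoning about the logic instead: the guard reads `length != 0 and ...`,
# so length == 0 bypasses it. That one observation yields the exact
# triggering input immediately.
print("crafted input parses as:", parse_chunk(b"\x00\x00" + b"trailing junk"))
```

Reading the guard condition yields the malicious input in one step, the kind of "know exactly what input would break it" reasoning Anthropic describes, whereas the random search stumbles on it only a handful of times in a hundred thousand attempts.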
Anthropic deliberately focused its efforts on open-source software because it runs "everywhere, from enterprise systems to critical infrastructure," meaning security vulnerabilities in this software ripple across the internet [1]. Much of today's popular open-source software is maintained by small teams of volunteers with severely limited resources, making AI assistance particularly valuable. The company reports that it has "found and validated more than 500 high-severity vulnerabilities" and has begun reporting them to maintainers, with initial patches already landing [1]. Anthropic continues working with maintainers to address the remaining issues, emphasizing the urgency of patching flaws while AI can still deliver at scale.

While Claude Opus 4.6's capabilities strengthen defenders, Anthropic acknowledges the technology is inherently "dual use" [2]. The same capabilities that help companies find and fix security vulnerabilities can be weaponized by attackers to discover and exploit flaws before defenders can respond. An LLM that can autonomously identify exploits in widely used software could accelerate both sides of the cybersecurity arms race, potentially tipping the advantage toward whoever acts fastest [2]. Logan Graham, head of Anthropic's Frontier Red Team, told Axios the company views cybersecurity as a competition between offense and defense, and wants to ensure defenders get access to these tools first [2].
To manage misuse risks, Anthropic is deploying new detection systems that monitor Claude's internal activity as it generates responses, using "probes" to flag potential misuse in real time [2]. The company is also expanding its enforcement capabilities, including the ability to block traffic identified as malicious. Anthropic acknowledges this approach will create friction for legitimate security researchers and defensive work, and has committed to collaborating with the security community to address those challenges [2]. The company describes these safeguards as "a meaningful step forward" in detecting and responding to misuse quickly, though the work remains ongoing.
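Anthropic has not published how these probes work. In interpretability research, a "probe" usually means a small linear classifier trained on a model's internal activations to predict some property of the current request. The sketch below shows that general idea on synthetic data; it is an assumption about the technique, not Anthropic's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each request yields a hidden-state vector taken
# from inside the model while it generates a response. Here we fabricate
# those activations; in a real system they would come from the
# transformer's residual stream, labeled benign or misuse.
DIM = 512
benign = rng.normal(0.0, 1.0, size=(1000, DIM))
misuse = rng.normal(0.3, 1.0, size=(1000, DIM))  # slightly shifted distribution

X = np.vstack([benign, misuse])
y = np.concatenate([np.zeros(1000), np.ones(1000)])

# A linear probe is logistic regression on activations: fit w, b so
# that sigmoid(X @ w + b) predicts the misuse label.
w = np.zeros(DIM)
b = 0.0
lr = 0.1
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted misuse probability
    w -= lr * (X.T @ (p - y)) / len(y)       # gradient step on the weights
    b -= lr * float(np.mean(p - y))          # gradient step on the bias

# At serving time the probe scores each new activation vector; scores
# above a threshold get flagged for review or blocking.
def probe_score(activation: np.ndarray) -> float:
    return float(1.0 / (1.0 + np.exp(-(activation @ w + b))))

print("benign-like score:", probe_score(rng.normal(0.0, 1.0, DIM)))
print("misuse-like score:", probe_score(rng.normal(0.3, 1.0, DIM)))
```

Because scoring is a single matrix-vector product per request, a probe of this shape can run alongside generation with negligible overhead, which is consistent with the real-time flagging the company describes.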
In contrast, OpenAI has adopted a more cautious stance with its new coding model, GPT-5.3-Codex, also released Thursday. OpenAI CEO Sam Altman announced that GPT-5.3-Codex is the first model to be rated "high" for cybersecurity risk under the company's internal preparedness framework [2]. While the model is available to paid ChatGPT users for everyday development tasks, OpenAI is delaying full API access and restricting high-risk use cases that could enable automation at scale. More sensitive applications are being gated behind additional safeguards, including a trusted-access program for vetted security professionals. OpenAI deployed what it described as its most comprehensive cybersecurity safety stack to date, including enhanced monitoring, safety training, and enforcement mechanisms informed by threat intelligence [2].
Anthropic emphasizes that while AI models can now find high-severity vulnerabilities at scale, this advantage may not persist. "This is a moment to move quickly to empower defenders and secure as much code as possible while the window exists," the company warns [1]. The statement suggests that as these capabilities become more widely available, the current defensive advantage could evaporate, making immediate action critical for organizations relying on open-source software in their infrastructure.