Agentic AI finds 10,000+ vulnerabilities as security gaps widen and adversaries close in

10 Sources

Share

Frontier AI models like Claude Mythos are discovering thousands of security flaws across major systems, but the same capabilities are being distilled by adversaries. Meanwhile, widely-adopted AI agents like OpenClaw are operating with broad authority across enterprise systems, often without IT oversight. As 74% of companies plan to deploy agentic AI within two years, only 21% have mature governance models in place.

Frontier Models Uncover Massive Security Flaws

AI security has entered a critical phase as frontier models demonstrate unprecedented capability to identify vulnerabilities at scale. Anthropic's Claude Mythos Preview has surfaced more more than 10,000 high-severity vulnerabilities through Project Glasswing, scanning every major operating system and web browser

2

. Some of these security flaws had survived decades of human review. The program has expanded to approximately 150 organizations across more than 15 countries, including Samsung, SK Hynex, NATO, and the EU's cybersecurity agency ENISA. The bottleneck has already shifted from finding vulnerabilities to patching them fast enough, but this defensive success story carries a darker implication: the same class of model is accessible to adversaries who have no interest in patching anything.

Source: TechRadar

Source: TechRadar

China's Industrial-Scale Distillation Campaign

While frontier AI models strengthen defense capabilities, adversaries are rapidly closing the gap through systematic extraction efforts. The White House released a policy memorandum in April accusing China of conducting deliberate, industrial-scale campaigns to extract frontier AI capabilities from American labs

2

. Distillation doesn't require stealing model weights. Instead, attackers feed thousands of carefully constructed queries to a frontier model, collect the responses, and use them to train a cheaper rival that approximates the original at a fraction of the cost. Anthropic identified approximately 24,000 fraudulent accounts and 16 million total exchanges across three Chinese laboratories: DeepSeek conducted more than 150,000 exchanges with Claude focused on foundational logic and alignment techniques, MiniMax generated over 13 million exchanges, and Moonshot AI produced more than 3.4 million targeting agentic reasoning, coding, and computer vision. By early April, OpenAI, Anthropic, and Google had begun sharing distillation threat intelligence through the Frontier Model Forum.

First Weaponized Zero-Day Exploit Deployed

In May, Google's Threat Intelligence Group confirmed the first known case of an AI system discovering and weaponizing a zero-day exploit that was then deployed in the wild

2

. A criminal actor used a frontier model to find a two-factor authentication bypass, build a working exploit, and use it before any defender knew the vulnerability existed. That single incident compressed what used to take skilled hackers weeks into a process measured in hours. It represents the clearest illustration yet of the dual-use problem at the heart of frontier AI: the same capabilities that help organizations defend themselves can generate an equivalent number of exploits in the wrong hands. Malicious actors are already using frontier-class models to develop exploits without waiting for anyone's permission.

Source: TechRadar

Source: TechRadar

OpenClaw Breach Exposes Agentic AI Security Gaps

AI adoption risks extend beyond frontier models to the autonomous agents now operating across enterprise systems. OpenClaw, a widely used AI agent, became a case study in AI agent security failures when researchers uncovered a vulnerability that allowed any website visited by a developer to silently take control of the local agent

4

. No extensions, plugins, or user action were required. The attack exploited OpenClaw's local WebSocket gateway, allowing malicious JavaScript to connect to the gateway, brute-force the password, and register as a trusted device. Once authenticated, the attacker could access configuration data, enumerate connected nodes, read logs, and execute commands across connected systems. Compromising a single AI agent could effectively compromise an entire workstation. OpenClaw maintainers issued a fix within 24 hours, but the vulnerability highlights a systemic risk.

Source: TechRadar

Source: TechRadar

Shadow AI Creates Ungoverned Attack Surface

Many AI agents are deployed without IT or security awareness, creating what experts call shadow AI

4

. These autonomous systems exist on developer machines, storing credentials, connecting to messaging platforms, and executing actions independently. According to Deloitte, 74% of companies plan to deploy agentic AI within two years, while only 21% have a mature governance model in place. PwC finds that 79% of organizations have already deployed AI agents at some level. This gap between deployment and governance is precisely what makes agents like OpenClaw so dangerous. They operate with broad authority, holding credentials, executing commands, and connecting across systems without oversight. AI expanding the attack surface means security teams now face vulnerabilities such as prompt injection, data leakage, and adversarial inputs that can manipulate models in ways that bypass conventional security controls.

Defense Networks Require Secure IT Infrastructure

Securing AI systems in classified environments presents additional complexity. As the U.S. government moves to deploy AI capabilities on classified networks following the White House's June 2 Executive Order on Advanced AI and Security, the opportunity is clear: advanced AI can help accelerate decision superiority for American forces

1

. But the risks are expanding just as quickly, particularly as agentic AI begins to operate across sensitive networks, data environments, and mission workflows. AI is only as trustworthy as the data it uses, the networks it touches, and the controls that determine who and what can access it. In classified environments, that challenge is compounded by the need to move information securely across classification levels, compartments, coalition boundaries, and operational environments. Anthropic has embedded approximately six engineers inside the NSA to adapt Mythos for operational applications, according to reporting from TechTimes, with the model potentially used for offensive cyber operations targeting networks in countries including China and Iran.

Continuous Testing and Adaptive Security Models

Mitigating AI risks requires fundamentally different approaches than traditional security methods. Static checks, periodic penetration tests, and basic vulnerability scans were not built for this pace of change

3

. They can miss issues that only appear when applications are tested dynamically, in context, and from an attacker's perspective. Platforms like XBOW are enabling continuous pentesting and autonomous offensive security to help teams find, validate, and prioritize exploitable vulnerabilities before attackers do. Agentic testing uses AI itself to simulate sophisticated, real-world attacks both persistently and realistically. These autonomous agents don't follow a script but learn from the system's response, adapting their tactics and relentlessly probing for weak points. An agentic test might use a series of subtly crafted, conversational prompts to trick a Large Language Model into revealing sensitive data or ignoring its built-in safety rules.

Simplifying Security Through Better Access Control

Complexity itself has become a primary security vulnerability. Agent permissions illustrate where this plays out in AI systems

5

. Employees accumulate numerous permissions over time, and while humans know which access is relevant to a task, agents lack that judgment. An agent assigned to a problem will probe every available path, creating a potential attack surface far larger than the task required. What's needed is a permissioning model built around intent, where the agent has only the credentials it needs for a specific task, and they expire when it's done. Standards like OAuth are evolving to support agentic AI, allowing agents to carry identities scoped to a specific task rather than a user's full permission set. Organizations must gain visibility by inventorying AI agents, autonomous assistants, and local LLM servers across developer environments. They should scope access carefully, audit privileges, and enforce least privilege wherever possible. Treating agents as non-human identities with rigorous governance becomes essential.

Policy Response Falls Short of the Challenge

On June 2, the Trump administration signed an executive order asking AI companies to voluntarily submit frontier models for government cybersecurity testing up to 30 days before public release

2

. The order was originally drafted with a 90-day window, but the White House pulled it in May over concerns it would blunt US competitiveness against China, then cut the period to 30 days in the final version. The word "voluntarily" is the operative constraint: no company is legally required to participate, and the order gives the government no power to block a release. The structural problem is clear. US frontier models are the best in the world at finding software vulnerabilities, but Chinese labs are distilling those same capabilities, months behind but closing. If the US restricts access to protect the models, it slows diffusion of defensive tools to allies. If it doesn't restrict access, it accelerates the transfer of offensive capabilities to adversaries. Anthropic says it does not plan to make Mythos generally available until cybersecurity safeguards can detect and block the model's most dangerous outputs, but those safeguards don't yet exist.

Today's Top Stories

© 2026 TheOutpost.AI All rights reserved