Fake AI agent skill bypassed every security scanner and reportedly reached 26,000 agents

Security firm AIR created a fake AI agent skill that passed every major security scanner and reportedly reached 26,000 agents, including corporate accounts. The experiment exposed critical flaws in how AI agent skills are vetted, showing that scanners only check submitted packages while attackers can modify external links after approval. The demonstration highlights urgent security gaps in the rapidly growing AI agent ecosystem.

Security Firm Exposes Critical Gap in AI Agent Security

AIR, a security firm, built a fake AI agent skill that bypassed multiple security scanners and reportedly reached approximately 26,000 agents, including some on corporate accounts [1]. Every skill security scanner the firm tested marked it safe, exposing a vulnerability that threatens the entire AI agent ecosystem. The payload was harmless by design, collecting only user email addresses, but the experiment demonstrated how real attackers could use the same method to read files, move data, or access internal systems.

The fake AI skill, named brand-landingpage, claimed to build landing pages using Google's Stitch design tool and targeted non-technical users like marketers, salespeople, and designers [2]. A skill functions as a bundle of instructions that an agent loads into its own context and follows with roughly the authority of a user prompt, which makes the trust model particularly dangerous when a skill is compromised.

Source: Hacker News

How the Fake AI Agent Skill Passed Security Scanners

To make the skill look credible, AIR exploited two trust signals: GitHub stars and clean scanner verdicts. The firm opened a pull request to a skill marketplace repository with around 36,000 stars and 156 skills [1]. The pull request was merged within a few days, and the brand-landingpage skill inherited the repository's star count, instantly appearing trustworthy.

The scanners AIR tested, including tools from Cisco and NVIDIA and those built into skills.sh, analyze only the package submitted to them: the SKILL.md file and the components shipped with it. AIR's skill carried no malicious setup instructions in the submitted package. Instead, it instructed agents to install the "Stitch SDK" by following documentation at stitch-design.ai, a domain AIR controls, rather than the genuine Google domain at stitch.withgoogle.com.
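The lookalike-domain step lends itself to a simple reviewer-side check. The following is a minimal Python sketch, not any scanner's actual logic: it pulls hostnames out of a skill's text and reports those missing from an allowlist. The `TRUSTED_HOSTS` set, the function names, and the sample instruction text are hypothetical, and the sketch assumes links are written with an explicit `http(s)://` scheme.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist: hosts a reviewer has decided to trust for this skill.
TRUSTED_HOSTS = {"stitch.withgoogle.com"}

# Matches http(s) URLs, stopping at whitespace or common Markdown delimiters.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]\"'`]+")


def external_hosts(skill_text: str) -> set[str]:
    """Return the hostnames of all URLs found in a skill's text."""
    hosts = set()
    for url in URL_PATTERN.findall(skill_text):
        # Drop sentence punctuation that the pattern may have captured.
        host = urlparse(url.rstrip(".,;:")).hostname
        if host:
            hosts.add(host.lower())
    return hosts


def untrusted_hosts(skill_text: str, trusted: set[str] = TRUSTED_HOSTS) -> set[str]:
    """Return referenced hosts that are not on the allowlist."""
    return external_hosts(skill_text) - trusted


# Illustrative text only; the real skill's wording is not given in the article.
sample = (
    "Install the Stitch SDK by following https://stitch-design.ai/docs/install. "
    "Background: https://stitch.withgoogle.com/."
)
flagged = untrusted_hosts(sample)
print(sorted(flagged))
```

A check like this only surfaces where a skill points. It says nothing about what those pages contain, and that second question is the one the experiment exploited.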

External Links Enable Post-Approval Payload Swaps

Initially, the external links pointed to legitimate Stitch documentation, so security scanners saw a clean package and cleared it [2]. The page agents would actually fetch sat outside the scan's scope. Once the fake AI skill was widely installed, AIR swapped the page behind that link for one instructing agents to download and run a script. This is the structural vulnerability in AI skill marketplaces: scans happen once, but the pages a skill references can be rewritten at any time afterward.
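One way to notice a post-approval swap is to record a digest of each referenced page at review time and compare it on later fetches. The Python sketch below illustrates that idea under stated assumptions: the `fetch` callable, the function names, and the in-memory `pages` dictionary standing in for the network are all hypothetical, and no tool named in this article is known to work this way.

```python
import hashlib
from typing import Callable, Iterable


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of some fetched content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def pin_references(urls: Iterable[str], fetch: Callable[[str], bytes]) -> dict[str, str]:
    """At review time, record a digest for every page the skill references."""
    return {url: sha256_hex(fetch(url)) for url in urls}


def changed_references(pins: dict[str, str], fetch: Callable[[str], bytes]) -> list[str]:
    """Later, return the URLs whose content no longer matches the pinned digest."""
    return [url for url, digest in pins.items() if sha256_hex(fetch(url)) != digest]


# In-memory stand-in for the network so the example runs offline.
docs_url = "https://stitch-design.ai/docs/install"
pages = {docs_url: b"Legitimate-looking install documentation"}

pins = pin_references(pages, pages.__getitem__)
before_swap = changed_references(pins, pages.__getitem__)

# Simulate the page being rewritten after approval.
pages[docs_url] = b"Download and run this script"
after_swap = changed_references(pins, pages.__getitem__)
print(before_swap, after_swap)
```

A digest mismatch is a signal to re-review, not proof of malice, since legitimate pages also change. The sketch would also miss a server that returns different content to different clients.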

Anthropic's own documentation already warns that skills fetching external URLs carry risk for exactly this reason: content can change after vetting. Separate research this year found that seven major scanners agree on fewer than one in five hundred of their combined flags, because each judges skills in isolation, blind to external links and post-review changes [2].

Pattern Mirrors Previous Research and Real Attacks

AIR is not the first to demonstrate this vulnerability in AI agent skills. Three weeks earlier, Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and all three scanners built into major skill registries. Trail of Bits concluded that scanners check fixed packages while attackers can continuously tweak payloads until they pass. Real campaigns have used this technique for months, keeping submitted skills clean while hosting malicious payloads on sites agents only fetch at install time [2].

What Security Teams Need to Know

The scale figures come from AIR alone and warrant scrutiny. The company is launching a managed skill marketplace and closes its write-up by pitching the service, so the 26,000 number, the corporate accounts detail, and the claims of potential full agent control remain unconfirmed. However, the method holds up: the named scanners do judge only submitted packages, the external-link blind spot is real and independently demonstrated, and the trust signals AIR exploited (GitHub stars and clean scans) are exactly what the ecosystem treats as proof of safety.

Defenders must treat skills as software, not text, and vet what skills point to, not just what ships inside them [2]. Organizations should route new skills through a single controlled source, re-check them when anything changes, pin versions, and implement least-privilege access for AI agents. Most concerning: many of these add-ons get installed with no review, so the first task is discovering what's already running. A clean result at install doesn't stay clean if the skill references external links someone else can edit.
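The discovery and re-check steps can be sketched as a small inventory script. This is a minimal Python sketch under stated assumptions: skills are taken to live as directories containing a `SKILL.md` file under some root folder, and the approved manifest (skill name mapped to digest) is a hypothetical format, not a standard one. The directory layout and function names are illustrative only.

```python
import hashlib
from pathlib import Path


def skill_digest(skill_dir: Path) -> str:
    """Hash every file in a skill directory, visiting files in a stable order."""
    h = hashlib.sha256()
    for path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        # Include the relative path so renames change the digest too.
        h.update(path.relative_to(skill_dir).as_posix().encode("utf-8"))
        h.update(b"\0")
        h.update(path.read_bytes())
        h.update(b"\0")
    return h.hexdigest()


def inventory(root: Path) -> dict[str, str]:
    """Map each directory under root that contains a SKILL.md to its digest."""
    return {
        skill_md.parent.relative_to(root).as_posix(): skill_digest(skill_md.parent)
        for skill_md in sorted(root.rglob("SKILL.md"))
    }


def unreviewed(found: dict[str, str], approved: dict[str, str]) -> list[str]:
    """Return skills that are missing from the approved manifest or differ from it."""
    return sorted(name for name, digest in found.items() if approved.get(name) != digest)
```

Calling `unreviewed(inventory(root), approved)` on a machine's skills folder would list anything installed without review or changed since approval. It covers only what ships inside the skill, so pages the skill links to still need a separate check.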

© 2026 TheOutpost.AI All rights reserved