Fake AI Agent Skill Bypassed Security Scanners

Security Firm Exposes Critical Gap in AI Agent Security

Security firm AIR built a fake AI agent skill that bypassed multiple security scanners and reportedly reached approximately 26,000 agents, including some on corporate accounts [1]. Every skill security scanner the firm tested marked it safe, exposing a gap that affects the wider AI agent ecosystem. The payload was harmless by design, collecting only user email addresses, but the experiment demonstrated how real attackers could use the same method to read files, move data, or access internal systems.

The fake skill, named brand-landingpage, claimed to build landing pages using Google's Stitch design tool and targeted non-technical users such as marketers, salespeople, and designers [2]. A skill is a bundle of instructions that an agent loads into its own context and follows with roughly the authority of a user prompt, which makes the trust model particularly dangerous when a skill is compromised.

Source: Hacker News

How the Fake AI Agent Skill Passed Security Scanners

To make the skill look credible, AIR exploited two trust signals: GitHub stars and clean scanner verdicts. The firm opened a pull request to a skill marketplace repository with around 36,000 stars and 156 skills [1]. The pull request was merged within a few days, and the brand-landingpage skill inherited the repository's star count, instantly appearing trustworthy.

The scanners AIR tested—including tools from Cisco, NVIDIA, and those built into skills.sh—analyze only the package submitted to them: the SKILL.md file and shipped components. AIR's skill carried no malicious setup instructions in the submitted package. Instead, it instructed agents to install the "Stitch SDK" by following documentation at stitch-design.ai, a domain AIR controls, not the genuine Google domain at stitch.withgoogle.com.
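The blind spot described above can be illustrated with a minimal sketch of a check that looks at what a skill points to, not only at what ships inside it. This is not how any of the named scanners work; the function names and the allowlist are hypothetical, and the sketch assumes the skill text contains full URLs with an http or https scheme.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real one would be maintained by the reviewing team.
TRUSTED_HOSTS = {"stitch.withgoogle.com"}

# Matches http(s) URLs up to whitespace or a common closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s)>\"']+")


def external_hosts(skill_text: str) -> set[str]:
    """Collect the hostnames of every URL mentioned in the skill text."""
    hosts = set()
    for url in URL_PATTERN.findall(skill_text):
        # Drop sentence punctuation that the pattern may have swallowed.
        host = urlparse(url.rstrip(".,;")).hostname
        if host:
            hosts.add(host)
    return hosts


def untrusted_hosts(skill_text: str, trusted: set[str] = TRUSTED_HOSTS) -> set[str]:
    """Return referenced hosts that are not on the allowlist."""
    return {host for host in external_hosts(skill_text) if host not in trusted}
```

Run against instructions that reference both domains from the article, a check like this would flag stitch-design.ai and pass stitch.withgoogle.com. It says nothing about what the flagged page contains, only that a reviewer should look at it.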

External Links Enable Post-Approval Payload Swaps

Initially, the external links pointed to legitimate Stitch documentation, so security scanners saw a clean package and cleared it [2]. The page agents would actually fetch sat outside the scan's scope. Once the skill was widely installed, AIR swapped the page behind that link for one instructing agents to download and run a script. This is a structural weakness in AI skill marketplaces: a scan happens once, but the pages a skill references can be rewritten at any time afterward.
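One way to notice a swap of this kind is to record a fingerprint of each referenced page at review time and compare it later. The following is a minimal sketch under stated assumptions: the function names are hypothetical, fetching is left to the caller (page bodies are passed in as bytes), and any change to a page, benign or not, counts as a reason to re-review.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of a page body as it looked at review time."""
    return hashlib.sha256(content).hexdigest()


def changed_since_review(pinned: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """Return URLs whose current body is missing or no longer matches its pinned hash."""
    changed = []
    for url, pinned_hash in pinned.items():
        body = current.get(url)
        if body is None or fingerprint(body) != pinned_hash:
            changed.append(url)
    return sorted(changed)
```

Pages that change legitimately will also trip this comparison, so it works as a trigger for another look, not as a verdict on whether the new content is malicious.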

Anthropic's own documentation already warns that skills fetching external URLs carry risk for exactly this reason, since content can change after vetting. Separate research this year found that seven major scanners agree on fewer than one in five hundred of their combined flags, because each judges skills in isolation, blind to external links and post-review changes [2].

Pattern Mirrors Previous Research and Real Attacks

AIR is not the first to demonstrate this vulnerability in AI agent skills. Three weeks earlier, Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and all three scanners built into major skill registries. Trail of Bits concluded that scanners check fixed packages while attackers can keep tweaking payloads until they pass. Real campaigns have used this technique for months, keeping submitted skills clean while hosting malicious payloads on sites that agents fetch only at install time [2].

What Security Teams Need to Know

The scale figures come from AIR alone and warrant scrutiny. The company is launching a managed skill marketplace and closes its write-up by pitching the service, so the 26,000 figure, the corporate accounts detail, and the claims of potential full agent control remain unconfirmed. The method, however, holds up: the named scanners do judge only submitted packages, the external-link blind spot is real and independently demonstrated, and the trust signals AIR exploited (GitHub stars and clean scans) are exactly what the ecosystem treats as proof of safety.

Defenders must treat skills as software, not text, and vet what skills point to, not just what ships inside them [2]. Organizations should route new skills through a single controlled source, re-check them when anything changes, pin versions, and apply least-privilege access to AI agents. Most concerning, many of these add-ons are installed with no review, so the first task is discovering what is already running. A clean result at install does not stay clean if the skill references external links that someone else can edit.
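The discovery and re-check steps above could start from something as small as an inventory of installed skills. The sketch below is illustrative only: it assumes skills live as directories under a single root and that each contains a SKILL.md file, and the function name is hypothetical. Real agent installations may store skills elsewhere.

```python
import hashlib
from pathlib import Path


def inventory_skills(root: Path) -> dict[str, str]:
    """Map each skill directory (relative to root) to the SHA-256 of its SKILL.md."""
    inventory = {}
    for skill_file in sorted(root.rglob("SKILL.md")):
        rel_dir = skill_file.parent.relative_to(root).as_posix()
        inventory[rel_dir] = hashlib.sha256(skill_file.read_bytes()).hexdigest()
    return inventory
```

Comparing two inventories taken at different times shows which skills were added or edited in between. It does not cover pages a skill links to, which change without touching any local file.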

References

[1] Fake AI Agent Skill Passed Security Scans and Reportedly Reached 26,000 Agents

[2] A fake AI agent skill passed every security scanner and reportedly reached 26,000 agents
