2 Sources
[1]
Fake AI Agent Skill Passed Security Scans and Reportedly Reached 26,000 Agents
Security firm AIR built a fake AI agent skill, pushed it through a popular skill marketplace and an Instagram ad, and says it reached roughly 26,000 agents, including some on corporate accounts. Every skill security scanner the firm tested it against marked it safe. The payload was harmless by design: it collected the user's email address and did nothing else. The point was to show that none of the signals people lean on to trust a skill caught it: not the scanners, not the GitHub stars, not the open-source reputation. A skill is a bundle of instructions an agent loads into its own context and follows with roughly the authority of a user prompt. That trust is the whole problem, and it is the reason skill-scanning tools exist in the first place. The skill, named brand-landingpage, claimed to build a landing page using Google's Stitch design tool, aimed squarely at non-technical users. To make it look credible, AIR went after two trust signals: GitHub stars and a clean scanner verdict. For the stars, it opened a pull request to a skill marketplace repository with around 36,000 stars and 156 skills. The pull request was merged after a few days, so the skill inherited the repo's count. Then it ran an Instagram ad aimed at marketers, salespeople, and designers, who installed it and put it to work. Why the scanners missed it The scanners AIR tested analyze the package you hand them: the SKILL.md and the files shipped with it. That's Cisco's, NVIDIA's, and the ones wired into skills.sh. AIR's skill carried no setup instructions of its own. It told the agent to install the "Stitch SDK" by following the documentation at an external link, stitch-design.ai, a domain AIR controls, not Google (the real Stitch lives at stitch.withgoogle.com). At first, the link led to the genuine Stitch docs, so the scanners, seeing a clean package that pointed at a plausible setup page, cleared it. The page the agent would actually fetch and follow sat outside the scan. Once the skill was installed widely, AIR swapped the page behind that link. The new version told the agent to download and run a script. In the demo, it only mailed the user's address back to AIR, which is how the firm counted the agents it reached. A real operator could have used that foothold to read files, move data, or hit internal systems, bounded only by what the agent could reach. AIR is not the first to show this. Three weeks earlier, Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and all three scanners wired into skills.sh. Its conclusion was blunt: a scanner checks a fixed package, while an attacker can keep tweaking the payload until it passes. Real campaigns have used the same trick for months, keeping the submitted skill clean and hosting the payload on a site the agent only fetches at install. The problem is structural: the scan happens once, but the page a skill points the agent to can be rewritten at any time after. Anthropic's own docs already warn that skills fetching external URLs are risky for exactly this reason, since the content can change after the skill is vetted. Separate research this year found scanners often disagree, because each one judges a skill in isolation, blind to its external links and to what changes after review. What to do The read for defenders is the same one researchers keep landing on, now with a sharper example behind it. Treat skills as software, not text. Vet what a skill points to, not just what ships inside it. Most of these add-ons got installed with no review, so the first job is finding what is already running. Route new skills through a single source you control, and re-check them when anything changes, because a clean result at install does not stay clean if the skill phones out to a link someone else can edit. Pin versions. Hold agents to the least privilege. Assume any external instruction an agent fetches runs with the agent's access. The scale figures come from AIR alone, and they deserve a skeptical read. The firm is launching a managed skill marketplace and closes the write-up, pitching it, so the 26,000 number, the corporate-account detail, and the claim that it could have seized full control of every agent are the company's own and are not independently confirmed. What holds up is the method. The named scanners really do judge only the submitted package, the external-link blind spot is real and has been independently demonstrated, and the trust signals AIR borrowed, stars, and a clean scan are exactly the ones the ecosystem still treats as proof. The experiment does not expose a new bug so much as it lines up every weak trust signal around agent skills into one run: stars that can be borrowed, a scan that reads a snapshot, and a link that can be rewritten after the check clears. Whether the real figure is 26,000 or a fraction of it, the gap it walks through is one that defenders still have not closed.
[2]
A fake AI agent skill passed every security scanner and reportedly reached 26,000 agents
Security firm AIR got a fake skill past every major scanner and says it reached 26,000 agents by swapping an external URL after the scan cleared. Security firm AIR built a fake AI agent skill, pushed it through a popular skill marketplace and promoted it with an Instagram ad, and says it reached roughly 26,000 agents, including some on corporate accounts. Every skill security scanner the firm tested it against marked it safe. The payload was harmless by design, collecting only the user's email address, but AIR says a real attacker could have used the same foothold to read files, move data, or hit internal systems. The skill, called brand-landingpage, claimed to build a landing page using Google's Stitch design tool and was aimed at non-technical users. To make it look credible, AIR went after two trust signals that the ecosystem still treats as proof of safety: GitHub stars and a clean scanner verdict. For the stars, it opened a pull request to a skill marketplace repository with around 36,000 stars and 156 skills. The pull request was merged after a few days, so the skill inherited the repository's star count. Then AIR ran an Instagram ad targeting marketers, salespeople, and designers, who installed it and put it to work. The scanners AIR tested analyse the package you hand them, meaning the skill definition file and anything shipped with it. That includes tools from Cisco, NVIDIA, and the ones built into the major skill registries. AIR's skill carried no malicious setup instructions of its own but told the agent to install the "Stitch SDK" by following documentation at an external link it controlled, not the genuine Google domain. At first, the link led to the real Stitch documentation, so the scanners saw a clean package pointing at a plausible setup page and cleared it. The page the agent would actually fetch and follow sat outside the scan. Once the skill was installed widely, AIR swapped the page behind that link to one that told the agent to download and run a script. The technique is not new. Three weeks before AIR published its results, Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and all three scanners built into the major skill registries. Its conclusion was that a scanner checks a fixed package while an attacker can keep tweaking the payload until it passes. Real campaigns have used the same trick for months, keeping the submitted skill clean and hosting the payload on a site the agent only fetches at install time. The problem is structural. The scan happens once, but the page a skill points the agent to can be rewritten at any time afterward. Anthropic's own documentation warns that skills fetching external URLs are risky for exactly this reason, since the content can change after the skill is vetted. Separate research this year found that seven major scanners agree on fewer than one in five hundred of their combined flags, because each one judges a skill in isolation, blind to external links and to what changes after review. The scale figures come from AIR alone and deserve a sceptical read. The firm is launching a managed skill marketplace and closes its write-up pitching it, so the 26,000 number, the corporate-account detail, and the claim that it could have seized full control of every agent are not independently confirmed. What holds up is the method: the named scanners really do judge only the submitted package, the external-link blind spot is real and has been independently demonstrated, and the trust signals AIR borrowed, stars and a clean scan, are exactly the ones the ecosystem still treats as proof. The experiment lines up every weak trust signal around agent skills into one run: stars that can be borrowed, a scan that reads a snapshot, and a link that can be rewritten after the check clears. Whether the real figure is 26,000 or a fraction of it, the gap it walks through is one that defenders still have not closed. For security teams, the immediate takeaway is the same one researchers keep landing on: treat skills as software, not text, and vet what a skill points to, not just what ships inside it. Route new skills through a single source you control, re-check them when anything changes, pin versions, and hold agents to the least privilege.
Share
Copy Link
Security firm AIR created a fake AI agent skill that passed every major security scanner and reportedly reached 26,000 agents, including corporate accounts. The experiment exposed critical flaws in how AI agent skills are vetted, showing that scanners only check submitted packages while attackers can modify external links after approval. The demonstration highlights urgent security gaps in the rapidly growing AI agent ecosystem.
AIR security firm built a fake AI agent skill that bypassed multiple security scanners and reportedly reached approximately 26,000 agents, including some on corporate accounts
1
. Every skill security scanner the firm tested marked it safe, exposing a security vulnerability that threatens the entire AI agent ecosystem. The payload was harmless by design, collecting only user email addresses, but the experiment demonstrated how real attackers could exploit the same method to read files, move data, or access internal systems.The fake AI skill, named brand-landingpage, claimed to build landing pages using Google's Stitch design tool and targeted non-technical users like marketers, salespeople, and designers
2
. A skill functions as a bundle of instructions an agent loads into its own context and follows with roughly the authority of a user prompt, making the trust model particularly dangerous when compromised.
Source: Hacker News
To make the malicious payload credible, AIR exploited two trust signals: GitHub stars and clean scanner verdicts. The firm opened a pull request to a skill marketplace repository with around 36,000 stars and 156 skills
1
. After the pull request was merged within a few days, the brand-landingpage skill inherited the repository's star count, instantly appearing trustworthy.The scanners AIR tested—including tools from Cisco, NVIDIA, and those built into skills.sh—analyze only the package submitted to them: the SKILL.md file and shipped components. AIR's skill carried no malicious setup instructions in the submitted package. Instead, it instructed agents to install the "Stitch SDK" by following documentation at stitch-design.ai, a domain AIR controls, not the genuine Google domain at stitch.withgoogle.com.
Initially, the external links pointed to legitimate Stitch documentation, so security scanners saw a clean package and cleared it
2
. The page agents would actually fetch sat outside the scan's scope. Once the fake AI skill was installed widely, AIR swapped the page behind that link to one instructing agents to download and run a script. This structural vulnerability in AI skill marketplaces means scans happen once, but pages a skill references can be rewritten anytime afterward.Anthropically's own documentation already warns that skills fetching external URLs carry risk for exactly this reason, since content can change after vetting. Separate research this year found that seven major scanners agree on fewer than one in five hundred of their combined flags, because each judges skills in isolation, blind to external links and post-review changes
2
.Related Stories
AIR is not the first to demonstrate this vulnerability in AI agent skills. Three weeks earlier, Trail of Bits bypassed ClawHub's malicious-skill detector, Cisco's scanner, and all three scanners built into major skill registries. Trail of Bits concluded that scanners check fixed packages while attackers can continuously tweak payloads until they pass. Real campaigns have used this technique for months, keeping submitted skills clean while hosting malicious payloads on sites agents only fetch at install time
2
.The scale figures come from AIR security firm alone and warrant scrutiny. The company is launching a managed skill marketplace and closes its write-up pitching the service, so the 26,000 number, corporate accounts detail, and claims of potential full agent control remain unconfirmed. However, the method holds up: named scanners genuinely judge only submitted packages, the external-link blind spot is real and independently demonstrated, and the trust signals AIR exploited—GitHub stars and clean scans—are exactly what the ecosystem treats as proof of safety.
Defenders must treat skills as software, not text, and vet what skills point to, not just what ships inside them
2
. Organizations should route new skills through a single controlled source, re-check them when anything changes, pin versions, and implement least-privilege access for AI agent security. Most concerning: many of these add-ons get installed with no review, so the first task is discovering what's already running. A clean result at install doesn't stay clean if the skill references external links someone else can edit.Summarized by
Navi
08 Mar 2026•Technology

29 May 2026•Technology

23 Dec 2025•Technology

1
Policy and Regulation

2
Policy and Regulation

3
Technology
