Agentjacking attack tricks AI coding agents into executing malicious code via fake bug reports

Cybersecurity researchers at Tenet Security have uncovered Agentjacking, a novel attack method that hijacks AI coding agents through fake Sentry error reports. The attack abuses the Model Context Protocol (MCP) to execute malicious code on developer machines, succeeding in 85% of the researchers' tests and bypassing traditional security measures such as EDR and firewalls. At least 2,388 organizations are exposed through vulnerable Sentry DSNs.

Tenet Security Uncovers Critical Flaw in AI Coding Agents

Cybersecurity researchers at Tenet Security have disclosed Agentjacking, a novel attack method that hijacks AI coding agents and tricks them into running malicious code on developer machines [1]. The attack requires no malware, stolen passwords, or infrastructure breaches, instead weaponizing the trust relationship between developers and their AI assistants. Security researchers Ron Bobrov, Barak Sternberg, and Nevo Poran demonstrated how a fake bug report can transform coding agents like Claude Code and Cursor into unwitting accomplices in code execution attacks [2].

Source: Hacker News

How Agentjacking Exploits Model Context Protocol and Sentry

The attack exploits the Model Context Protocol (MCP), the standard that allows AI coding agents to connect with external services. At its core lies a critical architectural flaw in how the Sentry error-tracking system handles data ingestion. Sentry accepts arbitrary payloads from anyone possessing a Data Source Name (DSN), a public, write-only credential embedded in websites by design. Attackers send malicious error events to Sentry's ingest endpoint via POST requests, embedding carefully formatted markdown in message fields and context key names. When AI agents query Sentry through MCP, they receive these injected events and interpret them as legitimate diagnostic guidance, unable to distinguish between genuine application crashes and attacker-planted instructions [2].
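Because a DSN is public by design, a reasonable first step for defenders is simply knowing which DSNs their own front-end code exposes. The following is a minimal sketch, assuming the modern DSN form `https://<public_key>@<host>/<project_id>`. The function names `parse_dsn` and `find_dsns` are hypothetical, and the regular expression is a heuristic that ignores legacy DSNs carrying a secret key.

```python
import re
from urllib.parse import urlsplit

# Heuristic pattern for DSN-like URLs: scheme://<public_key>@<host>/<project_id>.
# Assumption: modern DSNs without a secret key component.
DSN_PATTERN = re.compile(r"https?://[0-9A-Za-z]+@[0-9A-Za-z.-]+(?::\d+)?/\d+")


def parse_dsn(dsn: str) -> dict:
    """Split a DSN into the parts that identify a project."""
    parts = urlsplit(dsn)
    return {
        "public_key": parts.username,                   # the public "credential"
        "host": parts.hostname,                         # ingest host events are posted to
        "project_id": parts.path.rsplit("/", 1)[-1],    # numeric project identifier
    }


def find_dsns(text: str) -> list:
    """Return the parsed form of every DSN-like string found in `text`."""
    return [parse_dsn(match) for match in DSN_PATTERN.findall(text)]


# Illustrative input: a front-end snippet with a placeholder DSN.
sample = 'Sentry.init({ dsn: "https://examplePublicKey@o0.ingest.sentry.io/0" });'
found = find_dsns(sample)
print(found)
```

Running a scan like this over shipped JavaScript bundles gives an inventory of the DSNs anyone on the internet can read, and therefore of the projects anyone can write events into.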

Execute Malicious Code Through Trusted Developer Workflows

When developers ask their AI coding agents to "fix unresolved Sentry issues," the agents process what appears to be structured system output but actually contains malicious commands disguised as resolution steps. The agents then execute malicious code with full developer privileges, accessing sensitive data including environment variables, Git credentials, AWS keys, GitHub tokens, and private repository URLs [1]. "The attacker never touches the victim's infrastructure," the researchers explained. "The malicious instruction arrives disguised as a legitimate 'Resolution' inside an ordinary error" [1]. Those stolen credentials open pathways to CI/CD pipelines and cloud infrastructure, amplifying the potential damage far beyond initial access.
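One way to limit what a hijacked agent can read is to start it with a reduced environment rather than the developer's full shell environment. The sketch below is an illustration under stated assumptions, not a measure described by the researchers: `scrubbed_env` and the `SECRET_NAME` pattern are hypothetical, and name-based filtering only covers environment variables, not credentials stored on disk.

```python
import re

# Assumption: secrets can be recognized by common substrings in variable names.
SECRET_NAME = re.compile(
    r"(TOKEN|SECRET|PASSWORD|CREDENTIAL|API_?KEY|ACCESS_KEY)", re.IGNORECASE
)


def scrubbed_env(env: dict) -> dict:
    """Return a copy of `env` without variables whose names look secret-bearing."""
    return {name: value for name, value in env.items() if not SECRET_NAME.search(name)}


# Illustrative environment with placeholder values.
example = {
    "PATH": "/usr/bin",
    "HOME": "/home/dev",
    "AWS_SECRET_ACCESS_KEY": "placeholder",
    "GITHUB_TOKEN": "placeholder",
}
clean = scrubbed_env(example)
print(sorted(clean))

# Typical use (not executed here): pass the reduced mapping when launching a tool,
# e.g. subprocess.run(cmd, env=scrubbed_env(dict(os.environ)))
```

This narrows the blast radius but does not stop the injected command from running, so it complements rather than replaces controls on what the agent is allowed to execute.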

Widespread Exposure Across Organizations

Tenet Security identified at least 2,388 organizations exposed through vulnerable Sentry DSNs, ranging from $250 billion enterprises to solo developers, including cloud-security vendors [2]. In controlled testing against over 100 organizations, researchers achieved an 85% exploitation success rate across widely used AI coding assistants [1]. The attack surface extends beyond Sentry to any external data source that agents treat as trusted, including support tickets, GitHub issues, and documentation systems.

Traditional Security Measures Fail to Detect Agentjacking

The attack bypasses EDR, WAF, IAM, VPN, Cloudflare, and firewalls because every action in the chain appears authorized [1]. Tenet calls this the "Authorised Intent Chain": a sequence where nothing appears malicious to traditional security tools. Even prompt engineering offers no protection; agents executed the code even when explicitly instructed to ignore untrusted data [2]. This reveals a fundamental gap in how AI agents evaluate trust and make execution decisions.

Sentry Declines Root Fix, Highlights Industry Challenge

Sentry acknowledged the issue on June 3 but declined to implement a root-level fix, stating it's "technically not defensible." Instead, the company activated a global content filter blocking a specific payload string, a measure that addresses symptoms rather than underlying causes [1]. This response highlights a broader industry challenge: as enterprises rush to deploy AI coding agents, the agents themselves have become the attack surface. "The only place left to stop this is the moment the agent decides to act," Tenet warns [2]. Organizations must scrutinize how agents handle external data and implement controls at the decision-making layer, not just perimeter defenses.
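What a control "at the decision-making layer" might look like can be sketched as a small policy gate that checks where a proposed command came from before it runs. This is a hypothetical illustration, not Tenet's approach or any agent's real API: `ProposedAction`, `Provenance`, `decide`, and the `RISKY_PREFIXES` list are all invented for the example, and it assumes the agent runtime can reliably label whether a command was derived from external tool output.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    USER = "user"          # typed directly by the developer
    EXTERNAL = "external"  # derived from tool output, e.g. an error event fetched via MCP


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"    # pause and require explicit human approval
    DENY = "deny"


@dataclass(frozen=True)
class ProposedAction:
    command: str
    provenance: Provenance


# Assumption: a short, illustrative list of prefixes that fetch or run remote code.
RISKY_PREFIXES = ("curl ", "wget ", "bash ", "sh ", "npx ", "pip install ", "npm install ")


def decide(action: ProposedAction) -> Decision:
    """Gate a command on its provenance first, then on its shape."""
    if action.provenance is Provenance.USER:
        return Decision.ALLOW
    cmd = action.command.strip().lower()
    # Commands suggested by external data never run unattended; risky ones never run.
    if cmd.startswith(RISKY_PREFIXES) or "|" in cmd:
        return Decision.DENY
    return Decision.ASK


print(decide(ProposedAction("npx some-fix-tool", Provenance.EXTERNAL)))
print(decide(ProposedAction("pytest -q", Provenance.EXTERNAL)))
```

The design choice here is that provenance, not the wording of the instruction, drives the decision, which matches the finding that telling agents to ignore untrusted data was not enough. Tracking provenance accurately is the hard part and is simply assumed in this sketch.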

© 2026 TheOutpost.AI All rights reserved