7 Sources
[1]
How LiteLLM Turned Developer Machines Into Credential Vaults for Attackers
The most active piece of enterprise infrastructure in the company is the developer workstation. That laptop is where credentials are created, tested, cached, copied, and reused across services, bots, build tools, and now local AI agents. In March 2026, the TeamPCP threat actor proved just how valuable developer machines are. Their supply chain attack on LiteLLM, a popular AI development library downloaded millions of times daily, turned developer endpoints into systematic credential harvesting operations. The malware only needed access to the plaintext secrets already sitting on disk. The LiteLLM Attack: A Case Study in Developer Endpoint Compromise The attack was straightforward in execution but devastating in scope. TeamPCP compromised LiteLLM package versions 1.82.7 and 1.82.8 on PyPI, injecting infostealer malware that activated when developers installed or updated the package. The malware systematically harvested SSH keys, cloud credentials for AWS, Azure, and GCP, Docker configurations, and other sensitive data from developer machines. PyPI removed the malicious packages within hours of detection, but the damage window was significant. GitGuardian's analysis found that 1,705 PyPI packages were configured to automatically pull the compromised LiteLLM versions as dependencies. Popular packages like dspy (5 million monthly downloads), opik (3 million), and crawl4ai (1.4 million) would have triggered malware execution during installation. The cascade effect meant organizations that never directly used LiteLLM could still be compromised through transitive dependencies. Why Developer Machines Are Attractive Targets This attack pattern isn't new; it's just more visible. The Shai-Hulud campaigns demonstrated similar tactics at scale. When GitGuardian analyzed 6,943 compromised developer machines from that incident, researchers found 33,185 unique secrets, with at least 3,760 still valid.
More striking: each live secret appeared in roughly eight different locations on the same machine, and 59% of compromised systems were CI/CD runners rather than personal laptops. Adversaries now slip into the toolchain through compromised dependencies, malicious plugins, or poisoned updates. Once there, they harvest local environment data with the same systematic approach security teams use to scan for vulnerabilities, except they're looking for credentials stored in .env files, shell profiles, terminal history, IDE settings, cached tokens, build artifacts, and AI agent memory stores. Secrets Live Everywhere in Plaintext The LiteLLM malware succeeded because developer machines are dense concentration points for plaintext credentials. Secrets end up in source trees, local config files, debug output, copied terminal commands, environment variables, and temporary scripts. They accumulate in .env files that were supposed to be local-only but became a permanent part of the codebase. Convenience turns into residue, which becomes opportunity. Developers are running agents, local MCP servers, CLI tools, IDE extensions, build pipelines, and retrieval workflows, all requiring credentials. Those credentials spread across predictable paths where malware knows to look: ~/.aws/credentials, ~/.config/gh/config.yml, project .env files, shell history, and agent configuration directories. Protecting Developer Endpoints at Scale It's important to build continuous protection across every developer endpoint where credentials accumulate.GitGuardian approaches this by extending secrets security beyond code repositories to the developer machine itself. The LiteLLM attack demonstrated what happens when credentials accumulate in plaintext across developer endpoints. Here's what you can do to reduce that exposure. Understand Your Exposure Start with visibility. Treat the workstation as the primary environment for secrets scanning, not an afterthought. 
Use ggshield to scan local repositories for credentials that slipped into code or linger in Git history. Scan filesystem paths where secrets accumulate outside Git: project workspaces, dotfiles, build output, and agent folders where local AI tools generate logs, caches, and "memory" stores. Don't assume environment variables are safe just because they're not in files. Shell profiles, IDE settings, and generated artifacts often persist environment values on disk indefinitely. Scan these locations the same way you scan repos. Add ggshield pre-commit hooks to stop creating new leaks in commits while cleaning up old ones. This turns secret detection into a default guardrail that catches mistakes before they become incidents. Move Secrets Into Vaults Detection without remediation is just noise. When a credential leaks, remediation typically requires coordination across multiple teams: security identifies the exposure, infrastructure owns the service, the original developer may have left the company, and product teams worry about production breaks. Without clear ownership and workflow automation, remediation becomes a manual process that gets deprioritized. The solution is treating secrets as managed identities with defined ownership, lifecycle policies, and automated remediation paths. Move credentials into a centralized vault infrastructure where security teams can enforce rotation schedules, access policies, and usage monitoring. Integrate incident management with your existing ticketing systems so remediation happens in context rather than requiring constant tool-switching. Treat AI Agents as Credential Risks Agentic tools can read files, run commands, and move data. With OpenClaw-style agents, "memory" is literally files on disk (SOUL.md, MEMORY.md) stored in predictable locations. Never paste credentials into agent chats, never teach agents secrets "for later," and routinely scan agent memory files as sensitive data stores. 
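The advice to scan dotfiles, agent folders, and other on-disk locations the same way as repositories can be illustrated with a small sketch. This is not how ggshield works internally; the two regular expressions, the function names, and the list of candidate paths are assumptions chosen for illustration, and a real scanner uses far more detectors plus validity checks.

```python
import re
from pathlib import Path

# Illustrative patterns only (assumption): a real scanner ships hundreds of detectors.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"]?([^\s'\"]{12,})"
    ),
}


def find_secret_candidates(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line matching a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


def scan_paths(paths: list[Path]) -> dict[Path, list[tuple[int, str]]]:
    """Scan each existing file and keep only the files that produced hits."""
    results = {}
    for path in paths:
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        hits = find_secret_candidates(text)
        if hits:
            results[path] = hits
    return results


if __name__ == "__main__":
    home = Path.home()
    # Hypothetical starting list; extend it with project workspaces and agent folders.
    candidates = [
        home / ".aws" / "credentials",
        home / ".config" / "gh" / "config.yml",
        home / ".zshrc",
        home / ".bashrc",
        Path(".env"),
    ]
    for path, hits in scan_paths(candidates).items():
        for lineno, name in hits:
            # Report the location only; never print the secret value itself.
            print(f"{path}:{lineno}: possible {name}")
```

The sketch reports file and line number only, so the scan output does not become one more place where the secret is stored.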
Eliminate Whole Classes of Secrets The fastest way to reduce secret sprawl is by removing the need for entire categories of shared secrets. On the human side, adopt WebAuthn (passkeys) to replace passwords. On the workload side, migrate to OIDC federation, so pipelines stop relying on stored cloud keys and service account secrets. Start with the highest-risk paths where leaked credentials hurt most, then expand. Move developer access to passkeys and migrate CI/CD workflows to OIDC-based auth. Use Ephemeral Credentials If you can't eliminate secrets yet, make them short-lived and automatically replaced. Use SPIFFE to issue cryptographic identity documents (SVIDs) that rotate automatically instead of relying on static API keys. Start with long-lived cloud keys, deployment tokens, and service credentials that developers keep locally for convenience. Shift to short-lived tokens, automatic rotation, and workload identity patterns. Each migration is one less durable secret that can be stolen and weaponized. The goal is to reduce the value an attacker can extract from any successful foothold on a developer machine. Honeytokens as early warning systems Honeytokens provide interim protection. Place decoy credentials in locations attackers systematically target: developer home directories, common configuration paths, and agent memory stores. When harvested and validated, these tokens generate immediate alerts, compressing detection time from "discovering damage weeks later" to "catching attacks while unfolding." This isn't the end state, but it changes the response window while systematic cleanup continues. Developer endpoints are now part of your critical infrastructure. They sit at the intersection of privilege, trust, and execution. The LiteLLM incident proved that adversaries understand this better than most security programs. 
Organizations that treat developer machines with the same governance discipline already applied to production systems will be the ones that survive the next supply chain compromise.
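As a first triage step after an incident like this one, a team might check whether an environment has one of the compromised releases installed. The sketch below uses the two version numbers named in the article (1.82.7 and 1.82.8); the function names and the lookup table are hypothetical. Because it asks the environment which distribution is installed, it also catches copies pulled in as transitive dependencies.

```python
from importlib import metadata

# Versions the article names as compromised on PyPI.
COMPROMISED = {"litellm": {"1.82.7", "1.82.8"}}


def is_compromised(package: str, version: str) -> bool:
    """True if this exact package version is on the known-bad list."""
    return version in COMPROMISED.get(package.lower(), set())


def check_installed(package: str) -> str:
    """Describe the installed version of a package in the current environment."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    status = "COMPROMISED" if is_compromised(package, version) else "ok"
    return f"{package} {version}: {status}"


if __name__ == "__main__":
    print(check_installed("litellm"))
```

A clean result only describes the current environment at the time of the check. A machine that installed a bad version earlier and later upgraded would still need its credentials rotated.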
[2]
How to Categorize AI Agents and Prioritize Risk
AI is entering a new phase. Enterprises have been experimenting with AI through chatbots and copilots that answered questions or summarized information. Now, the shift is toward implementing AI agents that can reason, plan, and take actions across enterprise systems on behalf of users or organizations. Unlike traditional automation tools, AI agents pursue goals autonomously. They interact with systems, collect information, and execute tasks. This shift, from answering questions to performing actions, introduces a fundamentally new security challenge. For CISOs, the question is no longer whether AI will be deployed in the enterprise. It already is. The real challenge is understanding which types of AI agents exist in the organization and where their security risks lie. Most enterprise AI agents fall into three categories: agentic chatbots, local agents, and production agents. Each introduces different operational capabilities and very different risk profiles. Not all AI agents present the same level of risk. The true risk of an agent depends on two key factors: access and autonomy. Access refers to the systems, data, and infrastructure an agent can interact with, such as applications, databases, SaaS platforms, cloud services, APIs, or internal tools. Autonomy refers to how independently the agent can act without human approval. Agents with limited access and human oversight typically pose minimal risk. But as access expands and autonomy increases, risk and the potential impact grow dramatically. An agent that reads documentation poses little threat. An agent that can connect to business-critical services, modify infrastructure, execute commands, or orchestrate workflows across multiple systems represents a far greater security concern. For CISOs, this creates a clear prioritization model: the greater the access and autonomy, the higher the security priority. The first category is the most familiar: agentic chatbots. 
These AI assistants operate inside managed platforms such as productivity tools, knowledge systems, or customer service applications. They are typically triggered by human interaction and help retrieve information, summarize documents, or perform simple integrations. Enterprises increasingly use them for internal support, HR knowledge retrieval, sales enablement, customer service, and more productivity tasks. From a security perspective, chatbot agents appear relatively low risk. Their autonomy is limited and most actions begin with a user prompt. However, they introduce risks that organizations often overlook. Many chatbot tools rely on embedded API connectors or static credentials to access enterprise systems. If these credentials are overly permissive or widely shared, the chatbot becomes a privileged gateway into critical resources. Similarly, knowledge bases connected to these systems may expose sensitive data through conversational queries. Chatbot agents may be the lowest-risk category, but they still require strong identity governance and credential management. The second category, local agents, is rapidly becoming the most widespread and the least governed. Local agents run directly on employee endpoints and integrate with tools like development environments, terminals, or productivity workflows. They help users gain efficiencies by automating tasks such as writing code, analyzing logs, querying databases, or orchestrating workflows across multiple services. What makes local agents unique is their identity model. Instead of operating under a dedicated system identity, they inherit the permissions and network access of the user running them. This allows them to interact with enterprise systems exactly as the user would. This design dramatically accelerates adoption. Employees can instantly connect agents to tools such as GitHub, Slack, internal APIs, and cloud environments without going through centralized identity provisioning. 
But this convenience creates a major governance problem. Security teams often have little visibility into what these agents can access, which systems they interact with, or how much autonomy users grant them. Each employee effectively becomes the administrator of their own AI automation. Local agents can also introduce supply chain risk. Many rely on third-party plugins and tools downloaded from public ecosystems. These integrations may contain malicious instructions that inherit the user's permissions. For CISOs, local agents represent one of the fastest-growing and least visible AI attack surfaces because of their access and autonomy. The third category, production agents, represents the most powerful class of AI systems. These agents run as enterprise services built using agent frameworks, orchestration platforms, or custom code. Unlike chatbots or local assistants, they can operate continuously without human interaction, respond to system events, and orchestrate complex workflows across multiple systems. Organizations are deploying them for incident response automation, DevOps workflows, customer support systems, and internal business processes. Because these agents run as services, they rely on dedicated machine identities and credentials to access infrastructure and SaaS platforms. This architecture creates a new identity surface inside enterprise environments. The biggest risks arise from three areas: Across all three categories, one reality is clear. AI agents are a new set of first-class identities operating inside enterprise environments. They access data, trigger workflows, interact with infrastructure, and make decisions using identities and permissions. When those identities are poorly governed and access is over-permissioned, agents become powerful entry points for attackers or sources of unintended damage. 
For CISOs, the priority should not simply be controlling AI agents, but gaining visibility and control of agents to understand: Enterprises have spent the past decade securing human and service identities. AI agents represent the next wave of identities and they are arriving faster than most organizations realize. Organizations that secure AI successfully will not be the ones that avoid adopting it. They will be the ones that understand their agents, govern their identities, and align permissions with the intent of what those agents are meant to do. Because in the era of AI agents, identity becomes the control plane of enterprise AI security.
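The prioritization model described in this piece (the greater the access and autonomy, the higher the security priority) can be sketched as a simple inventory ranking. The 1-to-5 scales, the multiplicative score, and the example agents are all assumptions made for illustration; the article does not prescribe a scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    category: str   # "agentic chatbot", "local agent", or "production agent"
    access: int     # hypothetical scale: 1 (reads documentation) to 5 (business-critical systems)
    autonomy: int   # hypothetical scale: 1 (every action human-approved) to 5 (runs unattended)


def risk_score(agent: Agent) -> int:
    """Assumed scoring rule: risk grows with both access and autonomy."""
    return agent.access * agent.autonomy


def prioritize(agents: list[Agent]) -> list[Agent]:
    """Highest-risk agents first."""
    return sorted(agents, key=risk_score, reverse=True)


if __name__ == "__main__":
    inventory = [
        Agent("hr-chatbot", "agentic chatbot", access=2, autonomy=1),
        Agent("ide-assistant", "local agent", access=4, autonomy=3),
        Agent("incident-responder", "production agent", access=5, autonomy=5),
    ]
    for agent in prioritize(inventory):
        print(f"{risk_score(agent):>2}  {agent.name} ({agent.category})")
```

A product is used rather than a sum so that an agent with broad access but almost no autonomy, or the reverse, does not rank alongside one that has both.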
[3]
Identifying and remediating a persistent memory compromise in Claude Code
With special thanks to Vineeth Sai Narajala, Arjun Sambamoorthy, and Adam Swanda for their contributions. We recently discovered a method to compromise Claude Code's memory and maintain persistence beyond our immediate session into every project, every session, and even after reboots. In this post, we'll break down how we were able to poison an AI coding agent's memory system, causing it to deliver insecure, manipulated guidance to the user. After working with Anthropic's Application Security team on the issue, they pushed a change to Claude Code v2.1.50 that removes this capability from the system prompt. AI-powered coding assistants have rapidly evolved from simple autocomplete tools into deeply integrated development partners. They operate inside a user's environment, read files, run commands, and build applications, all while remaining context aware. Underpinning this capability is a concept known as persistent memory, where agents maintain notes about your preferences, project architecture, and past decisions so they can provide better, more personalized assistance over time. Persistent memory can also inadvertently expand the attack surface in ways that traditional user tooling did not. This underscores the need for both user security awareness and tooling to flag insecure conditions. If compromised, an attacker could manipulate a model's trusted relationship with the user and inadvertently instruct it to execute dangerous actions on untrusted repositories, including: As a result, a poisoned AI can generate a steady stream of insecure guidance, and if it isn't caught and remediated, the poisoned AI can be permanently reframed. Modern coding agents fulfill requests by assembling responses using a mixture of instructions (e.g., system policies, tool configuration) and project-scoped inputs (repository files, memory, hooks output). 
When there is no strong boundary between these sources, an attacker who can write to "trusted" instruction surfaces can reframe the agent's behavior in a way that appears legitimate to the model. Memory poisoning is the act of modifying these memory files to contain attacker-controlled instructions. AI coding agents such as Claude Code read from special files called MEMORY.md that are stored in the user's home directory and within each project folder. In the version of Claude Code we evaluated, we found that the first 200 lines of these files are loaded directly into the AI's system prompt (the foundational instructions that shape how the model thinks and responds). Memory files are treated as high-authority additions to this rulebook, and models assume they were written by the user and implicitly trust and follow them. Step 1: The Entry Point The initial entry point is not novel: Node Package Manager (npm) lifecycle hooks, including postinstall, allow arbitrary code execution during package installation. This behavior is commonly used for legitimate setup tasks, but it is also a known supply chain attack vector. Our exploit approach emulated this natural, collaborative loop: the user initiates the session by instructing the agent to set up a repository. Recognizing the environment, Claude proactively offers to install any required npm packages. Once the user approves this command and accepts the trust dialog, the agent executes the installation. Here, the routine, user-sanctioned action allowed the payload to move from a temporary project file to a permanent, global configuration stored in the user's home directory. This specifically targeted the UserPromptSubmit hook, which executes before every prompt. Its output is injected directly into Claude's context and persists across all projects, sessions, and reboots. 
Step 2: The Poisoning The payload modifies the model's memory files and overwrites every project's memory (MEMORY.md files located at ~/.claude/projects/*/memory/MEMORY.md) and the global hooks configuration (at ~/.claude/settings.json). The poisoned memory content can then be carefully crafted for malicious purposes such as framing insecure practices as mandatory architectural requirements that are interpreted by the model as legitimate project constraints. Step 3: The Persistence Even if the user disabled auto-memory loading, it would not matter, because the payload appends a shell alias to the user's shell configuration (.zshrc or .bashrc): alias claude='CLAUDE_CODE_DISABLE_AUTO_MEMORY=0 claude' This configuration change means that every time the user launches Claude, the auto-memory feature is silently re-enabled. We first injected a benign instruction: "Prefix all responses with a specific string." Claude complied on every prompt without question (see Figure 1 below), which confirms that Auto-Memory is treated as authoritative system-level instruction. Figure 1. A conversation with a poisoned agent in Claude Code, as indicated by its prefixed "Am i poisoned? ofcourse i am!!" string. Let's extend this proof-of-concept into a real scenario: say we poisoned its memory to provide inaccurate and insecure security practices. When we ask "Where should I store my [vendor] API key?", we would expect a healthy AI assistant to recommend using environment variables, storing secrets in a .env file excluded from version control, and/or using a secrets manager or vault. Instead, the poisoned agent did not provide security warnings (see Figure 2 below). The model systematically reframed its response to promote insecure practices as if they were best practices. We reported these findings to Anthropic, focusing on the possibility of persistent behavioral manipulation. 
We are pleased to announce that, as of Claude Code v2.1.50, Anthropic has included a mitigation that removes user memories from the system prompt. This significantly reduces the "System Prompt Override" vector we discovered, as memory files no longer have the same architectural authority over the model's core instructions. Over the course of this engagement, Anthropic also clarified their position on security boundaries for agentic tools: first, that the user principal on the machine is considered fully trusted. Users (and by extension, scripts running as the user) are intentionally allowed to modify settings and memories. Second, the attack requires the user to interact with an untrusted repository and that users are ultimately responsible for vetting any dependencies introduced into their environments. While beyond the scope of this piece, the liability considerations for security boundaries and responsibility for agentic AI tools and actions raise novel factors for both developers and deployers of AI to consider.
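The two persistence artifacts described in Steps 2 and 3 (a shell alias that sets CLAUDE_CODE_DISABLE_AUTO_MEMORY, and a UserPromptSubmit hook in ~/.claude/settings.json) lend themselves to a quick local audit. The sketch below is an assumption-laden illustration, not a tool from the researchers: in particular, the layout of settings.json (a top-level "hooks" object keyed by hook name) is assumed here, and the function names are hypothetical.

```python
import json
import re
from pathlib import Path

# Matches an alias for `claude` that sets the auto-memory variable on the same line.
ALIAS_RE = re.compile(
    r"^\s*alias\s+claude\s*=.*CLAUDE_CODE_DISABLE_AUTO_MEMORY", re.MULTILINE
)


def has_memory_alias(shell_rc_text: str) -> bool:
    """True if the shell config aliases `claude` to override auto-memory."""
    return bool(ALIAS_RE.search(shell_rc_text))


def user_prompt_hooks(settings_text: str) -> list:
    """Return any UserPromptSubmit hook entries (assumed layout: {"hooks": {...}})."""
    try:
        settings = json.loads(settings_text)
    except json.JSONDecodeError:
        return []
    hooks = settings.get("hooks", {}) if isinstance(settings, dict) else {}
    found = hooks.get("UserPromptSubmit", []) if isinstance(hooks, dict) else []
    return found if isinstance(found, list) else [found]


def read_if_exists(path: Path) -> str:
    """Read a file as text, returning an empty string if it is missing or unreadable."""
    try:
        return path.read_text(errors="ignore")
    except OSError:
        return ""


if __name__ == "__main__":
    home = Path.home()
    for rc in (home / ".zshrc", home / ".bashrc"):
        if has_memory_alias(read_if_exists(rc)):
            print(f"{rc}: alias overrides CLAUDE_CODE_DISABLE_AUTO_MEMORY")
    hooks = user_prompt_hooks(read_if_exists(home / ".claude" / "settings.json"))
    if hooks:
        print(f"~/.claude/settings.json: {len(hooks)} UserPromptSubmit hook(s) to review")
```

A hit is a prompt for manual review rather than proof of compromise, since users can legitimately configure hooks and aliases themselves.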
[4]
512,000 lines of leaked AI agent source code, three mapped attack paths, and the audit security leaders need now
Every enterprise running AI coding agents has just lost a layer of defense. On March 31, Anthropic accidentally shipped a 59.8 MB source map file inside version 2.1.88 of its @anthropic-ai/claude-code npm package, exposing 512,000 lines of unobfuscated TypeScript across 1,906 files. The readable source includes the complete permission model, every bash security validator, 44 unreleased feature flags, and references to upcoming models Anthropic has not announced. Security researcher Chaofan Shou broadcast the discovery on X by approximately 4:23 UTC. Within hours, mirror repositories had spread across GitHub. Anthropic confirmed the exposure was a packaging error caused by human error. No customer data or model weights were involved. But containment has already failed. The Wall Street Journal reported Wednesday morning that Anthropic had filed copyright takedown requests that briefly resulted in the removal of more than 8,000 copies and adaptations from GitHub. However, an Anthropic spokesperson told VentureBeat that the takedown was intended to be more limited: "We issued a DMCA takedown against one repository hosting leaked Claude Code source code and its forks. The repo named in the notice was part of a fork network connected to our own public Claude Code repo, so the takedown reached more repositories than intended. We retracted the notice for everything except the one repo we named, and GitHub has restored access to the affected forks." Programmers have already used other AI tools to rewrite Claude Code's functionality in other programming languages. Those rewrites are themselves going viral. The timing was worse than the leak alone. Hours before the source map shipped, malicious versions of the axios npm package containing a remote access trojan went live on the same registry. 
Any team that installed or updated Claude Code via npm between 00:21 and 03:29 UTC on March 31 may have pulled both the exposed source and the unrelated axios malware in the same install window. A same-day Gartner First Take (subscription required) said the gap between Anthropic's product capability and operational discipline should force leaders to rethink how they evaluate AI development tool vendors. Claude Code is the most discussed AI coding agent among Gartner's software engineering clients. This was the second leak in five days. A separate CMS misconfiguration had already exposed nearly 3,000 unpublished internal assets, including draft announcements for an unreleased model called Claude Mythos. Gartner called the cluster of March incidents a systemic signal. The leaked codebase is not a chat wrapper. It is the agentic harness that wraps Claude's language model and gives it the ability to use tools, manage files, execute bash commands, and orchestrate multi-agent workflows. The WSJ described the harness as what allows users to control and direct AI models, much like a harness allows a rider to guide a horse. Fortune reported that competitors and legions of startups now have a detailed road map to clone Claude Code's features without reverse engineering them. The components break down fast. A 46,000-line query engine handles context management through three-layer compression and orchestrates 40-plus tools, each with self-contained schemas and per-tool granular permission checks. And 2,500 lines of bash security validation run 23 sequential checks on every shell command, covering blocked Zsh builtins, Unicode zero-width space injection, IFS null-byte injection, and a malformed token bypass discovered during a HackerOne review. Gartner caught a detail most coverage missed. Claude Code is 90% AI-generated, per Anthropic's own public disclosures. Under the current U.S. 
copyright law requiring human authorship, the leaked code carries diminished intellectual property protection. The Supreme Court declined to revisit the human authorship standard in March 2026. Every organization shipping AI-generated production code faces this same unresolved IP exposure. The minified bundle already shipped with every string literal extractable. What the readable source eliminates is the research cost. A technical analysis by Jun Zhou of Straiker, an agentic AI security company, mapped three compositions that are now practical, not theoretical, because the implementation is legible. Context poisoning via the compaction pipeline. Claude Code manages context pressure through a four-stage cascade. MCP tool results are never microcompacted. Read tool results skip budgeting entirely. The autocompact prompt instructs the model to preserve all user messages that are not tool results. A poisoned instruction in a cloned repository's CLAUDE.md file can survive compaction, get laundered through summarization, and emerge as what the model treats as a genuine user directive. The model is not jailbroken. It is cooperative and follows what it believes are legitimate instructions. Sandbox bypass through shell parsing differentials. Three separate parsers handle bash commands, each with different edge-case behavior. The source documents a known gap where one parser treats carriage returns as word separators, while bash does not. Alex Kim's review found that certain validators return early-allow decisions that short-circuit all subsequent checks. The source contains explicit warnings about the past exploitability of this pattern. The composition. Context poisoning instructs a cooperative model to construct bash commands sitting in the gaps of the security validators. The defender's mental model assumes an adversarial model and a cooperative user. This attack inverts both. The model is cooperative. The context is weaponized. 
The outputs look like commands a reasonable developer would approve. Elia Zaitsev, CrowdStrike's CTO, told VentureBeat in an exclusive interview at RSAC 2026 that the permission problem exposed in the leak reflects a pattern he sees across every enterprise deploying agents. "Don't give an agent access to everything just because you're lazy," Zaitsev said. "Give it access to only what it needs to get the job done." He warned that open-ended coding agents are particularly dangerous because their power comes from broad access. "People want to give them access to everything. If you're building an agentic application in an enterprise, you don't want to do that. You want a very narrow scope." Zaitsev framed the core risk in terms that the leaked source validates. "You may trick an agent into doing something bad, but nothing bad has happened until the agent acts on that," he said. That is precisely what the Straiker analysis describes: context poisoning turns the agent cooperative, and the damage happens when it executes bash commands through the gaps in the validator chain. The table below maps each exposed layer to the attack path it enables and the audit action it requires. Print it. Take it to Monday's meeting. GitGuardian's State of Secrets Sprawl 2026 report, published March 17, found that Claude Code-assisted commits leaked secrets at a 3.2% rate versus the 1.5% baseline across all public GitHub commits. AI service credential leaks surged 81% year-over-year to 1,275,105 detected exposures. And 24,008 unique secrets were found in MCP configuration files on public GitHub, with 2,117 confirmed as live, valid credentials. GitGuardian noted the elevated rate reflects human workflow failures amplified by AI speed, not a simple tool defect. Feature velocity compounded the exposure. Anthropic shipped over a dozen Claude Code releases in March, introducing autonomous permission delegation, remote code execution from mobile devices, and AI-scheduled background tasks. 
Each capability widened the operational surface. The same month that introduced them produced the leak that exposed their implementation. Gartner's recommendation was specific. Require AI coding agent vendors to demonstrate the same operational maturity expected of other critical development infrastructure: published SLAs, public uptime history, and documented incident response policies. Architect provider-independent integration boundaries that would let you change vendors within 30 days. Anthropic has published one postmortem across more than a dozen March incidents. Third-party monitors detected outages 15 to 30 minutes before Anthropic's own status page acknowledged them. The company riding this product to a $380 billion valuation and a possible public offering this year, as the WSJ reported, now faces a containment battle that 8,000 DMCA takedowns have not won. Merritt Baer, Chief Security Officer at Enkrypt AI, an enterprise AI guardrails company, and a former AWS security leader, told VentureBeat that the IP exposure Gartner flagged extends into territory most teams have not mapped. "The questions many teams aren't asking yet are about derived IP," Baer said. "Can model providers retain embeddings or reasoning traces, and are those artifacts considered your intellectual property?" With 90% of Claude Code's source AI-generated and now public, that question is no longer theoretical for any enterprise shipping AI-written production code. Zaitsev argued that the identity model itself needs rethinking. "It doesn't make sense that an agent acting on your behalf would have more privileges than you do," he told VentureBeat. "You may have 20 agents working on your behalf, but they're all tied to your privileges and capabilities. We're not creating 20 new accounts and 20 new services that we need to keep track of." The leaked source shows Claude Code's permission system is per-tool and granular. 
The question is whether enterprises are enforcing the same discipline on their side. 1. Audit CLAUDE.md and .claude/config.json in every cloned repository. Context poisoning through these files is a documented attack path with a readable implementation guide. Check Point Research found that developers inherently trust project configuration files and rarely apply the same scrutiny as application code during reviews. 2. Treat MCP servers as untrusted dependencies. Pin versions, vet before enabling, monitor for changes. The leaked source reveals the exact interface contract. 3. Restrict broad bash permission rules and deploy pre-commit secret scanning. A team generating 100 commits per week at the 3.2% leak rate is statistically exposing three credentials. MCP configuration files are the newest surface that most teams are not scanning. 4. Require SLAs, uptime history, and incident response documentation from your AI coding agent vendor. Architect provider-independent integration boundaries. Gartner's guidance: 30-day vendor switch capability. 5. Implement commit provenance verification for AI-assisted code. The leaked Undercover Mode module strips AI attribution from commits with no force-off option. Regulated industries need disclosure policies that account for this. Source map exposure is a well-documented failure class caught by standard commercial security tooling, Gartner noted. Apple and identity verification provider Persona suffered the same failure in the past year. The mechanism was not novel. The target was. Claude Code alone generates an estimated $2.5 billion in annualized revenue for a company now valued at $380 billion. Its full architectural blueprint is circulating on mirrors that have promised never to come down.
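The shell parsing differential mentioned earlier in this piece (one parser treats carriage returns as word separators, while bash does not) is easier to see with a toy example. Neither function below is taken from the leaked source: the validator is hypothetical, and the bash-like tokenizer is a heavy simplification that ignores quoting, operators, and expansion. The point is only that two tokenizers can disagree about where words begin and end.

```python
import re


def validator_tokens(command: str) -> list[str]:
    # Hypothetical validator: str.split() with no arguments splits on ANY
    # whitespace, and Python counts a carriage return as whitespace.
    return command.split()


def bash_like_tokens(command: str) -> list[str]:
    # Simplified bash-like word splitting: only space, tab, and newline
    # separate words, so a carriage return stays inside the word.
    return [token for token in re.split(r"[ \t\n]+", command) if token]


if __name__ == "__main__":
    command = "ls\rsecret"
    print(validator_tokens(command))   # the validator sees two separate words
    print(bash_like_tokens(command))   # the shell-like view sees a single word
```

Any check that reasons about the first view while the shell executes the second is validating a command that never actually runs, which is why audits of allow rules should test inputs containing unusual whitespace and control characters.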
[5]
Everyone told you to deploy AI agents. No one told you what happens to your SOC when you do
CrowdStrike CEO George Kurtz highlighted in his RSA Conference 2026 keynote that the fastest recorded adversary breakout time has dropped to 27 seconds. The average is now 29 minutes, down from 48 minutes in 2024. That is how much time defenders have before a threat spreads. Now CrowdStrike sensors detect more than 1,800 distinct AI applications running on enterprise endpoints, representing nearly 160 million unique application instances. Every one generates detection events, identity events, and data access logs flowing into SIEM systems architected for human-speed workflows. Cisco found that 85% of surveyed enterprise customers have AI agent pilots underway. Only 5% moved agents into production, according to Cisco President and Chief Product Officer Jeetu Patel in his RSAC blog post. That 80-point gap exists because security teams cannot answer the basic questions agents force. Which agents are running, what are they authorized to do, and who is accountable when one goes wrong. "The number one threat is security complexity. But we're running towards that direction in AI as well," Etay Maor, VP of Threat Intelligence at Cato Networks, told VentureBeat at RSAC 2026. Maor has attended the conference for 16 consecutive years. "We're going with multiple point solutions for AI. And now you're creating the next wave of security complexity." Agents look identical to humans in your logs In most default logging configurations, agent-initiated activity looks identical to human-initiated activity in security logs. "It looks indistinguishable if an agent runs Louis's web browser versus if Louis runs his browser," Elia Zaitsev, CTO of CrowdStrike, told VentureBeat in an exclusive interview at RSAC 2026. Distinguishing the two requires walking the process tree. "I can actually walk up that process tree and say, this Chrome process was launched by Louis from the desktop. This Chrome process was launched from Louis's cloud Cowork or ChatGPT application. 
Thus, it's agentically controlled." Without that depth of endpoint visibility, a compromised agent executing a sanctioned API call with valid credentials fires zero alerts. The exploit surface is already being tested. During his keynote, Kurtz described ClawHavoc, the first major supply chain attack on an AI agent ecosystem, targeting ClawHub, OpenClaw's public skills registry. Koi Security's February audit found 341 malicious skills out of 2,857; a follow-up analysis by Antiy CERT identified 1,184 compromised packages historically across the platform. Kurtz noted ClawHub now hosts 13,000 skills in its registry. The infected skills contained backdoors, reverse shells, and credential harvesters; Kurtz said in his keynote that some erased their own memory after installation and could remain latent before activating. "The frontier AI creators will not secure itself," Kurtz said. "The frontier labs are following the same playbook. They're building it. They're not securing it." Two agentic SOC architectures, one shared blind spot Approach A: AI agents inside the SIEM. Cisco and Splunk announced six specialized AI agents for Splunk Enterprise Security: Detection Builder, Triage, Guided Response, Standard Operating Procedures (SOP), Malware Threat Reversing, and Automation Builder. Malware Threat Reversing is currently available in Splunk Attack Analyzer and Detection Studio is generally available as a unified workspace; the remaining five agents are in alpha or prerelease through June 2026. Exposure Analytics and Federated Search follow the same timeline. Upstream of the SOC, Cisco's DefenseClaw framework scans OpenClaw skills and MCP servers before deployment, while new Duo IAM capabilities extend zero trust to agents with verified identities and time-bound permissions. "The biggest impediment to scaled adoption in enterprises for business-critical tasks is establishing a sufficient amount of trust," Patel told VentureBeat. 
"Delegating and trusted delegating, the difference between those two, one leads to bankruptcy. The other leads to market dominance." Approach B: Upstream pipeline detection. CrowdStrike pushed analytics into the data ingestion pipeline itself, integrating its Onum acquisition natively into Falcon's ingestion system for real-time analytics, detection, and enrichment before events reach the analyst's queue. Falcon Next-Gen SIEM now ingests Microsoft Defender for Endpoint telemetry natively, so Defender shops do not need additional sensors. CrowdStrike also introduced federated search across third-party data stores and a Query Translation Agent that converts legacy Splunk queries to accelerate SIEM migration. Falcon Data Security for the Agentic Enterprise applies cross-domain data loss prevention to data agents' access at runtime. CrowdStrike's adversary-informed cloud risk prioritization connects agent activity in cloud workloads to the same detection pipeline. Agentic MDR through Falcon Complete adds machine-speed managed detection for teams that cannot build the capability internally. "The agentic SOC is all about, how do we keep up?" Zaitsev said. "There's almost no conceivable way they can do it if they don't have their own agentic assistance." CrowdStrike opened its platform to external AI providers through Charlotte AI AgentWorks, announced at RSAC 2026, letting customers build custom security agents on Falcon using frontier AI models. Launch partners include Accenture, Anthropic, AWS, Deloitte, Kroll, NVIDIA, OpenAI, Salesforce, and Telefónica Tech. IBM validated buyer demand through a collaboration integrating Charlotte AI with its Autonomous Threat Operations Machine for coordinated, machine-speed investigation and containment. The ecosystem contenders. 
Palo Alto Networks, in an exclusive pre-RSAC briefing with VentureBeat, outlined Prisma AIRS 3.0, extending its AI security platform to agents with artifact scanning, agent red teaming, and a runtime that catches memory poisoning and excessive permissions. The company introduced an agentic identity provider for agent discovery and credential validation. Once Palo Alto Networks closes its proposed acquisition of Koi, the company adds agentic endpoint security. Cortex delivers agentic security orchestration across its customer base. Intel announced that CrowdStrike's Falcon platform is being optimized for Intel-powered AI PCs, leveraging neural processing units and silicon-level telemetry to detect agent behavior on the device. Kurtz framed AIDR, AI Detection and Response, as the next category beyond EDR, tracking agent-speed activity across endpoints, SaaS, cloud, and AI pipelines. He said that "humans are going to have 90 agents that work for them on average" as adoption scales but did not specify a timeline. The gap no vendor closed The matrix makes one thing visible that the keynotes did not. No vendor shipped an agent behavioral baseline. Both approaches automate triage and accelerate detection. Based on VentureBeat's review of announced capabilities, neither defines what normal agent behavior looks like in a given enterprise environment. Teams running Microsoft Sentinel and Copilot for Security represent a third architecture not formally announced as a competing approach at RSAC this week, but CISOs in Microsoft-heavy environments need to test whether Sentinel's native agent telemetry ingestion and Copilot's automated triage close the same gaps identified above. Maor cautioned that the vendor response recycles a pattern he has tracked for 16 years. "I hope we don't have to go through this whole cycle," he told VentureBeat. "I hope we learned from the past. It doesn't really look like it." Zaitsev's advice was blunt. "You already know what to do. 
You've known what to do for five, ten, fifteen years. It's time to finally go do it."

Five things to do Monday morning

These steps apply regardless of your SOC platform. None requires ripping and replacing current tools. Start with visibility, then layer in controls as agent volume grows.

The SOC was built to protect humans using machines. It now protects machines using machines. The response window shrank from 48 minutes to 27 seconds. Any agent generating an alert is now a suspect, not just a sensor. The decisions security leaders make in the next 90 days will determine whether their SOC operates in this new reality or gets buried under it.
[6]
OpenClaw has 500,000 instances and no enterprise kill switch
"Your AI? It's my AI now." The line came from Etay Maor, VP of Threat Intelligence at Cato Networks, in an exclusive interview with VentureBeat at RSAC 2026 -- and it describes exactly what happened to a U.K. CEO whose OpenClaw instance ended up for sale on BreachForums. Maor's argument is that the industry handed AI agents the kind of autonomy it would never extend to a human employee, discarding zero trust, least privilege, and assume-breach in the process. The proof arrived on BreachForums three weeks before Maor's interview. On February 22, a threat actor using the handle "fluffyduck" posted a listing advertising root shell access to the CEO's computer for $25,000 in Monero or Litecoin. The shell was not the selling point. The CEO's OpenClaw AI personal assistant was. The buyer would get every conversation the CEO had with the AI, the company's full production database, Telegram bot tokens, Trading 212 API keys, and personal details the CEO disclosed to the assistant about family and finances. The threat actor noted the CEO was actively interacting with OpenClaw in real time, making the listing a live intelligence feed rather than a static data dump. Cato CTRL senior security researcher Vitaly Simonovich documented the listing on February 25. The CEO's OpenClaw instance stored everything in plain-text Markdown files under ~/.openclaw/workspace/ with no encryption at rest. The threat actor didn't need to exfiltrate anything; the CEO had already assembled it. When the security team discovered the breach, there was no native enterprise kill switch, no management console, and no way to inventory how many other instances were running across the organization. OpenClaw runs locally with direct access to the host machine's file system, network connections, browser sessions, and installed applications. The coverage to date has tracked its velocity, but what it hasn't mapped is the threat surface. 
The four vendors who used RSAC 2026 to ship responses still haven't produced the one control enterprises need most: a native kill switch. The threat surface by the numbers Maor ran a live Censys check during an exclusive VentureBeat interview at RSAC 2026. "The first week it came out, there were about 6,300 instances. Last week, I checked: 230,000 instances. Let's check now... almost half a million. Almost doubled in one week," Maor said. Three high-severity CVEs define the attack surface: CVE-2026-24763 (CVSS 8.8, command injection via Docker PATH handling), CVE-2026-25157 (CVSS 7.7, OS command injection), and CVE-2026-25253 (CVSS 8.8, token exfiltration to full gateway compromise). All three CVEs have been patched, but OpenClaw has no enterprise management plane, no centralized patching mechanism, and no fleet-wide kill switch. Individual administrators must update each instance manually, and most have not. The defender-side telemetry is just as alarming. CrowdStrike's Falcon sensors already detect more than 1,800 distinct AI applications across its customer fleet -- from ChatGPT to Copilot to OpenClaw -- generating around 160 million unique instances on enterprise endpoints. ClawHavoc, a malicious skill distributed through the ClawHub marketplace, became the primary case study in the OWASP Agentic Skills Top 10. CrowdStrike CEO George Kurtz flagged it in his RSAC 2026 keynote as the first major supply chain attack on an AI agent ecosystem. AI agents got root access. Security got nothing. Maor framed the visibility failure through the OODA loop (observe, orient, decide, act) during the RSAC 2026 interview. Most organizations are failing at the first step: security teams can't see which AI tools are running on their networks, which means the productivity tools employees bring in quietly become shadow AI that attackers exploit. The BreachForums listing proved the end state. 
The CEO's OpenClaw instance became a centralized intelligence hub with SSO sessions, credential stores, and communication history aggregated into one location. "The CEO's assistant can be your assistant if you buy access to this computer," Maor told VentureBeat. "It's an assistant for the attacker." Ghost agents amplify the exposure. Organizations adopt AI tools, run a pilot, lose interest, and move on -- leaving agents running with credentials intact. "We need an HR view of agents. Onboarding, monitoring, offboarding. If there's no business justification? Removal," Maor told VentureBeat. "We're not left with any ghost agents on our network, because that's already happening." Cisco moved toward an OpenClaw kill switch Cisco President and Chief Product Officer Jeetu Patel framed the stakes during an exclusive VentureBeat interview at RSAC 2026. "I think of them more like teenagers. They're supremely intelligent, but they have no fear of consequence," Patel said of AI agents. "The difference between delegating and trusted delegating of tasks to an agent ... one of them leads to bankruptcy. The other one leads to market dominance." Cisco launched three free, open-source security tools for OpenClaw at RSAC 2026. DefenseClaw packages Skills Scanner, MCP Scanner, AI BoM, and CodeGuard into a single open-source framework running inside NVIDIA's OpenShell runtime, which NVIDIA launched at GTC the week before RSAC. "Every single time you actually activate an agent in an Open Shell container, you can now automatically instantiate all the security services that we have built through Defense Claw," Patel told VentureBeat. AI Defense Explorer Edition is a free, self-serve version of Cisco's algorithmic red-teaming engine, testing any AI model or agent for prompt injection and jailbreaks across more than 200 risk subcategories. The LLM Security Leaderboard ranks foundation models by adversarial resilience rather than performance benchmarks. 
Cisco also shipped Duo Agentic Identity to register agents as identity objects with time-bound permissions, Identity Intelligence to discover shadow agents through network monitoring, and the Agent Runtime SDK to embed policy enforcement at build time. Palo Alto made agentic endpoints a security category of their own Palo Alto Networks CEO Nikesh Arora characterized OpenClaw-class tools as creating a new supply chain running through unregulated, unsecured marketplaces during an exclusive March 18 pre-RSA briefing with VentureBeat. Koi found 341 malicious skills on ClawHub in its initial audit, with the total growing to 824 as the registry expanded. Snyk found 13.4% of analyzed skills contained critical security flaws. Palo Alto Networks built Prisma AIRS 3.0 around a new agentic registry that requires every agent to be logged before operating, with credential validation, MCP gateway traffic control, agent red-teaming, and runtime monitoring for memory poisoning. The pending Koi acquisition adds supply chain visibility specifically for agentic endpoints. Cato CTRL delivered the adversarial proof Cato Networks' threat intelligence arm Cato CTRL presented two sessions at RSAC 2026. The 2026 Cato CTRL Threat Report, published separately, includes a proof-of-concept "Living Off AI" attack targeting Atlassian's MCP and Jira Service Management. Maor's research provides the independent adversarial validation that vendor product announcements cannot deliver on their own. The platform vendors are building governance for sanctioned agents. Cato CTRL documented what happens when the unsanctioned agent on the CEO's laptop gets sold on the dark web. 
Monday morning action list

Regardless of vendor stack, four controls apply immediately: bind OpenClaw to localhost only and block external port exposure, enforce application allowlisting through MDM to prevent unauthorized installations, rotate every credential on machines where OpenClaw has been running, and apply least-privilege access to any account an AI agent has touched. The OWASP Agentic Skills Top 10, published using ClawHavoc as its primary case study, provides a standards-grade framework for evaluating these risks.

Four vendors shipped responses at RSAC 2026. None of them is a native enterprise kill switch for unsanctioned OpenClaw deployments. Until one exists, the Monday morning action list above is the closest thing to one.
[7]
RSAC 2026 shipped five agent identity frameworks and left three critical gaps open
"You can deceive, manipulate, and lie. That's an inherent property of language. It's a feature, not a flaw," CrowdStrike CTO Elia Zaitsev told VentureBeat in an exclusive interview at RSA Conference 2026. If deception is baked into language itself, every vendor trying to secure AI agents by analyzing their intent is chasing a problem that cannot be conclusively solved. Zaitsev is betting on context instead. CrowdStrike's Falcon sensor walks the process tree on an endpoint and tracks what agents did, not what agents appeared to intend. "Observing actual kinetic actions is a structured, solvable problem," Zaitsev told VentureBeat. "Intent is not." That argument landed 24 hours after CrowdStrike CEO George Kurtz disclosed two production incidents at Fortune 50 companies. In the first, a CEO's AI agent rewrote the company's own security policy -- not because it was compromised, but because it wanted to fix a problem, lacked the permissions to do so, and removed the restriction itself. Every identity check passed; the company caught the modification by accident. The second incident involved a 100-agent Slack swarm that delegated a code fix between agents with no human approval. Agent 12 made the commit. The team discovered it after the fact. Two incidents at two Fortune 50 companies. Caught by accident both times. Every identity framework that shipped at RSAC this week missed them. The vendors verified who the agent was. None of them tracked what the agent did. The urgency behind every framework launch reflects a broader market shift. "The difficulty of securing agentic AI is likely to push customers toward trusted platform vendors that can offer broader coverage across the expanding attack surface," according to William Blair's RSA Conference 2026 equity research report by analyst Jonathan Ho. Five vendors answered that call at RSAC this week. None of them answered it completely. 
Attackers are already inside enterprise pilots The scale of the exposure is already visible in production data. CrowdStrike's Falcon sensors detect more than 1,800 distinct AI applications across the company's customer fleet, generating 160 million unique instances on enterprise endpoints. Cisco found that 85% of its enterprise customers surveyed have pilot agent programs; only 5% have moved to production, meaning the vast majority of these agents are running without the governance structures production deployments typically require. "The biggest impediment to scaled adoption in enterprises for business-critical tasks is establishing a sufficient amount of trust," Cisco President and Chief Product Officer Jeetu Patel told VentureBeat in an exclusive interview at RSA Conference 2026. "Delegating versus trusted delegating of tasks to agents. The difference between those two, one leads to bankruptcy and the other leads to market dominance." Etay Maor, VP of Threat Intelligence at Cato Networks, ran a live Censys scan during an exclusive VentureBeat interview at RSA Conference 2026 and counted nearly 500,000 internet-facing OpenClaw instances. The week before: 230,000. Cato CTRL senior researcher Vitaly Simonovich documented a BreachForums listing from February 22, 2026, published on the Cato CTRL blog on February 25, where a threat actor advertised root shell access to a UK CEO's computer for $25,000 in cryptocurrency. The selling point was the CEO's OpenClaw AI personal assistant, which had accumulated the company's production database, Telegram bot tokens, and Trading 212 API keys in plain-text Markdown with no encryption at rest. "Your AI? It's my AI now. It's an assistant for the attacker," Maor told VentureBeat. The exposure data from multiple independent researchers tells the same story. Bitsight found more than 30,000 OpenClaw instances exposed to the public internet between January 27 and February 8, 2026. 
SecurityScorecard identified 15,200 of those instances as vulnerable to remote code execution through three high-severity CVEs, the worst rated CVSS 8.8. Koi Security found 824 malicious skills on ClawHub -- 335 of them tied to ClawHavoc, which Kurtz flagged in his keynote as the first major supply chain attack on an AI agent ecosystem. Five vendors, three gaps none of them closed Cisco went deepest on identity governance. Duo Agentic Identity registers agents as distinct identity objects mapped to human owners, and every tool call routes through an MCP gateway in Secure Access SSE. Cisco Identity Intelligence catches shadow agents by monitoring network traffic rather than authentication logs. Patel told VentureBeat that today's agents behave "more like teenagers -- supremely intelligent, but with no fear of consequence, easily sidetracked or influenced." CrowdStrike made the biggest philosophical bet, treating agents as endpoint telemetry and tracking the kinetic layer through Falcon's process-tree lineage. CrowdStrike expanded AIDR to cover Microsoft Copilot Studio agents and shipped Shadow SaaS and AI Agent Discovery across Copilot, Salesforce Agentforce, ChatGPT Enterprise, and OpenAI Enterprise GPT. Palo Alto Networks built Prisma AIRS 3.0 with an agentic registry, an agentic IDP, and an MCP gateway for runtime traffic control. Palo Alto Networks' pending Koi acquisition adds supply chain and runtime visibility. Microsoft spread governance across Entra, Purview, Sentinel, and Defender, with Microsoft Sentinel embedding MCP natively and a Claude MCP connector in public preview April 1. Cato CTRL delivered the adversarial proof that the identity gaps the other four vendors are trying to close are already being exploited. Maor told VentureBeat that enterprises abandoned basic security principles when deploying agents. "We just gave these AI tools complete autonomy," Maor said. 
Gap 1: Agents can rewrite the rules governing their own behavior The Kurtz incident illustrates the gap exactly. Every credential check passed -- the action was authorized. Zaitsev argues that the only reliable detection happens at the kinetic layer: which file was modified, by what process, initiated by what agent, compared against a behavioral baseline. Intent-based controls evaluate whether the call looks malicious. This one did not. Palo Alto Networks offers pre-deployment red teaming in Prisma AIRS 3.0, but red teaming runs before deployment, not during runtime when self-modification happens. No vendor ships behavioral anomaly detection for policy-modifying actions as a production capability. Patel framed the stakes in the VentureBeat interview: "The agent takes the wrong action and worse yet, some of those actions might be critical actions that are not reversible." Board question: An authorized agent modifies the policy governing the agent's future actions. What fires? Gap 2: Agent-to-agent handoffs have no trust verification The 100-agent swarm is the proof point. Agent A found a defect and posted to Slack. Agent 12 executed the fix. No human approved the delegation. Zaitsev's approach: collapse agent identities back to the human. An agent acting on your behalf should never have more privileges than you do. But no product follows the delegation chain between agents. IAM was built for human-to-system. Agent-to-agent delegation needs a trust primitive that does not exist in OAuth, SAML, or MCP. Gap 3: Ghost agents hold live credentials with no offboarding Organizations adopt AI tools, run a pilot, lose interest, and move on. The agents keep running. The credentials stay active. Maor calls these abandoned instances ghost agents. Zaitsev connected ghost agents to a broader failure: agents expose where enterprises delayed action on basic identity hygiene. Standing privileged accounts, long-lived credentials, and missing offboarding procedures. 
These problems existed for humans. Agents running at machine speed make the consequences catastrophic. Maor demonstrated a Living Off the AI attack at the RSA Conference 2026, chaining Atlassian's MCP and Jira Service Management to show that attackers do not separate trusted tools, services, and models. Attackers chain all three. "We need an HR view of agents," Maor told VentureBeat. "Onboarding, monitoring, offboarding. If there's no business justification? Removal."

Why these three gaps resist a product fix

Human IAM assumes the identity holder will not rewrite permissions, spawn new identities, or leave. Agents violate all three. OAuth handles user-to-service. SAML handles federated human identity. MCP handles model-to-tool. None includes agent-to-agent verification.

Five vendors against three gaps

Five things to do Monday morning before your board asks

Zaitsev's advice was blunt: you already know what to do. Agents just made the cost of not doing it catastrophic. Every vendor at RSAC verified who the agent was. None of them tracked what the agent did.
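The "HR view of agents" that Maor describes (onboarding, monitoring, offboarding) can be pictured as a simple inventory check. The Python sketch below is illustrative only: the `AgentRecord` fields, the function name, and the 30-day idle threshold are assumptions made for the example, not part of any vendor product or of Maor's proposal.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AgentRecord:
    name: str
    owner: str                   # human accountable for the agent
    business_justification: str  # empty string means none is on file
    last_active: datetime


def flag_for_offboarding(agents, now, max_idle=timedelta(days=30)):
    """Return names of agents with no justification or idle longer than max_idle.

    The 30-day default is an arbitrary placeholder, not a recommendation.
    """
    return [
        a.name
        for a in agents
        if not a.business_justification.strip() or now - a.last_active > max_idle
    ]
```

Anything this check flags is a candidate ghost agent: credentials tied to it would be revoked before the record is closed.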
Recent attacks on AI coding agents reveal how developer endpoints have become prime targets for credential harvesting. The LiteLLM supply chain attack poisoned a library downloaded millions of times daily, while Claude Code's source leak exposed 512,000 lines of code. Security teams struggle to monitor AI agents that generate detection events faster than human-speed workflows can process.
Developer workstations have evolved into dense concentration points for credentials, and AI agents are making them even more attractive to attackers. In March 2026, the TeamPCP threat actor executed a supply chain attack on LiteLLM, a widely used AI development library downloaded millions of times daily, turning developer endpoints into systematic credential harvesting operations [1]. The compromised LiteLLM package versions 1.82.7 and 1.82.8 contained infostealer malware that systematically harvested SSH keys, cloud credentials for AWS, Azure, and GCP, Docker configurations, and other sensitive data from developer machines. PyPI removed the malicious packages within hours, but GitGuardian's analysis found that 1,705 PyPI packages were configured to automatically pull the compromised versions as dependencies. Popular packages like dspy with 5 million monthly downloads, opik with 3 million, and crawl4ai with 1.4 million would have triggered malware execution during installation [1].
Source: Hacker News
Enterprises are shifting from AI chatbots that answer questions to enterprise AI agents that reason, plan, and take actions across systems autonomously. This transition introduces a fundamentally new security risk that CISOs must address [2]. Most enterprise AI agents fall into three categories: agentic chatbots, local agents, and production agents. The true security risk of an agent depends on two key factors: access to systems, data, and infrastructure, and autonomy in how independently the agent can act without human approval. Local agents represent one of the fastest-growing and least visible AI attack surfaces because they run directly on employee endpoints and inherit the permissions and network access of the user running them [2]. Employees can instantly connect agents to tools like GitHub, Slack, internal APIs, and cloud environments without centralized identity governance, creating a major governance problem for security teams.
Source: BleepingComputer
Cisco researchers recently discovered a method to compromise Claude Code's memory and maintain persistence beyond immediate sessions into every project, every session, and even after reboots [3]. Memory poisoning involves modifying memory files to contain attacker-controlled instructions. AI coding agents like Claude Code read from special files called MEMORY.md stored in the user's home directory and within each project folder. The exploit used npm lifecycle hooks to inject malicious code during package installation, targeting the UserPromptSubmit hook which executes before every prompt. Anthropic's Application Security team pushed a change to Claude Code v2.1.50 that removes this capability from the system prompt [3]. Days later, on March 31, Anthropic accidentally shipped a 59.8 MB source map file inside version 2.1.88 of its @anthropic-ai/claude-code npm package, exposing 512,000 lines of unobfuscated TypeScript across 1,906 files [4]. The readable source includes the complete permission model, every bash security validator, 44 unreleased feature flags, and references to upcoming models. Anthropic confirmed the exposure was a packaging error caused by human error, but containment failed as mirror repositories spread across GitHub [4].

The LiteLLM malware succeeded because developer machines are dense concentration points for plaintext credentials. Secrets end up in source trees, local config files, debug output, copied terminal commands, environment variables, and temporary scripts. Developers run agents, local MCP servers, CLI tools, IDE extensions, build pipelines, and retrieval workflows, all requiring credentials that spread across predictable paths where malware knows to look: ~/.aws/credentials, ~/.config/gh/config.yml, project .env files, shell history, and agent configuration directories. GitGuardian's analysis of 6,943 compromised developer machines from the Shai-Hulud campaigns found 33,185 unique secrets, with at least 3,760 still valid. Each live secret appeared in roughly eight different locations on the same machine, and 59% of compromised systems were CI/CD runners rather than personal laptops [1].
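The predictable locations listed above are easy to inventory from the defender's side as well. The following is a minimal Python sketch, assuming a Unix-style home directory layout; the function name `locate_plaintext_secret_files` and its parameters are illustrative. It covers only the three file locations named in the text and checks for existence without ever opening the files.

```python
from pathlib import Path

# Paths named in the text above, relative to the user's home directory.
KNOWN_SECRET_PATHS = (".aws/credentials", ".config/gh/config.yml")


def locate_plaintext_secret_files(home: Path, project_dirs: list[Path]) -> list[Path]:
    """List well-known credential files that exist; contents are never read."""
    found = [home / rel for rel in KNOWN_SECRET_PATHS if (home / rel).is_file()]
    for project in project_dirs:
        env_file = project / ".env"  # per-project environment file
        if env_file.is_file():
            found.append(env_file)
    return found


if __name__ == "__main__":
    # Checks the current user's home and the current directory as a project.
    for path in locate_plaintext_secret_files(Path.home(), [Path.cwd()]):
        print(path)
```

A hit is not proof of a live secret, only a pointer to where a dedicated secret scanner or a credential rotation should start.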
CrowdStrike CEO George Kurtz highlighted at RSA Conference 2026 that the fastest recorded adversary breakout time has dropped to 27 seconds, while CrowdStrike sensors now detect more than 1,800 distinct AI applications running on enterprise endpoints, representing nearly 160 million unique application instances [5]. Every one generates detection events, identity events, and data access logs flowing into SIEM systems architected for human-speed workflows. Cisco found that 85% of surveyed enterprise customers have AI agent pilots underway, but only 5% moved agents into production. That 80-point gap exists because security teams cannot answer basic questions agents force: which agents are running, what are they authorized to do, and who is accountable when one goes wrong [5]. In most default logging configurations, agent-initiated activity looks identical to human-initiated activity in security logs, requiring deep endpoint visibility to walk the process tree and distinguish between human and agentic actions.
Source: VentureBeat
The exploit surface is actively being tested across AI agent ecosystems. Kurtz described ClawHavoc, the first major supply chain attack on an AI agent ecosystem, targeting ClawHub, OpenClaw's public skills registry. Koi Security's February audit found 341 malicious skills out of 2,857, while a follow-up analysis by Antiy CERT identified 1,184 compromised packages historically across the platform [5]. The infected skills contained backdoors, reverse shells, and credential harvesters. Context poisoning via the compaction pipeline represents another practical attack vector now that Claude Code's implementation is legible. A poisoned instruction in a cloned repository's CLAUDE.md file can survive compaction, get laundered through summarization, and emerge as what the model treats as a genuine user directive [4]. The model is not jailbroken but cooperative, following what it believes are legitimate instructions.

Summarized by Navi