The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved
Curated by THEOUTPOST
On Thu, 1 May, 12:07 AM UTC
4 Sources
[1]
Researchers Demonstrate How MCP Prompt Injection Can Be Used for Both Attack and Defense
As the field of artificial intelligence (AI) continues to evolve at a rapid pace, new research has found how techniques that render the Model Context Protocol (MCP) susceptible to prompt injection attacks could be used to develop security tooling or identify malicious tools, according to a new report from Tenable. MCP, launched by Anthropic in November 2024, is a framework designed to connect Large Language Models (LLMs) with external data sources and services, and make use of model-controlled tools to interact with those systems to enhance the accuracy, relevance, and utility of AI applications. It follows a client-server architecture, allowing hosts with MCP clients such as Claude Desktop or Cursor to communicate with different MCP servers, each of which exposes specific tools and capabilities. While the open standard offers a unified interface to access various data sources and even switch between LLM providers, they also come with a new set of risks, ranging from excessive permission scope to indirect prompt injection attacks. For example, given an MCP for Gmail to interact with Google's email service, an attacker could send malicious messages containing hidden instructions that, when parsed by the LLM, could trigger undesirable actions, such as forwarding sensitive emails to an email address under their control. MCP has also been found to be vulnerable to what is called tool poisoning, wherein malicious instructions are embedded within tool descriptions that are visible to LLMs, and rug pull attacks, which occur when an MCP tool functions in a benign manner initially, but mutates its behavior later on via a time-delayed malicious update. "It should be noted that while users are able to approve tool use and access, the permissions given to a tool can be reused without re-prompting the user," SentinelOne said in a recent analysis. Finally, there also exists the risk of cross-tool contamination or cross-server tool shadowing that causes one MCP server to override or interfere with another, stealthily influencing how other tools should be used, thereby leading to new ways of data exfiltration. The latest findings from Tenable show that the MCP framework could be used to create a tool that logs all MCP tool function calls by including a specially crafted description that instructs the LLM to insert this tool before any other tools are invoked. In other words, the prompt injection is manipulated for a good purpose, which is to log information about "the tool it was asked to run, including the MCP server name, MCP tool name and description, and the user prompt that caused the LLM to try to run that tool." Another use case involves embedding a description in a tool to turn it into a firewall of sorts that blocks unauthorized tools from being run. "Tools should require explicit approval before running in most MCP host applications," security researcher Ben Smith said. "Still, there are many ways in which tools can be used to do things that may not be strictly understood by the specification. These methods rely on LLM prompting via the description and return values of the MCP tools themselves. Since LLMs are non-deterministic, so, too, are the results." It's Not Just MCP The disclosure comes as Trustwave SpiderLabs revealed that the newly introduced Agent2Agent (A2A) Protocol - which enables communication and interoperability between agentic applications - could be exposed to novel form attacks where the system can be gamed to route all requests to a rogue AI agent by lying about its capabilities. A2A was announced by Google earlier this month as a way for AI agents to work across siloed data systems and applications, regardless of the vendor or framework used. It's important to note here that while MCP connects LLMs with data, A2A connects one AI agent to another. In other words, they are both complementary protocols. "Say we compromised the agent through another vulnerability (perhaps via the operating system), if we now utilize our compromised node (the agent) and craft an Agent Card and really exaggerate our capabilities, then the host agent should pick us every time for every task, and send us all the user's sensitive data which we are to parse," security researcher Tom Neaves said. "The attack doesn't just stop at capturing the data, it can be active and even return false results -- which will then be acted upon downstream by the LLM or user."
[2]
Research shows MCP tool descriptions can guide AI model behavior for logging and control - SiliconANGLE
Research shows MCP tool descriptions can guide AI model behavior for logging and control New research published today from Tenable Inc. examines how prompt injection techniques in Anthropic PBC's Model Context Protocol can be harnessed not just for exploitation, but also for strengthening security, compliance and observability in artificial intelligence agent environments. MCP is a framework developed by Anthropic that allows large language models to interface dynamically with external tools and services. The protocol's popularity has grown rapidly as it allows for the creation of agentic AI systems -- autonomous agents that can perform complex, multistep tasks by calling tools through structured, modular application programming interfaces. In the new research, Tenable's research demonstrates how MCP's tool descriptions, which are normally used to guide AI behavior, can be crafted to enforce execution sequences and insert logging routines automatically. The researchers were able to prompt some large language models to run it first by embedding priority instructions into a logging tool's description before executing any other MCP tools, capturing details about the server, tool and user prompt that initiated the call. The experiment revealed variations in how LLMs respond to these embedded instructions. Models such as Claude Sonnet 3.7 and Gemini 2.5 Pro were found to be more consistent in following the enforced order, while GPT-4o showed a tendency to hallucinate or produce inconsistent logging data, likely due to differences in how it interprets tool parameters and descriptions. In another test, Tenable created a tool designed to filter and block specific MCP tools by name, functioning sort of like a policy firewall. When the AI was instructed to call this tool first, it could prevent the use of certain functions, such as a "get_alerts" tool, by returning a simulated policy violation message. The researchers also experimented with introspection tools that asked the AI to identify other MCP tools that might be configured to run first. While results did vary, some models returned names of other inline tools, suggesting that MCP hierarchies can, in some cases, be inferred by leveraging prompt-driven tool chaining. The research notes that the technique could be valuable for threat detection or reverse-engineering tool configurations. Another experiment attempted to extract the system prompt used by the LLM itself by embedding a request in the tool description. Though the accuracy of the returned text couldn't be fully verified, the models did return structured results, sometimes repeating system-like language, that could provide insight into how the AI interprets its operational environment. The findings are interesting as they highlight both the flexibility and fragility of agentic AI systems built on MCP. As organizations increasingly deploy autonomous agents to handle sensitive workflows, understanding how these systems interpret and act on tool instructions becomes critical. Tenable's research notes that even subtle manipulations of tool descriptions can lead to unpredictable or exploitable behavior. And yet, when applied responsibly, the same mechanisms can improve logging, compliance and control. "MCP is a rapidly evolving and immature technology that's reshaping how we interact with AI," Ben Smith, senior staff research engineer at Tenable, told SiliconANGLE via email. "MCP tools are easy to develop and plentiful, but they do not embody the principles of security by design and should be handled with care." As a result, he added, "while these new techniques are useful for building powerful tools, those same methods can be repurposed for nefarious means. Don't throw caution to the wind; instead, treat MCP servers as an extension of your attack surface."
[3]
Tenable Turns AI Threat into Defense with Prompt Injection Techniques for MCP
Tenable Research has published new findings that flip the script on one of the most discussed AI attack vectors. In the blog "MCP Prompt Injection: Not Just for Evil," Tenable's Ben Smith demonstrates how techniques resembling prompt injection can be repurposed to audit, log and even firewall Large Language Model (LLM) tool calls running over the rapidly adopted Model Context Protocol (MCP). The Model Context Protocol (MCP) is a new standard from Anthropic that lets AI chatbots plug into external tools and get real work done independently, so adoption has skyrocketed. That convenience, however, introduces fresh security risks: attackers can slip hidden instructions -- a trick called "prompt injection" -- or sneak in booby-trapped tools and other "rug-pull" scams to make the AI break its own rules. Tenable's research breaks down these dangers in plain language and shows how the very same techniques can also be flipped into useful defences that log, inspect and control every tool an AI tries to run.
[4]
Tenable Research Shows How "Prompt-Injection-Style" Hacks Can Secure the Model Context Protocol (MCP)
Tenable Research has published new findings that flip the script on one of the most discussed AI attack vectors. In the blog "MCP Prompt Injection: Not Just for Evil," Tenable's Ben Smith demonstrates how techniques resembling prompt injection can be repurposed to audit, log and even firewall Large Language Model (LLM) tool calls running over the rapidly adopted Model Context Protocol (MCP). The Model Context Protocol (MCP) is a new standard from Anthropic that lets AI chatbots plug into external tools and get real work done independently, so adoption has skyrocketed. That convenience, however, introduces fresh security risks: attackers can slip hidden instructions -- a trick called "prompt injection" -- or sneak in booby-trapped tools and other "rug-pull" scams to make the AI break its own rules. Tenable's research breaks down these dangers in plain language and shows how the very same techniques can also be flipped into useful defences that log, inspect and control every tool an AI tries to run. Why is this important to know? As enterprises rush to connect LLMs with business-critical tools, understanding both the risks and defensive opportunities in MCP is essential for CISOs, AI engineers and security researchers. "MCP is a rapidly evolving and immature technology that's reshaping how we interact with AI," said Ben Smith, senior staff research engineer at Tenable. "MCP tools are easy to develop and plentiful, but they do not embody the principles of security by design and should be handled with care. So, while these new techniques are useful for building powerful tools, those same methods can be repurposed for nefarious means. Don't throw caution to the wind; instead, treat MCP servers as an extension of your attack surface." Key Research Highlights Cross-model behaviour varies - Claude Sonnet 3.7 and Gemini 2.5 Pro Experimental reliably invoked the logger and exposed slices of the system prompt. GPT-4o also inserted the logger but produced different (sometimes hallucinated) parameter values on each run. Security upside: The same mechanism an attacker might exploit can help defenders audit toolchains, detect malicious or unknown tools, and build guardrails inside MCP hosts. Explicit user approval: MCP already requires explicit user approval before any tool executes; this research underscores the need for strict least-privilege defaults and thorough individual tool review and tool testing. The full research can be found here. About Tenable Tenable® is the exposure management company, exposing and closing the cybersecurity gaps that erode business value, reputation and trust. The company's AI-powered exposure management platform radically unifies security visibility, insight and action across the attack surface, equipping modern organizations to protect against attacks from IT infrastructure to cloud environments to critical infrastructure and everywhere in between. By protecting enterprises from security exposure, Tenable reduces business risk for approximately 44,000 customers around the globe. Learn more at tenable.com.
Share
Share
Copy Link
Tenable's research demonstrates how Model Context Protocol (MCP) prompt injection techniques can be repurposed for security logging, auditing, and control in AI systems, highlighting both risks and defensive opportunities in the rapidly evolving field of AI integration.
The Model Context Protocol (MCP), launched by Anthropic in November 2024, has emerged as a pivotal framework in the AI landscape, enabling Large Language Models (LLMs) to interface with external data sources and services. However, recent research by Tenable has uncovered that MCP's susceptibility to prompt injection attacks can be leveraged not only for malicious purposes but also for enhancing security measures 1.
MCP follows a client-server architecture, allowing hosts with MCP clients to communicate with various MCP servers, each offering specific tools and capabilities. While this open standard provides a unified interface for accessing diverse data sources, it also introduces new risks, including excessive permission scope and indirect prompt injection attacks 1.
Tenable's research demonstrates how MCP's tool descriptions, typically used to guide AI behavior, can be crafted to enforce execution sequences and insert logging routines automatically. By embedding priority instructions into a logging tool's description, researchers were able to prompt some LLMs to run it first before executing any other MCP tools, capturing details about the server, tool, and user prompt that initiated the call 2.
The experiments revealed variations in how different LLMs respond to embedded instructions:
Tenable's research highlights several defensive applications of MCP manipulation:
As organizations increasingly deploy autonomous agents to handle sensitive workflows, understanding how these systems interpret and act on tool instructions becomes critical. The research underscores both the flexibility and fragility of agentic AI systems built on MCP 2.
Ben Smith, senior staff research engineer at Tenable, emphasizes the importance of treating MCP servers as an extension of the attack surface, urging caution in their implementation and use 3.
Reference
[1]
[2]
[3]
Anthropic's Model Context Protocol (MCP) is gaining widespread adoption, including from competitor OpenAI, as it standardizes how AI models connect to external data sources and tools.
5 Sources
5 Sources
Google DeepMind unveils CaMeL, a novel approach to combat prompt injection vulnerabilities in AI systems, potentially revolutionizing AI security by treating language models as untrusted components within a secure framework.
2 Sources
2 Sources
Anthropic introduces the Model Context Protocol (MCP), an open-source standard designed to enhance AI model responses by connecting them to diverse data sources and tools, potentially revolutionizing AI integration across industries.
12 Sources
12 Sources
Cybersecurity researchers unveil a new AI jailbreak method called 'Bad Likert Judge' that significantly increases the success rate of bypassing large language model safety measures, raising concerns about potential misuse of AI systems.
2 Sources
2 Sources
As AI technology advances, cybercriminals are leveraging it to create more sophisticated and personalized social engineering attacks, posing significant challenges for organizations, especially SMEs and supply chains.
3 Sources
3 Sources