2 Sources
[1]
New AI attack hides data-theft prompts in downscaled images
Researchers have developed a novel attack that steals user data by injecting malicious prompts into images processed by AI systems before they are delivered to a large language model. The method relies on full-resolution images that carry instructions invisible to the human eye but which become apparent when the image quality is lowered through resampling algorithms.

Developed by Trail of Bits researchers Kikimora Morozova and Suha Sabi Hussain, the attack builds upon a theory presented in a 2020 USENIX paper by a German university (TU Braunschweig) exploring the possibility of an image-scaling attack in machine learning.

When users upload images onto AI systems, these are automatically downscaled to a lower quality for performance and cost efficiency. Depending on the system, the image resampling algorithms could make an image lighter using nearest neighbor, bilinear, or bicubic interpolation. All of these methods introduce aliasing artifacts that allow hidden patterns to emerge on the downscaled image if the source is specifically crafted for this purpose.

In the Trail of Bits example, specific dark areas of a malicious image turn red, allowing hidden text to emerge in black when bicubic downscaling is used to process the image. The AI model interprets this text as part of the user's instructions and automatically combines it with the legitimate input. From the user's perspective, nothing seems off, but in practice the model executed hidden instructions that could lead to data leakage or other risky actions.

In an example involving Gemini CLI, the researchers were able to exfiltrate Google Calendar data to an arbitrary email address while using Zapier MCP with 'trust=True' to approve tool calls without user confirmation.

Trail of Bits explains that the attack needs to be adjusted for each AI model according to the downscaling algorithm used in processing the image. However, the researchers confirmed that their method is feasible against several AI systems, including Google's Gemini CLI, Vertex AI Studio with a Gemini backend, Gemini's web interface and API, Google Assistant on an Android phone, and Genspark. As the attack vector is widespread, it may extend well beyond the tested tools.

Furthermore, to demonstrate their findings, the researchers also created and published Anamorpher (currently in beta), an open-source tool that can create images for each of the mentioned downscaling methods.

As mitigation and defense actions, Trail of Bits researchers recommend that AI systems implement dimension restrictions when users upload an image. If downscaling is necessary, they advise providing users with a preview of the result delivered to the large language model (LLM). They also argue that explicit user confirmation should be sought for sensitive tool calls, especially when text is detected in an image.

"The strongest defense, however, is to implement secure design patterns and systematic defenses that mitigate impactful prompt injection beyond multi-modal prompt injection," the researchers say, referencing a paper published in June on design patterns for building LLMs that can resist prompt injection attacks.
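To make the mechanism concrete, here is a minimal, self-contained sketch of the scaling principle using a toy nearest-neighbor downscaler. The cover image, payload, and decimation downsampler are illustrative assumptions rather than the researchers' method; real attacks target production bicubic and bilinear pipelines, where the perturbation can be spread across neighboring pixels and is far less conspicuous.

```python
# Minimal sketch of the image-scaling principle, assuming a toy
# nearest-neighbor downscaler (plain decimation). All values here are
# illustrative, not Trail of Bits' implementation.
import numpy as np

K = 4  # downscale factor: one pixel per K x K block survives decimation

def decimate(img: np.ndarray, k: int = K) -> np.ndarray:
    """Toy nearest-neighbor downscaler: keep the top-left pixel of each block."""
    return img[::k, ::k]

rng = np.random.default_rng(0)

# Cover: light-gray noise that looks unremarkable at full resolution.
cover = rng.integers(200, 230, size=(256, 256), dtype=np.uint8)

# Payload: a dark pattern (stand-in for rendered instruction text) at the
# *target* resolution the model will actually receive.
payload = np.zeros((64, 64), dtype=np.uint8)
payload[28:36, 8:56] = 1  # a dark bar; a real attack renders text here

# Craft: overwrite only the pixels the downsampler will sample; every
# other pixel keeps its light cover value.
crafted = cover.copy()
sampled = crafted[::K, ::K]
crafted[::K, ::K] = np.where(payload == 1, 20, sampled).astype(np.uint8)

# After downscaling, the hidden dark pattern dominates the image the
# model sees, even though the full-resolution image looked like noise.
seen_by_model = decimate(crafted)
print("dark payload pixels after downscale:", int((seen_by_model < 50).sum()))
```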
[2]
Hackers can hide AI prompt injection attacks in resized images
A new method of hiding instructions for "AI" systems takes advantage of how images are downscaled when uploaded. "AI" tools are all the rage at the moment, even among users who aren't all that savvy when it comes to conventional software or security -- and that's opening up all sorts of new opportunities for hackers and others who want to take advantage of them. A research team has discovered a way to hide prompt injection attacks in uploaded images.

A prompt injection attack is a way to hide instructions for an LLM or other "artificial intelligence" system, usually somewhere a human operator can't see them. It's the whispered "loser-says-what" of computer security. A great example is hiding a phishing attempt in an email as plain text that's colored the same as the background, knowing that Gemini will summarize the text even though the human recipient can't read it.

A two-person Trail of Bits research team discovered that they can also hide these instructions in images, making the text invisible to the human eye but revealed and transcribed by an AI tool when the image is downscaled for upload. Downscaling -- and the resampling artifacts that come along with it -- is nothing new. But combined with the sudden interest in hiding plain-text messages, it creates a new way to get instructions to an LLM without the user knowing those instructions have been sent.

In the example highlighted by Trail of Bits and BleepingComputer, an image is delivered to a user, the user uploads the image to Gemini (or uses something like Android's built-in circle-to-search tool), and the hidden text in the image becomes visible as Google's backend downscales it before it's "read" to save on bandwidth and processing power. Once the image is downscaled, the prompt text is successfully injected, telling Gemini to email the user's personal calendar information to a third party.

That's a lot of legwork to get a relatively small amount of personal data, and both the complete attack method and the image itself need to be tailored to the specific "AI" system that's being exploited. There's no evidence that this particular method was known to hackers before now or is being actively exploited at the time of writing. But it illustrates how a relatively innocuous action -- like asking an LLM "what is this thing?" with a screenshot -- could be turned into an attack vector.
Researchers have developed a novel attack method that hides malicious prompts in images processed by AI systems, potentially leading to data theft and unauthorized actions.
Researchers from Trail of Bits have uncovered a new attack vector that exploits the image processing capabilities of AI systems to potentially steal user data. This method, developed by Kikimora Morozova and Suha Sabi Hussain, builds upon a theory presented in a 2020 USENIX paper by TU Braunschweig [1].
The attack relies on crafting full-resolution images that contain hidden instructions invisible to the human eye. When these images are processed and downscaled by AI systems, the hidden text becomes visible due to aliasing artifacts introduced by resampling algorithms such as nearest neighbor, bilinear, or bicubic interpolation [1].
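To illustrate why a crafted image must be tuned to the target pipeline, the hedged sketch below downscales the same upload under each of the three filters and saves a preview of each result. It assumes Pillow 9.1 or newer; the filename and the 256x256 target size are placeholders for whatever the real pipeline uses.

```python
# Each filter weights neighboring source pixels differently, so a pattern
# crafted to alias under bicubic generally will not emerge under nearest
# or bilinear downscaling.
from PIL import Image

FILTERS = {
    "nearest": Image.Resampling.NEAREST,
    "bilinear": Image.Resampling.BILINEAR,
    "bicubic": Image.Resampling.BICUBIC,
}

src = Image.open("uploaded.png")
for name, flt in FILTERS.items():
    # Save one preview per algorithm to compare what a model would "see".
    src.resize((256, 256), resample=flt).save(f"preview_{name}.png")
```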
In a demonstration using Gemini CLI, the researchers successfully exfiltrated Google Calendar data to an arbitrary email address while using Zapier MCP with 'trust=True' to approve tool calls without user confirmation [1].
The researchers confirmed that this attack method is feasible against several AI systems, including Google's Gemini CLI, Vertex AI Studio with a Gemini backend, Gemini's web interface and API, Google Assistant on an Android phone, and Genspark.
However, the attack vector may extend beyond these tested tools, potentially affecting a wide range of AI systems that process uploaded images [1].
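Because a crafted image only works against the downscaler it was tuned for, an attacker (or a defender auditing a pipeline) first has to identify which algorithm a given system uses. One practical approach is to fingerprint the downscaler with a known probe image; the sketch below is an assumed, simplified version of that idea, comparing one observed downscaled output against local candidate filters by mean squared error. Filenames are placeholders, and this is not the researchers' published tooling.

```python
# Feed the pipeline a known probe image (e.g. a checkerboard), capture
# its downscaled output, and compare against local re-implementations of
# each candidate filter. The lowest error suggests the algorithm in use.
import numpy as np
from PIL import Image

candidates = {
    "nearest": Image.Resampling.NEAREST,
    "bilinear": Image.Resampling.BILINEAR,
    "bicubic": Image.Resampling.BICUBIC,
}

probe = Image.open("probe_pattern.png").convert("L")
observed = np.asarray(Image.open("pipeline_output.png").convert("L"),
                      dtype=np.float64)

scores = {}
for name, flt in candidates.items():
    # Resize the probe locally to the observed output's dimensions (w, h).
    local = probe.resize(observed.shape[::-1], resample=flt)
    diff = np.asarray(local, dtype=np.float64) - observed
    scores[name] = float(np.mean(diff ** 2))  # mean squared error

print("best guess:", min(scores, key=scores.get), scores)
```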
To demonstrate their findings, the researchers developed and published Anamorpher, an open-source tool currently in beta. This tool can create images designed to exploit each of the mentioned downscaling methods, potentially enabling further research and testing of this vulnerability [1].
This attack method opens up new opportunities for hackers to exploit AI systems, especially those used by less tech-savvy individuals. By hiding instructions in images, attackers could potentially trick AI tools into performing unauthorized actions or leaking sensitive information without the user's knowledge [2].
To protect against this type of attack, Trail of Bits researchers suggest several measures: implementing dimension restrictions when users upload an image; providing users with a preview of the downscaled result that is actually delivered to the large language model (LLM); and seeking explicit user confirmation for sensitive tool calls, especially when text is detected in an image [1].
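A minimal sketch of the first two measures, assuming a Pillow-based upload path; the size limit, target resolution, and function name are illustrative policy choices, not values prescribed by the researchers.

```python
from PIL import Image

MAX_DIM = 1024            # assumed policy: refuse anything larger
MODEL_SIZE = (512, 512)   # assumed resolution the model actually receives

def prepare_upload(path: str) -> Image.Image:
    img = Image.open(path)
    # Measure 1: dimension restrictions on upload.
    if max(img.size) > MAX_DIM:
        raise ValueError(f"image {img.size} exceeds the {MAX_DIM}px limit")
    # Measure 2: downscale exactly as the backend would, then show this
    # image to the user as the preview of what the LLM will receive.
    return img.resize(MODEL_SIZE, resample=Image.Resampling.BICUBIC)
```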
The researchers emphasize that the strongest defense is to implement secure design patterns and systematic defenses that mitigate impactful prompt injection beyond multi-modal prompt injection [1].
As AI tools become more prevalent in everyday use, it's crucial for developers and users alike to be aware of these potential vulnerabilities and take steps to mitigate the risks associated with image processing and prompt injection attacks.
Summarized by
Navi