New AI Attack Hides Malicious Prompts in Downscaled Images, Posing Data Theft Risks

Reviewed by Nidhi Govil


Researchers have uncovered a novel attack method that hides malicious prompts in images; the prompts become visible only when the images are downscaled by AI systems, potentially enabling unauthorized data access and theft.

Novel AI Attack Method Unveiled

Researchers from Trail of Bits have developed a new attack method that exploits how AI systems process images, potentially leading to unauthorized data access and theft. The technique, which builds on a 2020 USENIX paper from TU Braunschweig, hides malicious prompts in images that become visible only when downscaled by AI systems [1].

Source: Bleeping Computer


How the Attack Works

The attack takes advantage of image resampling algorithms commonly used by AI systems for performance and cost efficiency. When users upload images, these are automatically downscaled, introducing aliasing artifacts that can reveal hidden patterns [1]. Specifically:

  1. Full-resolution images carry instructions invisible to the human eye.
  2. When downscaled, hidden text emerges due to interpolation methods like nearest neighbor, bilinear, or bicubic (a toy sketch of the principle follows this list).
  3. The AI model interprets this text as part of the user's instructions.
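
The underlying principle can be shown in a few lines of code. The following is a minimal toy sketch, not the researchers' actual technique: it uses a crude point-sampling "downscaler" and NumPy arrays, and the array sizes, scale factor, and payload pattern are all illustrative assumptions.

```python
import numpy as np

# Toy sketch of the aliasing principle (not the Anamorpher algorithm):
# a nearest-neighbor-style downscaler keeps only one pixel per 4x4 block,
# so whatever is stored at those positions dominates the small image,
# while the other 15 of every 16 pixels dominate the full-size view.
SCALE = 4  # illustrative downscale factor

def point_sample_downscale(img: np.ndarray) -> np.ndarray:
    """Decimate by keeping the first pixel of every SCALE x SCALE block."""
    return img[::SCALE, ::SCALE]

rng = np.random.default_rng(0)
cover = rng.integers(180, 255, size=(256, 256), dtype=np.uint8)  # bright noise a human sees
payload = np.zeros((64, 64), dtype=np.uint8)
payload[24:40, 8:56] = 255  # stand-in for rendered prompt text

crafted = cover.copy()
crafted[::SCALE, ::SCALE] = payload  # hide the payload at the sampled positions only

model_view = point_sample_downscale(crafted)
assert np.array_equal(model_view, payload)  # the hidden pattern re-emerges intact
```

Real pipelines use proper resampling kernels rather than raw decimation, which is why the crafted images are tuned to specific filters such as nearest neighbor, bilinear, or bicubic.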

In one example, dark areas of a malicious image turn red during bicubic downscaling, allowing hidden black text to emerge [2].
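
To see what a given pipeline would actually receive, a suspect image can be downscaled offline with each common filter. The snippet below is a hedged sketch using Pillow; the input filename and the 512x512 target size are assumptions, since each platform chooses its own dimensions.

```python
from PIL import Image

# Downscale a suspect image with each common resampling filter and save the
# results, so any text that only appears at low resolution becomes visible.
TARGET_SIZE = (512, 512)  # assumed model input size; real pipelines vary
FILTERS = {
    "nearest": Image.NEAREST,
    "bilinear": Image.BILINEAR,
    "bicubic": Image.BICUBIC,
}

img = Image.open("suspect_upload.png").convert("RGB")
for name, resample in FILTERS.items():
    small = img.resize(TARGET_SIZE, resample=resample)
    small.save(f"downscaled_{name}.png")  # inspect these for emergent text
```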

Demonstrated Vulnerabilities

Source: PCWorld


The researchers confirmed this attack method's feasibility against several AI systems, including:

  • Gemini CLI
  • Vertex AI Studio
  • Google Assistant on Android
  • Gemini's web interface

In a proof-of-concept, the researchers successfully exfiltrated Google Calendar data to an arbitrary email address using Zapier MCP with 'trust=True' to approve tool calls without user confirmation [3].

Implications and Risks

This attack vector is potentially widespread and may extend beyond the tested tools. It raises significant concerns about trust in multimodal AI systems, especially as these platforms become more integrated into daily work routines [3]. The risk of identity theft and unauthorized access to sensitive information is particularly concerning, given that many AI models link with calendars, communications platforms, and workflow tools.

Mitigation Strategies

To address this vulnerability, Trail of Bits researchers recommend several measures:

  1. Implement dimension restrictions for image uploads.
  2. Provide users with a preview of the downscaled image delivered to the LLM.
  3. Seek explicit user confirmation for sensitive tool calls, especially when text is detected in an image (a rough sketch combining the first three measures follows this list).
  4. Implement secure design patterns and systematic defenses to mitigate prompt injection attacks [1].
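
The sketch below shows how the first three recommendations might combine in an upload path. It assumes a 2048-pixel dimension cap, a 512x512 bicubic model input, and the pytesseract OCR bindings for the text check; these are illustrative choices, not values from the research.

```python
from PIL import Image
import pytesseract  # assumed OCR dependency; any text detector would do

MAX_DIM = 2048            # illustrative upload dimension limit
MODEL_INPUT = (512, 512)  # assumed size and filter used by the model pipeline

def prepare_upload(path: str) -> Image.Image:
    """Enforce a size limit, preview the downscaled image, and require
    confirmation if the preview contains text."""
    img = Image.open(path).convert("RGB")
    if max(img.size) > MAX_DIM:
        raise ValueError(f"image exceeds {MAX_DIM}px limit: {img.size}")

    # Downscale with the same filter/size the model would receive and show it.
    preview = img.resize(MODEL_INPUT, resample=Image.BICUBIC)
    preview.save("preview_sent_to_model.png")

    if pytesseract.image_to_string(preview).strip():
        answer = input("Text detected in the downscaled image. Send anyway? [y/N] ")
        if answer.lower() != "y":
            raise RuntimeError("upload cancelled by user")
    return preview
```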

Open-Source Tool: Anamorpher

Source: TechRadar


To demonstrate their findings, the researchers created Anamorpher, an open-source tool currently in beta. The tool can generate crafted images targeting each of the mentioned downscaling methods, illustrating how the attack technique could be replicated [3].

As AI technologies continue to evolve and integrate into various aspects of our digital lives, this research underscores the critical need for robust security measures and user awareness to protect against emerging threats in the AI landscape.
