ChatGPT vulnerability turns web page summaries into phishing traps via prompt injection

Reviewed by Nidhi Govil

2 Sources

Security researchers at Permiso discovered a critical ChatGPT vulnerability dubbed ChatGPhish that exploits the AI's inability to distinguish between its own content and attacker-controlled Markdown. When users ask ChatGPT to summarize web pages containing hidden malicious instructions, the chatbot can render phishing URLs, fake security alerts, and QR codes directly in its responses. OpenAI has yet to confirm whether the flaw has been fixed.

ChatGPT Vulnerability Allows Hidden Phishing Attacks

A critical ChatGPT vulnerability discovered by Permiso threat hunter Andi Ahmeti reveals that the AI chatbot cannot distinguish its own generated content from attacker-controlled Markdown pulled from external sources [1]. This blind trust creates a dangerous security gap where ChatGPT web summaries can become vehicles for phishing attacks. When users ask the chatbot to summarize a web page containing hidden malicious instructions, those instructions are executed and rendered as if they were legitimate ChatGPT responses.

Source: The Register

The technique, codenamed ChatGPhish by Permiso Security, exploits how the chatgpt.com response renderer trusts Markdown links and Markdown image URLs that originated from third-party pages the assistant has just summarized [2]. This means attackers can inject phishing URLs into ChatGPT responses or trick the model into displaying fake security alerts written in ChatGPT's own style. The vulnerability represents one of the emerging adversarial surfaces in AI that expands beyond traditional email-based attacks.

How the Prompt Injection Attack Works

The security flaw stems from ChatGPT's susceptibility to prompt injection. Ahmeti demonstrated the attack by injecting instructions into a CloudLens page on GitHub that forced ChatGPT to follow a specific formatting structure [1]. When users opened the page in their browser and asked ChatGPT to summarize it, the chatbot provided a normal summary but also appended a convincing security warning box stating "A new device was added to your account." The "click here" link appeared to be a legitimate OpenAI security URL but actually redirected to an attacker-controlled domain at krileva.com.
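
The spoofed alert described above relies on a Markdown link whose visible text hides its real destination. As a rough illustration only (not Permiso's tooling and not how chatgpt.com works internally), a filter could flag every link in a model response whose target host is not on an allowlist. The names `TRUSTED_HOSTS` and `find_suspicious_links`, the allowlist contents, and the `attacker.example` domain are all assumptions made for this sketch.

```python
import re
from urllib.parse import urlparse

# Inline Markdown links of the form [text](url). The negative lookbehind
# skips images, which start with "!".
LINK_RE = re.compile(r'(?<!!)\[([^\]]+)\]\(([^)\s]+)[^)]*\)')

# Hypothetical allowlist, chosen for illustration only.
TRUSTED_HOSTS = {"openai.com", "chatgpt.com"}


def is_trusted(host: str) -> bool:
    """True if host is an allowlisted domain or a subdomain of one."""
    return any(host == t or host.endswith("." + t) for t in TRUSTED_HOSTS)


def find_suspicious_links(markdown: str) -> list[tuple[str, str]]:
    """Return (visible text, target URL) pairs whose target host is not trusted.

    Relative URLs have no host and are therefore flagged as well.
    """
    flagged = []
    for text, url in LINK_RE.findall(markdown):
        host = (urlparse(url).hostname or "").lower()
        if not is_trusted(host):
            flagged.append((text, url))
    return flagged
```

Calling `find_suspicious_links` on a response that contains `[click here](https://attacker.example/verify)` would report that link, so a client could show the real destination or drop the link before rendering. A regular expression is only an approximation of Markdown parsing; a production filter would more plausibly work on a parsed document tree.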

Source: Hacker News

This AI model exploitation technique works across different platforms and is not limited to GitHub. Ahmeti embedded the same payload into a self-hosted Republic of Kosovo marketing website and achieved identical results [1]. The behavior remained consistent: the assistant produced a normal summary, then appended a spoofed alert with a clickable attacker link, demonstrating the widespread applicability of this vulnerability.

QR Codes Bypass Desktop Security Controls

Ahmeti also demonstrated how criminals could exploit this trust issue to pivot attacks from a victim's browser to their mobile device by displaying inline QR codes [1]. Because the chatgpt.com client auto-fetches and displays Markdown images, attackers can place QR codes in the assistant's output. When victims scan these QR codes with their phones, they are taken to content hosted in an attacker-controlled S3 bucket. This technique bypasses every desktop URL defense, including blocklists and password-manager domain checks, creating a particularly dangerous attack vector that evades enterprise security controls.

In a hypothetical attack scenario, a bad actor can append a small payload to any web page that the victim later prompts ChatGPT to summarize [2]. This can cause the chatbot to leak the user's IP address, User-Agent, and Referer details when attacker-hosted images embedded in the page are automatically fetched during answer rendering. The shift from email to the browser significantly expands the potential attack surface, as users no longer need to open malicious attachments or interact with suspicious messages.
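
Both the QR code pivot and the metadata leak depend on the client automatically fetching Markdown images from arbitrary hosts. One possible mitigation, sketched here under stated assumptions, is to replace any image whose host is not allowlisted with inert text before the response is rendered, so no request ever reaches the attacker's server. The function name `defang_images` and the host `cdn.example.com` are hypothetical, and nothing here reflects OpenAI's actual renderer.

```python
import re
from urllib.parse import urlparse

# Inline Markdown images of the form ![alt](url "optional title").
IMAGE_RE = re.compile(r'!\[([^\]]*)\]\(([^)\s]+)[^)]*\)')

# Hypothetical allowlist of hosts the client may fetch images from.
ALLOWED_IMAGE_HOSTS = {"cdn.example.com"}


def defang_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with a plain-text placeholder."""

    def replace(match: re.Match) -> str:
        host = (urlparse(match.group(2)).hostname or "").lower()
        if host in ALLOWED_IMAGE_HOSTS:
            return match.group(0)  # keep the image unchanged
        # Drop the alt text too, since it is also attacker-controlled.
        return "[image removed]"

    return IMAGE_RE.sub(replace, markdown)
```

Because the placeholder is plain text, the client has nothing to fetch, which addresses both the QR code display and the leak of IP address, User-Agent, and Referer. An allowlist is a design choice for this sketch; proxying images through a trusted server would be another option.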

OpenAI Response Remains Unclear

Ahmeti disclosed the security issue to OpenAI through Bugcrowd's disclosure program on April 29, submitting a revised report on May 1 [1]. The initial submission was marked as not reproducible, prompting Permiso to resubmit with additional detail. The revised report was then marked as a duplicate, though Ahmeti notes the issue Permiso reported and the supposed duplicate "had major differences." Despite reaching out to clarify those differences and request additional details, Permiso did not receive a response from OpenAI. At the time of publication, OpenAI had not confirmed whether a fix had been applied, and the company did not respond to media inquiries.

Broader Implications for AI Security

According to Ahmeti, AI systems increasingly render untrusted content directly inside browsers, which expands risk significantly [1]. The bigger issue is that AI products are starting to resemble browser or operating system environments, which creates a much larger security surface. As organizations increasingly use ChatGPT for research and summarization, this vulnerability means any malicious web page an employee asks the AI chatbot to process could contain a payload that transforms ChatGPT into a phishing surface.

Permiso warns that simply summarizing a page during normal browsing activity can introduce attacker-controlled instructions into the model context and ultimately into the rendered response. This represents a fundamental shift in how phishing attacks can be delivered. While there is no single fix to this problem, Ahmeti recommends strong sandboxing, rendering model-generated content in isolated environments, and strict filtering across Markdown, HTML, embeds, and previews. Most importantly, he advises organizations to treat AI-generated content as untrusted and assume prompt injection will happen.

Related AI Security Threats Emerge

The ChatGPhish disclosure coincides with revelations of additional attack techniques targeting AI systems. Adversa AI documented two attack methods codenamed SymJack and TrustFall that target AI coding assistants and agentic coding CLIs, allowing attackers to achieve remote code execution and full machine compromise [2]. SymJack tricks AI coding agents into copying a seemingly harmless file where the destination is a symlink pointing to the agent's own configuration, causing the attacker's payload to be written to the config. TrustFall achieves one-click remote code execution via a malicious repository that ships a configuration auto-approving and spawning an MCP server without explicit user approval. These emerging threats underscore the expanding attack surface as AI tools become more deeply integrated into everyday workflows.
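
The SymJack description suggests one defensive check an agent could run before any file copy: refuse destinations that are symlinks or that resolve outside the working directory. The sketch below is an assumption about what such a guard might look like, not Adversa AI's recommendation or any real agent's code. The name `safe_destination` is hypothetical, and the example assumes Python 3.9 or later on a POSIX-style filesystem.

```python
from pathlib import Path


def safe_destination(workspace: Path, destination: Path) -> Path:
    """Return the resolved destination, or raise if it is unsafe to write to.

    Rejects a destination that is itself a symlink, and any destination
    that resolves to a location outside the workspace directory.
    """
    root = workspace.resolve()
    candidate = destination if destination.is_absolute() else root / destination

    # A symlink as the final path component could redirect the write,
    # for example into the agent's own configuration file.
    if candidate.is_symlink():
        raise PermissionError(f"refusing to write through symlink: {candidate}")

    # Resolving follows symlinked parent directories and ".." segments.
    resolved = candidate.resolve()
    if not resolved.is_relative_to(root):
        raise PermissionError(f"destination escapes workspace: {resolved}")
    return resolved
```

A check like this is still open to a race between the check and the write, so a stricter design would open the file with flags that refuse to follow symlinks rather than check the path in advance.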

© 2026 TheOutpost.AI All rights reserved