Curated by THEOUTPOST
On Wed, 25 Sept, 4:04 PM UTC
2 Sources
[1]
ChatGPT macOS Flaw Could've Enabled Long-Term Spyware via Memory Function
A now-patched security vulnerability in OpenAI's ChatGPT app for macOS could have made it possible for attackers to plant long-term persistent spyware into the artificial intelligence (AI) tool's memory. The technique, dubbed SpAIware, could be abused to facilitate "continuous data exfiltration of any information the user typed or responses received by ChatGPT, including any future chat sessions," security researcher Johann Rehberger said. The issue, at its core, abuses a feature called memory, which OpenAI introduced earlier this February before rolling it out to ChatGPT Free, Plus, Team, and Enterprise users at the start of the month. What it does is essentially allow ChatGPT to remember certain things across chats so that it saves users the effort of repeating the same information over and over again. Users also have the option to instruct the program to forget something. "ChatGPT's memories evolve with your interactions and aren't linked to specific conversations," OpenAI says. "Deleting a chat doesn't erase its memories; you must delete the memory itself." The attack technique also builds on prior findings that involve using indirect prompt injection to manipulate memories so as to remember false information, or even malicious instructions, achieving a form of persistence that survives between conversations. "Since the malicious instructions are stored in ChatGPT's memory, all new conversation going forward will contain the attackers instructions and continuously send all chat conversation messages, and replies, to the attacker," Rehberger said. "So, the data exfiltration vulnerability became a lot more dangerous as it now spawns across chat conversations." In a hypothetical attack scenario, a user could be tricked into visiting a malicious site or downloading a booby-trapped document that's subsequently analyzed using ChatGPT to update the memory. The website or the document could contain instructions to clandestinely send all future conversations to an adversary-controlled server going forward, which can then be retrieved by the attacker on the other end beyond a single chat session. Following responsible disclosure, OpenAI has addressed the issue with ChatGPT version 1.2024.247 by closing out the exfiltration vector. "ChatGPT users should regularly review the memories the system stores about them, for suspicious or incorrect ones and clean them up," Rehberger said. "This attack chain was quite interesting to put together, and demonstrates the dangers of having long-term memory being automatically added to a system, both from a misinformation/scam point of view, but also regarding continuous communication with attacker controlled servers." The disclosure comes as a group of academics has uncovered a novel AI jailbreaking technique codenamed MathPrompt that exploits large language models' (LLMs) advanced capabilities in symbolic mathematics to get around their safety mechanisms. "MathPrompt employs a two-step process: first, transforming harmful natural language prompts into symbolic mathematics problems, and then presenting these mathematically encoded prompts to a target LLM," the researchers pointed out. The study, upon testing against 13 state-of-the-art LLMs, found that the models respond with harmful output 73.6% of the time on average when presented with mathematically encoded prompts, as opposed to approximately 1% with unmodified harmful prompts. It also follows Microsoft's debut of a new Correction capability that, as the name implies, allows for the correction of AI outputs when inaccuracies (i.e., hallucinations) are detected. "Building on our existing Groundedness Detection feature, this groundbreaking capability allows Azure AI Content Safety to both identify and correct hallucinations in real-time before users of generative AI applications encounter them," the tech giant said.
[2]
False memories planted in ChatGPT give hacker persistent exfiltration channel
Emails, documents, and other untrusted content can plant malicious memories. When security researcher Johann Rehberger recently reported a vulnerability in ChatGPT that allowed attackers to store false information and malicious instructions in a user's long-term memory settings, OpenAI summarily closed the inquiry, labeling the flaw a safety issue, not, technically speaking, a security concern. So Rehberger did what all good researchers do: He created a proof-of-concept exploit that used the vulnerability to exfiltrate all user input in perpetuity. OpenAI engineers took notice and issued a partial fix earlier this month. Strolling down memory lane Further ReadingThe vulnerability abused long-term conversation memory, a feature OpenAI began testing in February and made more broadly available in September. Memory with ChatGPT stores information from previous conversations and uses it as context in all future conversations. That way, the LLM can be aware of details such as a user's age, gender, philosophical beliefs, and pretty much anything else, so those details don't have to be inputted during each conversation. Within three months of the rollout, Rehberger found that memories could be created and permanently stored through indirect prompt injection, an AI exploit that causes an LLM to follow instructions from untrusted content such as emails, blog posts, or documents. The researcher demonstrated how he could trick ChatGPT into believing a targeted user was 102 years old, lived in the Matrix, and insisted Earth was flat and the LLM would incorporate that information to steer all future conversations. These false memories could be planted by storing files in Google Drive or Microsoft OneDrive, uploading images, or browsing a site like Bing -- all of which could be created by a malicious attacker. Rehberger privately reported the finding to OpenAI in May. That same month, the company closed the report ticket. A month later, the researcher submitted a new disclosure statement. This time, he included a PoC that caused the ChatGPT app for macOS to send a verbatim copy of all user input and ChatGPT output to a server of his choice. All a target needed to do was instruct the LLM to view a web link that hosted a malicious image. From then on, all input and output to and from ChatGPT was sent to the attacker's website. "What is really interesting is this is memory-persistent now," Rehberger said in the above video demo. "The prompt injection inserted a memory into ChatGPT's long-term storage. When you start a new conversation, it actually is still exfiltrating the data." The attack isn't possible through the ChatGPT web interface, thanks to an API OpenAI rolled out last year. While OpenAI has introduced a fix that prevents memories from being abused as an exfiltration vector, the researcher said, untrusted content can still perform prompt injections that cause the memory tool to store long-term information planted by a malicious attacker. LLM users who want to prevent this form of attack should pay close attention during sessions for output that indicates a new memory has been added. They should also regularly review stored memories for anything that may have been planted by untrusted sources. OpenAI provides guidance here for managing the memory tool and specific memories stored in it. Company representatives didn't respond to an email asking about its efforts to prevent other hacks that plant false memories.
Share
Share
Copy Link
A critical vulnerability in ChatGPT's macOS app could have allowed hackers to plant false memories, enabling long-term data exfiltration. The flaw, now patched, highlights the importance of AI security.
Security researchers have uncovered a significant vulnerability in the ChatGPT macOS application that could have allowed malicious actors to plant false memories in the AI model, potentially leading to long-term data exfiltration 1. This discovery highlights the growing concerns surrounding AI security and the potential risks associated with widely-used language models.
The flaw, identified in ChatGPT's macOS app, could have enabled attackers to manipulate the AI's memory, creating a persistent exfiltration channel 2. By exploiting this vulnerability, hackers could potentially:
This technique, known as "prompt injection," allows attackers to influence the AI's responses and extract data without direct access to the underlying systems.
The discovery of this vulnerability raises significant concerns about the security of AI models and their potential misuse. As AI systems become more integrated into various applications and services, the need for robust security measures becomes increasingly critical. This incident serves as a wake-up call for developers and organizations utilizing AI technologies to prioritize security in their implementations.
Upon being notified of the vulnerability, OpenAI, the company behind ChatGPT, promptly addressed the issue. They released a patch to fix the flaw, demonstrating their commitment to maintaining the security and integrity of their AI model 1.
This incident underscores the importance of:
As AI technology continues to advance, it is crucial for developers and researchers to anticipate and address potential security vulnerabilities proactively.
While the vulnerability has been patched, this incident serves as a reminder for users to:
As AI becomes more prevalent in our daily lives, user awareness and education about AI security will play an increasingly important role in maintaining overall cybersecurity.
Reference
[1]
OpenAI reports multiple instances of ChatGPT being used by cybercriminals to create malware, conduct phishing attacks, and attempt to influence elections. The company has disrupted over 20 such operations in 2024.
15 Sources
15 Sources
A security researcher has uncovered a vulnerability in ChatGPT's crawler that could potentially be exploited for DDoS attacks and prompt injection, raising concerns about AI security and OpenAI's response to the issue.
4 Sources
4 Sources
Security researchers have developed a new attack method called 'Imprompter' that can secretly instruct AI chatbots to gather and transmit users' personal information to attackers, raising concerns about the security of AI systems.
3 Sources
3 Sources
OpenAI introduces a significant memory upgrade for ChatGPT, allowing it to reference past conversations and provide more personalized responses, raising both excitement and privacy concerns.
14 Sources
14 Sources
OpenAI's ChatGPT Search feature is found vulnerable to manipulation through hidden text and prompt injections, raising concerns about the reliability of AI-powered web searches.
2 Sources
2 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved