Curated by THEOUTPOST
On Sun, 13 Apr, 12:01 AM UTC
5 Sources
[1]
AI threats in software development revealed
UTSA researchers recently completed one of the most comprehensive studies to date on the risks of using AI models to develop software. In a new paper, they demonstrate how a specific type of error could pose a serious threat to programmers who use AI to help write code.

Joe Spracklen, a UTSA doctoral student in computer science, led the study on how large language models (LLMs) frequently generate insecure code. His team's paper has been accepted for publication at the USENIX Security Symposium 2025, a premier cybersecurity and privacy conference. The multi-institutional collaboration featured three additional researchers from UTSA: doctoral student A.H.M. Nazmus Sakib, postdoctoral researcher Raveen Wijewickrama, and Associate Professor Dr. Murtuza Jadliwala, director of the SPriTELab (Security, Privacy, Trust, and Ethics in Computing Research Lab). Additional collaborators were Anindya Maita from the University of Oklahoma (a former UTSA postdoctoral researcher) and Bimal Viswanath from Virginia Tech.

Hallucinations in LLMs occur when the model produces content that is factually incorrect, nonsensical or completely unrelated to the input task. Most research to date has focused on hallucinations in classical natural language generation and prediction tasks such as machine translation, summarization and conversational AI. The research team focused on the phenomenon of package hallucination, which occurs when an LLM generates or recommends the use of a third-party software library that does not actually exist.

What makes package hallucinations a fascinating area of research is how something so simple -- a single, everyday command -- can lead to serious security risks. "It doesn't take a convoluted set of circumstances or some obscure thing to happen," Spracklen said. "It's just typing in one command that most people who work in those programming languages type every day. That's all it takes. It's very direct and very simple."

"It's also ubiquitous," he added. "You can do very little with your basic Python coding language. It would take you a long time to write the code yourself, so it is universal to rely on open-source software to extend the capabilities of your programming language to accomplish specific tasks."

LLMs are becoming increasingly popular among developers, who use the AI models to assist in assembling programs. According to the study, up to 97% of software developers incorporate generative AI into their workflow, and 30% of code written today is AI-generated. Additionally, many popular programming languages rely on centralized package repositories, such as PyPI for Python and npm for JavaScript. Because the repositories are often open source, bad actors can upload malicious code disguised as legitimate packages. For years, attackers have employed various tricks to get users to install their malware. Package hallucinations are the latest tactic.

"So, let's say I ask ChatGPT to help write some code for me and it writes it. Now, let's say in the generated code it includes a link to some package, and I trust it and run the code, but the package does not exist, it's some hallucinated package. An astute adversary/hacker could see this behavior (of the LLM) and realize that the LLM is telling people to use this non-existent package, this hallucinated package. The adversary can then just trivially create a new package with the same name as the hallucinated package (being recommended by the LLM) and inject some bad code in it.
Now, next time the LLM recommends the same package in the generated code and an unsuspecting user executes the code, this malicious package is now downloaded and executed on the user's machine," Jadliwala explained.

The UTSA researchers evaluated the occurrence of package hallucinations across different programming languages, settings and parameters, exploring the likelihood of erroneous package recommendations and identifying root causes. Across 30 different tests, 440,445 of the 2.23 million code samples the researchers generated in Python and JavaScript using LLMs referenced hallucinated packages. Of the LLMs tested, "GPT-series models were found four times less likely to generate hallucinated packages compared to open-source models, with a 5.2% hallucination rate compared to 21.7%," the study stated. Python code was less susceptible to hallucinations than JavaScript, researchers found.

These attacks often involve naming a malicious package to mimic a legitimate one, a tactic known as a package confusion attack. In a package hallucination attack, an unsuspecting LLM user would be recommended the package in their generated code and, trusting the LLM, would download the adversary-created malicious package, resulting in a compromise. The insidious element of this vulnerability is that it exploits growing trust in LLMs. As they continue to get more proficient in coding tasks, users will be more likely to blindly trust their output and potentially fall victim to this attack.

"If you code a lot, it's not hard to see how this happens. We talked to a lot of people and almost everyone says they've noticed a package hallucination happen to them while they're coding, but they never considered how it could be used maliciously," Spracklen explained. "You're placing a lot of implicit trust on the package publisher that the code they've shared is legitimate and not malicious. But every time you download a package, you're downloading potentially malicious code and giving it complete access to your machine."

While cross-referencing generated packages with a master list may help mitigate hallucinations, UTSA researchers said the best solution is to address the problem at the foundation of the LLMs themselves, during model development. The team has disclosed its findings to model providers including OpenAI, Meta, DeepSeek and Mistral AI.
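The cross-referencing mitigation the researchers mention can be approximated with very little tooling. The sketch below is illustrative rather than anything proposed in the paper: it checks each package name pulled from an AI-generated snippet against PyPI's public JSON API, which returns HTTP 404 for projects that do not exist. The `suggested_packages` list is a hypothetical stand-in for names extracted from model output.

```python
import urllib.request
import urllib.error

def exists_on_pypi(package_name: str) -> bool:
    """Return True if the package is registered on PyPI.

    PyPI's JSON API answers 200 for known projects and 404 for unknown
    ones, which is enough for a quick existence check.
    """
    url = f"https://pypi.org/pypi/{package_name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other failures (rate limits, outages) should not pass silently

# Hypothetical names pulled from an LLM-generated snippet.
suggested_packages = ["requests", "flask-json-orm", "numpy"]

for name in suggested_packages:
    status = "exists" if exists_on_pypi(name) else "NOT FOUND -- possible hallucination"
    print(f"{name}: {status}")
```

Note that existence alone is not a safety signal: a slopsquatted package would pass this check, so the lookup only flags names that have not yet been registered by anyone.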
[2]
AI code suggestions sabotage software supply chain
The rise of AI-powered code generation tools is reshaping how developers write software - and introducing new risks to the software supply chain in the process.

AI coding assistants, like large language models in general, have a habit of hallucinating. They suggest code that incorporates software packages that don't exist.

As we noted in March and September last year, security and academic researchers have found that AI code assistants invent package names. In a recent study, researchers found that about 5.2 percent of package suggestions from commercial models didn't exist, compared to 21.7 percent from open source models.

Running that code should result in an error when importing a non-existent package. But miscreants have realized that they can hijack the hallucination for their own benefit. All that's required is to create a malicious software package under a hallucinated package name and then upload the bad package to a package registry or index like PyPI or npm for distribution. Thereafter, when an AI code assistant re-hallucinates the co-opted name, the process of installing dependencies and executing the code will run the malware.

The recurrence appears to follow a bimodal pattern - some hallucinated names show up repeatedly when prompts are re-run, while others vanish entirely - suggesting certain prompts reliably produce the same phantom packages. As noted by security firm Socket recently, the academic researchers who explored the subject last year found that re-running the same hallucination-triggering prompt ten times resulted in 43 percent of hallucinated packages being repeated every time and 39 percent never reappearing.

Exploiting hallucinated package names represents a form of typosquatting, where variations or misspellings of common terms are used to dupe people. Seth Michael Larson, security developer-in-residence at the Python Software Foundation, has dubbed it "slopsquatting" - "slop" being a common pejorative for AI model output.

"We're in the very early days looking at this problem from an ecosystem level," Larson told The Register. "It's difficult, and likely impossible, to quantify how many attempted installs are happening because of LLM hallucinations without more transparency from LLM providers. Users of LLM generated code, packages, and information should be double-checking LLM outputs against reality before putting any of that information into operation, otherwise there can be real-world consequences."

Larson said that there are many reasons a developer might attempt to install a package that doesn't exist, including mistyping the package name, incorrectly installing internal packages without checking to see whether those names already exist in a public index (dependency confusion), differences in the package name and the module name, and so on.

"We're seeing a real shift in how developers write code," Feross Aboukhadijeh, CEO of security firm Socket, told The Register. "With AI tools becoming the default assistant for many, 'vibe coding' is happening constantly. Developers prompt the AI, copy the suggestion, and move on. Or worse, the AI agent just goes ahead and installs the recommended packages itself. The problem is, these code suggestions often include hallucinated package names that sound real but don't exist. I've seen this firsthand.
You paste it into your terminal and the install fails - or worse, it doesn't fail, because someone has slop-squatted that exact package name."

Aboukhadijeh said these fake packages can look very convincing. "When we investigate, we sometimes find realistic looking READMEs, fake GitHub repos, even sketchy blogs that make the package seem authentic," he said, adding that Socket's security scans will catch these packages because they analyze the way the code works.

"Even worse, when you Google one of these slop-squatted package names, you'll often get an AI-generated summary from Google itself confidently praising the package, saying it's useful, stable, well-maintained. But it's just parroting the package's own README, no skepticism, no context. To a developer in a rush, it gives a false sense of legitimacy.

"What a world we live in: AI hallucinated packages are validated and rubber-stamped by another AI that is too eager to be helpful."

Aboukhadijeh pointed to an incident in January in which Google's AI Overview, which responds to search queries with AI-generated text, suggested a malicious npm package @async-mutex/mutex, which was typosquatting the legitimate package async-mutex. He also noted that recently a threat actor using the name "_Iain" published a playbook on a dark web forum detailing how to build a blockchain-based botnet using malicious npm packages.

Aboukhadijeh explained that _Iain "automated the creation of thousands of typo-squatted packages (many targeting crypto libraries) and even used ChatGPT to generate realistic-sounding variants of real package names at scale. He shared video tutorials walking others through the process, from publishing the packages to executing payloads on infected machines via a GUI. It's a clear example of how attackers are weaponizing AI to accelerate software supply chain attacks."

Larson said the Python Software Foundation is working constantly to make package abuse more difficult, adding that such work takes time and resources.

"Alpha-Omega has sponsored the work of Mike Fiedler, our PyPI Safety & Security Engineer, to work on reducing the risks of malware on PyPI such as by implementing a programmatic API to report malware, partnering with existing malware reporting teams, and implementing better detections for typo-squatting of top projects," he said.

"Users of PyPI and package managers in general should be checking that the package they are installing is an existing well-known package, that there are no typos in the name, and that the content of the package has been reviewed before installation. Even better, organizations can mirror a subset of PyPI within their own organizations to have much more control over which packages are available for developers." ®
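Larson's suggestion of mirroring a vetted subset of PyPI amounts to an allow-list: a dependency is accepted only if it appears in a reviewed internal inventory. The following sketch is one minimal way to express that idea; the `approved_packages.txt` file and the crude requirements parsing are assumptions for illustration, not anything prescribed by the PSF.

```python
from pathlib import Path

def load_allow_list(path: str) -> set[str]:
    """One approved package name per line; blank lines and comments ignored."""
    names = set()
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line.lower())
    return names

def check_requirements(requirements_path: str, allow_list: set[str]) -> list[str]:
    """Return requirement names that are not on the internal allow-list."""
    rejected = []
    for line in Path(requirements_path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Crude parse: keep only the project name before extras/specifiers/markers.
        name = (
            line.split(";")[0]
                .split("==")[0]
                .split(">=")[0]
                .split("<")[0]
                .split("[")[0]
                .strip()
                .lower()
        )
        if name not in allow_list:
            rejected.append(name)
    return rejected

if __name__ == "__main__":
    allowed = load_allow_list("approved_packages.txt")  # hypothetical, maintained by the org
    bad = check_requirements("requirements.txt", allowed)
    if bad:
        raise SystemExit(f"Unapproved dependencies: {', '.join(sorted(bad))}")
    print("All dependencies are on the approved list.")
```

In practice, organizations often get the same effect by pointing pip at an internal index with `--index-url`, so developers simply cannot install anything the mirror does not carry.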
[3]
LLMs can't stop making up software dependencies and sabotaging everything
The rise of LLM-powered code generation tools is reshaping how developers write software - and introducing new risks to the software supply chain in the process.

These AI coding assistants, like large language models in general, have a habit of hallucinating. They suggest code that incorporates software packages that don't exist.

As we noted in March and September last year, security and academic researchers have found that AI code assistants invent package names. In a recent study, researchers found that about 5.2 percent of package suggestions from commercial models didn't exist, compared to 21.7 percent from open source or openly available models.

Running that code should result in an error when importing a non-existent package. But miscreants have realized that they can hijack the hallucination for their own benefit. All that's required is to create a malicious software package under a hallucinated package name and then upload the bad package to a package registry or index like PyPI or npm for distribution. Thereafter, when an AI code assistant re-hallucinates the co-opted name, the process of installing dependencies and executing the code will run the malware.

The recurrence appears to follow a bimodal pattern - some hallucinated names show up repeatedly when prompts are re-run, while others vanish entirely - suggesting certain prompts reliably produce the same phantom packages. As noted by security firm Socket recently, the academic researchers who explored the subject last year found that re-running the same hallucination-triggering prompt ten times resulted in 43 percent of hallucinated packages being repeated every time and 39 percent never reappearing.

Exploiting hallucinated package names represents a form of typosquatting, where variations or misspellings of common terms are used to dupe people. Seth Michael Larson, security developer-in-residence at the Python Software Foundation, has dubbed it "slopsquatting" - "slop" being a common pejorative for AI model output.

"We're in the very early days looking at this problem from an ecosystem level," Larson told The Register. "It's difficult, and likely impossible, to quantify how many attempted installs are happening because of LLM hallucinations without more transparency from LLM providers. Users of LLM generated code, packages, and information should be double-checking LLM outputs against reality before putting any of that information into operation, otherwise there can be real-world consequences."

Larson said that there are many reasons a developer might attempt to install a package that doesn't exist, including mistyping the package name, incorrectly installing internal packages without checking to see whether those names already exist in a public index (dependency confusion), differences in the package name and the module name, and so on.

"We're seeing a real shift in how developers write code," Feross Aboukhadijeh, CEO of security firm Socket, told The Register. "With AI tools becoming the default assistant for many, 'vibe coding' is happening constantly. Developers prompt the AI, copy the suggestion, and move on. Or worse, the AI agent just goes ahead and installs the recommended packages itself. The problem is, these code suggestions often include hallucinated package names that sound real but don't exist. I've seen this firsthand.
You paste it into your terminal and the install fails - or worse, it doesn't fail, because someone has slop-squatted that exact package name."

Aboukhadijeh said these fake packages can look very convincing. "When we investigate, we sometimes find realistic looking READMEs, fake GitHub repos, even sketchy blogs that make the package seem authentic," he said, adding that Socket's security scans will catch these packages because they analyze the way the code works.

"Even worse, when you Google one of these slop-squatted package names, you'll often get an AI-generated summary from Google itself confidently praising the package, saying it's useful, stable, well-maintained. But it's just parroting the package's own README, no skepticism, no context. To a developer in a rush, it gives a false sense of legitimacy.

"What a world we live in: AI hallucinated packages are validated and rubber-stamped by another AI that is too eager to be helpful."

Aboukhadijeh pointed to an incident in January in which Google's AI Overview, which responds to search queries with AI-generated text, suggested a malicious npm package @async-mutex/mutex, which was typosquatting the legitimate package async-mutex. He also noted that recently a threat actor using the name "_Iain" published a playbook on a dark web forum detailing how to build a blockchain-based botnet using malicious npm packages.

Aboukhadijeh explained that _Iain "automated the creation of thousands of typo-squatted packages (many targeting crypto libraries) and even used ChatGPT to generate realistic-sounding variants of real package names at scale. He shared video tutorials walking others through the process, from publishing the packages to executing payloads on infected machines via a GUI. It's a clear example of how attackers are weaponizing AI to accelerate software supply chain attacks."

Larson said the Python Software Foundation is working constantly to make package abuse more difficult, adding that such work takes time and resources.

"Alpha-Omega has sponsored the work of Mike Fiedler, our PyPI Safety & Security Engineer, to work on reducing the risks of malware on PyPI such as by implementing a programmatic API to report malware, partnering with existing malware reporting teams, and implementing better detections for typo-squatting of top projects," he said.

"Users of PyPI and package managers in general should be checking that the package they are installing is an existing well-known package, that there are no typos in the name, and that the content of the package has been reviewed before installation. Even better, organizations can mirror a subset of PyPI within their own organizations to have much more control over which packages are available for developers." ®
[4]
AI-hallucinated code dependencies become new supply chain risk
A new class of supply chain attacks named 'slopsquatting' has emerged from the increased use of generative AI tools for coding and the models' tendency to "hallucinate" non-existent package names.

The term slopsquatting was coined by security researcher Seth Larson as a spin on typosquatting, an attack method that tricks developers into installing malicious packages by using names that closely resemble popular libraries. Unlike typosquatting, slopsquatting doesn't rely on misspellings. Instead, threat actors could create malicious packages on indexes like PyPI and npm named after ones commonly made up by AI models in coding examples.

A research paper about package hallucinations published in March 2025 demonstrates that in roughly 20% of the examined cases (576,000 generated Python and JavaScript code samples), the recommended packages didn't exist. The situation is worse with open-source LLMs like CodeLlama, DeepSeek, WizardCoder, and Mistral, but commercial tools like ChatGPT-4 still hallucinated at a rate of about 5%, which is significant.

While the number of unique hallucinated package names logged in the study was large, surpassing 200,000, 43% of those were consistently repeated across similar prompts, and 58% reappeared at least once within ten runs. The study showed that 38% of these hallucinated package names appeared inspired by real packages, 13% were the result of typos, and the remaining 51% were completely fabricated.

Although there are no signs that attackers have started taking advantage of this new type of attack, researchers from open source security company Socket warn that hallucinated package names are common, repeatable, and semantically plausible, creating a predictable attack surface that could be easily weaponized.

"Overall, 58% of hallucinated packages were repeated more than once across ten runs, indicating that a majority of hallucinations are not just random noise, but repeatable artifacts of how the models respond to certain prompts," the Socket researchers explain. "That repeatability increases their value to attackers, making it easier to identify viable slopsquatting targets by observing just a small number of model outputs."

The only way to mitigate this risk is to verify package names manually and never assume a package mentioned in an AI-generated code snippet is real or safe. Using dependency scanners, lockfiles, and hash verification to pin packages to known, trusted versions is an effective way to improve security.

The research has shown that lowering AI "temperature" settings (less randomness) reduces hallucinations, so if you're into AI-assisted or vibe coding, this is an important factor to consider. Ultimately, it is prudent to always test AI-generated code in a safe, isolated environment before running or deploying it in production environments.
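The hash-verification advice can be illustrated with a short, assumption-laden sketch: before installing a downloaded artifact, compare its SHA-256 digest against a value pinned when the dependency was first vetted. The `pinned_hashes` mapping and file path here are hypothetical; real projects would normally rely on pip's `--require-hashes` mode or lockfile-aware tooling rather than hand-rolled checks.

```python
import hashlib
from pathlib import Path

# Hypothetical pins recorded when each dependency was reviewed.
# The digest below is a placeholder, not a real hash.
pinned_hashes = {
    "example_pkg-1.2.0-py3-none-any.whl":
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
}

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(artifact: Path) -> None:
    """Refuse to proceed if the artifact is unknown or its hash does not match."""
    expected = pinned_hashes.get(artifact.name)
    if expected is None:
        raise SystemExit(f"{artifact.name}: no pinned hash on record, refusing to install")
    actual = sha256_of(artifact)
    if actual != expected:
        raise SystemExit(f"{artifact.name}: hash mismatch (expected {expected}, got {actual})")
    print(f"{artifact.name}: hash OK")

if __name__ == "__main__":
    verify(Path("downloads/example_pkg-1.2.0-py3-none-any.whl"))  # hypothetical path
```

The same idea is built into pip: a requirements file can pin each dependency with `--hash=sha256:...` entries and be installed with `pip install --require-hashes -r requirements.txt`, which rejects anything whose digest does not match.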
[5]
"Slopsquatting" attacks are using AI-hallucinated names resembling popular libraries to spread malware
Security researchers have warned of a new method by which generative AI (GenAI) can be abused in cybercrime, known as 'slopsquatting'.

It starts with the fact that different GenAI tools, such as ChatGPT, Copilot, and others, hallucinate. In the context of AI, "hallucination" is when the AI simply makes things up. It can make up a quote that a person never said, an event that never happened, or - in software development - an open-source software package that was never created.

Now, according to Sarah Gooding from Socket, many software developers rely heavily on GenAI when writing code. The tool could write the lines itself, or it could suggest different packages for the developer to download and include in the product.

The report adds that the AI doesn't always hallucinate a different name or a different package - some things repeat. "When re-running the same hallucination-triggering prompt ten times, 43% of hallucinated packages were repeated every time, while 39% never reappeared at all," it says. "Overall, 58% of hallucinated packages were repeated more than once across ten runs, indicating that a majority of hallucinations are not just random noise, but repeatable artifacts of how the models respond to certain prompts."

This is purely theoretical at this point, but apparently, cybercriminals could map out the different packages AI is hallucinating and register them on open-source platforms. Therefore, when a developer gets a suggestion and visits GitHub, PyPI, or a similar registry, they will find the package and happily install it, without knowing that it's malicious.

Luckily enough, there are no confirmed cases of slopsquatting in the wild at press time, but it's safe to say it is only a matter of time. Given that the hallucinated names can be mapped out, we can assume security researchers will discover them eventually. The best way to protect against these attacks is to be careful when accepting suggestions from anyone, living or otherwise.
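The repeatability figures quoted above are what make hallucinated names mappable at all, for defenders as much as for attackers. A rough sketch of how one might measure it: re-run the same prompt several times, collect the package names each response suggests, and count how often each name recurs. The `suggest_packages` function below is a hypothetical stand-in for whatever LLM call and name-extraction step a team already has in place.

```python
from collections import Counter

def suggest_packages(prompt: str) -> list[str]:
    """Hypothetical placeholder: call your LLM, extract the package names
    the generated code imports or installs, and return them."""
    raise NotImplementedError

def recurrence_profile(prompt: str, runs: int = 10) -> Counter:
    """Count, for each suggested name, how many of the re-runs it appears in."""
    counts: Counter = Counter()
    for _ in range(runs):
        counts.update(set(suggest_packages(prompt)))  # set() -> one vote per run
    return counts

# Usage sketch: names that show up in most or all runs are stable
# hallucination candidates worth checking against the package registry.
# profile = recurrence_profile("Write a Flask app that resizes images", runs=10)
# stable = [name for name, n in profile.items() if n >= 8]
```

Names that recur across runs and do not exist in the registry are exactly the ones worth watching, since they are the most attractive slopsquatting targets described in the reports above.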
Researchers uncover a significant security risk in AI-assisted coding: 'package hallucinations' where AI models suggest non-existent software packages, potentially leading to a new type of supply chain attack called 'slopsquatting'.
Researchers from the University of Texas at San Antonio (UTSA) have uncovered a significant security vulnerability in AI-assisted software development. Their study, accepted for publication at the USENIX Security Symposium 2025, reveals that large language models (LLMs) frequently generate insecure code, particularly through a phenomenon known as "package hallucination" 1.
Package hallucinations occur when an AI model recommends or generates code that includes non-existent third-party software libraries. This seemingly simple error can lead to serious security risks, as explained by Joe Spracklen, the lead researcher:
"It doesn't take a convoluted set of circumstances... It's just typing in one command that most people who work in those programming languages type every day. That's all it takes." 1
The study's findings are alarming:
- Across 30 tests, 440,445 of the 2.23 million Python and JavaScript code samples the team generated referenced hallucinated packages 1.
- GPT-series models hallucinated packages at a rate of 5.2%, roughly four times less often than open-source models, which reached 21.7% 1.
- Python code proved less susceptible to package hallucinations than JavaScript 1.
This vulnerability has given rise to a new type of supply chain attack dubbed "slopsquatting" by Seth Michael Larson of the Python Software Foundation 2. Malicious actors can exploit these hallucinations by:
- Observing which non-existent package names an LLM repeatedly recommends
- Registering malicious packages under those names on public registries such as PyPI or npm
- Waiting for the LLM to recommend the same name again, so that unsuspecting developers download and run the malicious code 2
The threat is not merely theoretical. Feross Aboukhadijeh, CEO of security firm Socket, warns:
"With AI tools becoming the default assistant for many, 'vibe coding' is happening constantly. Developers prompt the AI, copy the suggestion, and move on. Or worse, the AI agent just goes ahead and installs the recommended packages itself." 2
The problem is further exacerbated by search engines. When developers search for these hallucinated package names, they may encounter AI-generated summaries that lend false legitimacy to the non-existent or malicious packages 2.
To combat this emerging threat, experts recommend:
- Verifying that every package referenced in AI-generated code actually exists and is well established before installing it
- Using dependency scanners, lockfiles, and hash verification to pin packages to known, trusted versions
- Lowering model "temperature" settings and testing AI-generated code in isolated environments before deployment
- Mirroring a vetted subset of PyPI internally so organizations control which packages developers can install 2
The Python Software Foundation and other organizations are working to make package abuse more difficult, but this requires time and resources 2. Meanwhile, the UTSA researchers have disclosed their findings to major AI model providers including OpenAI, Meta, DeepSeek, and Mistral AI 1.
As AI continues to reshape software development practices, the industry must remain vigilant against these new forms of supply chain attacks. The challenge lies in balancing the productivity gains of AI-assisted coding with robust security measures to protect against increasingly sophisticated threats.
References
[1] AI threats in software development revealed
[2] AI code suggestions sabotage software supply chain
[3] LLMs can't stop making up software dependencies and sabotaging everything
[4] AI-hallucinated code dependencies become new supply chain risk