4 Sources
[1]
AI-generated code could be a disaster for the software supply chain. Here's why.
AI-generated computer code is rife with references to non-existent third-party libraries, creating a golden opportunity for supply-chain attacks that poison legitimate programs with malicious packages that can steal data, plant backdoors, and carry out other nefarious actions, newly published research shows.

The study, which used 16 of the most widely used large language models to generate 576,000 code samples, found that 440,000 of the package dependencies they contained were "hallucinated," meaning they were non-existent. Open source models hallucinated the most, with 21 percent of the dependencies linking to non-existent libraries. A dependency is an essential code component that a separate piece of code requires to work properly. Dependencies save developers the hassle of rewriting code and are an essential part of the modern software supply chain.

Package hallucination flashbacks

These non-existent dependencies represent a threat to the software supply chain by exacerbating so-called dependency confusion attacks. These attacks work by causing a software package to access the wrong component dependency, for instance by publishing a malicious package and giving it the same name as the legitimate one but with a later version stamp. Software that depends on the package will, in some cases, choose the malicious version rather than the legitimate one because the former appears to be more recent.

Also known as package confusion, this form of attack was first demonstrated in 2021 in a proof-of-concept exploit that executed counterfeit code on networks belonging to some of the biggest companies on the planet, Apple, Microsoft, and Tesla included. It's one type of technique used in software supply-chain attacks, which aim to poison software at its very source, in an attempt to infect all users downstream.

"Once the attacker publishes a package under the hallucinated name, containing some malicious code, they rely on the model suggesting that name to unsuspecting users," Joseph Spracklen, a University of Texas at San Antonio Ph.D. student and lead researcher, told Ars via email. "If a user trusts the LLM's output and installs the package without carefully verifying it, the attacker's payload, hidden in the malicious package, would be executed on the user's system."

In AI, hallucinations occur when an LLM produces outputs that are factually incorrect, nonsensical, or completely unrelated to the task it was assigned. Hallucinations have long dogged LLMs because they degrade their usefulness and trustworthiness and have proven vexingly difficult to predict and remedy. In a paper scheduled to be presented at the 2025 USENIX Security Symposium, the researchers have dubbed the phenomenon "package hallucination."

For the study, the researchers ran 30 tests, 16 in the Python programming language and 14 in JavaScript, that generated 19,200 code samples per test, for a total of 576,000 code samples. Of the 2.23 million package references contained in those samples, 440,445, or 19.7 percent, pointed to packages that didn't exist. Among these 440,445 package hallucinations, 205,474 had unique package names. One of the things that makes package hallucinations potentially useful in supply-chain attacks is that 43 percent of package hallucinations were repeated over 10 queries.
"In addition," the researchers wrote, "58 percent of the time, a hallucinated package is repeated more than once in 10 iterations, which shows that the majority of hallucinations are not simply random errors, but a repeatable phenomenon that persists across multiple iterations. This is significant because a persistent hallucination is more valuable for malicious actors looking to exploit this vulnerability and makes the hallucination attack vector a more viable threat." In other words, many package hallucinations aren't random one-off errors. Rather, specific names of non-existent packages are repeated over and over. Attackers could seize on the pattern by identifying nonexistent packages that are repeatedly hallucinated. The attackers would then publish malware using those names and wait for them to be accessed by large numbers of developers. The study uncovered disparities in the LLMs and programming languages that produced the most package hallucinations. The average percentage of package hallucinations produced by open source LLMs such as CodeLlama and DeepSeek was nearly 22 percent, compared with a little more than 5 percent by commercial models. Code written in Python resulted in fewer hallucinations than JavaScript code, with an average of almost 16 percent compared with a little over 21 percent for JavaScript. Asked what caused the differences, Spracklen wrote: This is a difficult question because large language models are extraordinarily complex systems, making it hard to directly trace causality. That said, we observed a significant disparity between commercial models (such as the ChatGPT series) and open-source models, which is almost certainly attributable to the much larger parameter counts of the commercial variants. Most estimates suggest that ChatGPT models have at least 10 times more parameters than the open-source models we tested, though the exact architecture and training details remain proprietary. Interestingly, among open-source models, we did not find a clear link between model size and hallucination rate, likely because they all operate within a relatively smaller parameter range. Beyond model size, differences in training data, fine-tuning, instruction training, and safety tuning all likely play a role in package hallucination rate. These processes are intended to improve model usability and reduce certain types of errors, but they may have unforeseen downstream effects on phenomena like package hallucination. Similarly, the higher hallucination rate for JavaScript packages compared to Python is also difficult to attribute definitively. We speculate that it stems from the fact that JavaScript has roughly 10 times more packages in its ecosystem than Python, combined with a more complicated namespace. With a much larger and more complex package landscape, it becomes harder for models to accurately recall specific package names, leading to greater uncertainty in their internal predictions and, ultimately, a higher rate of hallucinated packages. The findings are the latest to demonstrate the inherent untrustworthiness of LLM output. With Microsoft CTO Kevin Scott predicting that 95 percent of code will be AI-generated within five years, here's hoping developers heed the message.
[2]
AI Code Hallucinations Increase the Risk of 'Package Confusion' Attacks
A new study found that code generated by AI is more likely to contain made-up information that can be used to trick software into interacting with malicious code.
[3]
Slopsquatting: The worrying AI hallucination bug that could be spreading malware
Software sabotage is rapidly becoming a potent new weapon in the cybercriminal arsenal, augmented by the rising popularity of AI coding. Instead of inserting malware into conventional code, criminals are now using AI-hallucinated software packages and library names to fool unwary programmers.

It works like this: AI models, especially the smaller ones, regularly hallucinate (make up) non-existent components while they're being used for coding. Malicious actors with coding skills study the hallucinated output from these AI models and then create malware with the same names. The next time an AI requests the fake package, malware is served instead of an error message. At this point, the damage is done, as the malware becomes an integrated part of the final code.

A recent research report, which evaluated 16 popular large language models used for code generation, unveiled a staggering 205,474 unique examples of hallucinated package names. These names are completely fictional, but can be used by cybercriminals as a way of inserting malware into Python and JavaScript software projects.

Perhaps unsurprisingly, the most common AI culprits for these sorts of package hallucinations are the smaller open-source models, which are used by professionals and homebrew vibe-coders (those who code via AI prompts) on their local computers rather than in the cloud. CodeLlama, Mistral 7B, and OpenChat 7B were some of the models that generated the most hallucinations. The worst model, CodeLlama 7B, delivered a whopping 25% hallucination rate when generating code in this way.

There is, of course, a long and storied history of inserting malware into everyday software products using what are known as supply chain attacks. This latest iteration follows on from the previous round of typosquatting, where misspellings of common terms are used to fool coders into utilizing bad code. Programmers who are on a deadline may mistakenly use libraries, packages, and tools that have been deliberately misspelled and contain a malicious payload. One of the early examples was a misspelled package called 'electorn', a twist on Electron, a popular application framework.

These attacks work because a large percentage of modern application programming involves downloading ready-made components to use in the project. These components, often known as dependencies, can be downloaded and installed with a single simple command, which makes it trivially easy for a cybercriminal to take advantage of a keyboard slip that requests the wrong name by mistake. Because the integrated malware is extremely subtle, it can go unnoticed in the final product or application. The end result, however, is the same: unwary users triggering malware without understanding or knowing what's under the hood of their application.

What has made the arrival of AI more problematic in this regard is the fact that AI coding tools can and will automatically request dependencies as part of their coding process. It may all sound a little random, because it is, but with the volume of coding that is now transitioning to the AI arena, this type of opportunistic attack is likely to rise significantly.

Security researchers are now focusing their attention on mitigating this kind of attack by improving the fine-tuning of models. New package verification tools are also coming onto the market, which can catch this type of hallucination before it enters the public arena. In the meantime, the message is: coders beware.
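As a rough illustration of how typo- and slop-squats can be caught before installation, the sketch below flags dependency names that are suspiciously close to, but not exactly, well-known package names, in the spirit of the 'electorn' example above. The allowlist is a tiny illustrative sample rather than a real registry, and production verification tools are considerably more sophisticated.

```python
# Sketch: flag dependency names that look like typo- or slop-squats of
# well-known packages. The allowlist here is a tiny illustrative sample,
# not a complete registry of legitimate names.
from difflib import SequenceMatcher

KNOWN_GOOD = {"electron", "requests", "numpy", "lodash", "express"}

def similarity(a: str, b: str) -> float:
    """Crude string similarity in [0, 1] between two package names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def flag_suspects(dependencies, threshold=0.85):
    """Return (dependency, lookalike) pairs that are near-misses of known names."""
    suspects = []
    for dep in dependencies:
        if dep in KNOWN_GOOD:
            continue                      # exact match to a known-good package
        for good in KNOWN_GOOD:
            if similarity(dep, good) >= threshold:
                suspects.append((dep, good))
    return suspects

# 'electorn' is the real-world typosquat mentioned above; the other names are made up.
print(flag_suspects(["electorn", "reqeusts", "left-pad-ng"]))
# -> [('electorn', 'electron'), ('reqeusts', 'requests')]
```

A hallucinated name that resembles nothing on the allowlist would sail straight past a check like this, which is why registry lookups and dependency audits are still needed alongside it.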
[4]
AI Hallucinations & Slopsquatting: A Caution for Blockchain Devs
AI Hallucinations and Slopsquatting: The Hidden Risk in Autocomplete Coding

One of the first topics that entered the mainstream artificial intelligence (AI) debate was AI hallucinations. These are plausible-sounding outputs that follow the expected form of an answer but are factually or logically incorrect. AI models can hallucinate when generating text, producing made-up statistics, misquotes, or fake sources that sound fluent and convincing. In autocomplete coding, hallucinations occur when the AI confidently suggests faulty code: in effect, fabrications that, in the worst cases, open the door to scams. AI autocomplete hallucinations may compile without errors and appear technically sound, but they introduce AI-generated bugs and vulnerabilities, especially in security-critical systems like smart contracts. This article discusses the growing concern behind faulty AI-generated code and why developer vigilance matters more than ever.

What Are AI Hallucinations in Autocomplete Coding?

AI hallucinations in autocomplete coding are outputs created by large language models (LLMs) when used for coding assistance. These outputs follow a logic based on pattern recognition, so they often look convincing but are wrong (for example, non-existent or incorrect package names). A recent case involved fake packages mimicking bitcoinlib, which were used to target crypto wallets through malicious Python libraries.

AI hallucinations happen because the model does not understand facts. It does not think. It follows statistical patterns from its training data to predict what comes next. As a result, it can generate a hallucination that reads as quite convincing. A hallucinated code snippet may resemble something users expect to see. It might refer to a function that does not exist, misuse an API, or create a logical contradiction. And because it looks polished, it can slip through reviews without anyone noticing.

Slopsquatting Explained: A New AI-Generated Threat

Slopsquatting, a form of typosquatting, is a deliberate attack strategy that exploits AI hallucinations generated by code completion tools. Here is how the attack works: an attacker observes which package names a model repeatedly hallucinates, publishes malicious packages under those exact names on public registries, and then waits for the model to suggest the names to developers who install them without checking.

Real-World Examples of AI-Caused Coding Bugs

When hallucinations make it into production code, they do not just cause errors; they open the door to full-blown security failures. These are not theoretical risks; they have already happened. A 2025 study found that code LLMs suggested over 200,000 fake packages, with open-source models hallucinating at rates four times higher than commercial ones. The fake bitcoinlib look-alikes mentioned above show how quickly such names can be weaponized.

Why Vibe Coding Poses Risks To Blockchain Security

Vibe coding is an emerging approach to software development that leverages AI to generate code from natural language inputs, enabling developers to focus on high-level design and intent rather than low-level implementation details. It rewards confidence over correctness. If blockchain developers under pressure accept AI-suggested code that feels familiar, even when it lacks context, accuracy, or safety, they become easy victims of this threat. The devil is in the details when it comes to AI hallucinations and slopsquatting; for blockchain developers, the risks include hallucinated imports, slop-squatted dependencies, and flawed contract logic that cannot easily be corrected once deployed on-chain.

Best Practices To Prevent AI-Generated Coding Vulnerabilities

Best practices to avoid damage from AI hallucinations and slopsquatting attacks include enforcing strict validation of every dependency, writing precise prompts, relying only on verified and audited libraries, and keeping human review in the loop (a minimal code sketch of the allowlist idea appears after the conclusion below). AI cannot replace developers, but it can be used for support.
This support must come with better training data, stricter safeguards, and tools aiming to detect hallucinations before they become threats. As models evolve, security must scale with them. The future of secure coding lies in human oversight, smarter AI tuning, regulation, and shared responsibility across development teams, model providers, and open-source communities.

Conclusion

AI-generated code can significantly accelerate blockchain development, but it also introduces serious security risks. Hallucinated imports, slop-squatted packages, and flawed logic aren't theoretical; they're appearing in real-world smart contract projects. Recent research shows that open-source language models hallucinate at alarmingly high rates, producing thousands of fake packages that closely mimic trusted libraries. In the context of blockchain, where immutability and on-chain execution leave little room for error, these risks are amplified. Autocomplete coding may feel like a time-saver, but it's quickly becoming a security blind spot. To build securely with AI tools, developers must enforce strict validations, write precise prompts, and depend only on verified, audited libraries. AI can assist, but secure smart contracts still require vigilant human oversight.
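One concrete way to apply the "verified, audited libraries" advice is a pre-merge check that rejects any import not on an explicit, human-reviewed allowlist. The sketch below is a minimal version of that idea; the allowlist, the sample snippet, and the bitcoinlib_tools name are all hypothetical, not drawn from the incidents described above.

```python
# Sketch of a lightweight pre-merge check: every top-level import in a source
# file must appear in an explicit, human-reviewed allowlist (e.g. the project's
# audited dependency list). The allowlist and sample code below are illustrative.
import ast
import sys

ALLOWED = {"json", "hashlib", "requests"}   # stdlib plus audited third-party packages

def top_level_imports(source: str) -> set[str]:
    """Collect the root module name of every import statement in `source`."""
    roots = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            roots.add(node.module.split(".")[0])
    return roots

# Hypothetical LLM-generated snippet that sneaks in an unreviewed dependency.
sample = "import requests\nfrom bitcoinlib_tools import wallets\n"

unknown = top_level_imports(sample) - ALLOWED
if unknown:
    print(f"Unreviewed imports, reject or audit first: {sorted(unknown)}")
    sys.exit(1)
```

Wired into CI, a check like this forces a human decision before any new, and possibly hallucinated, dependency enters the codebase.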
Recent research reveals that AI-generated code frequently contains references to non-existent packages, potentially opening doors for malicious actors to exploit these "hallucinations" in supply chain attacks.
Recent research has uncovered a significant security risk in AI-generated code, one that could compromise the software supply chain. A study involving 16 widely used large language models (LLMs) revealed that AI-generated code frequently contains references to non-existent third-party libraries, a phenomenon dubbed "package hallucination" [1]. The study, which analyzed 576,000 code samples, found that roughly 440,000 package dependencies were hallucinated, spanning 205,474 unique package names [2]. Open-source models were particularly prone to this issue, with 21% of their dependencies linking to non-existent libraries [1].

These hallucinated dependencies exacerbate the risk of "dependency confusion" or "package confusion" attacks, in which malicious actors publish malware under the hallucinated names and wait for it to be pulled into legitimate projects [3]. Alarmingly, 43% of package hallucinations were repeated across 10 queries, and 58% of hallucinated packages appeared more than once in 10 iterations [2]. This persistence makes the vulnerability far more exploitable, because attackers can predict which fake names will resurface.

The study also revealed clear disparities in hallucination rates: open-source models hallucinated nearly 22% of packages on average, versus a little over 5% for commercial models [1]; JavaScript code produced more hallucinations (just over 21%) than Python code (almost 16%) [1]; and the worst offender, CodeLlama 7B, hallucinated in roughly 25% of cases [3].

This vulnerability has given rise to a new attack vector called "slopsquatting," in which attackers create malware packages with names matching the AI-hallucinated ones [4]. The technique builds on the existing threat of typosquatting, where attackers exploit common misspellings of popular packages. The blockchain and cryptocurrency sectors are particularly exposed: recent incidents have involved fake packages mimicking legitimate libraries such as bitcoinlib to target crypto wallets through malicious Python code [4].

To combat these risks, experts recommend enforcing strict dependency validation, writing precise prompts, relying only on verified and audited libraries, and keeping human reviewers in the loop [4]. As AI continues to play a larger role in software development, addressing these security concerns becomes crucial to maintaining the integrity of the software supply chain.
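As a minimal illustration of the dependency validation recommended above, the sketch below asks PyPI's JSON API whether each package name suggested by an LLM actually exists before anything is installed. Existence alone is not proof of safety, since a slopsquatter may already have registered a hallucinated name, and the suggested-package list here is hypothetical.

```python
# Sketch: verify that dependencies suggested by an LLM actually exist on PyPI
# before installing them. Package names in the list below are hypothetical examples.
import urllib.error
import urllib.request

def package_exists_on_pypi(name: str) -> bool:
    """Return True if PyPI's JSON API knows about `name`, False on a 404."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:          # unknown package: a possible hallucination
            return False
        raise                        # other failures (rate limits, outages) need a human look

# Dependencies copied from an LLM suggestion (names are illustrative, not from the study).
suggested = ["requests", "fastjson-parse-utils", "bitcoinlib-extra"]

for name in suggested:
    status = "found" if package_exists_on_pypi(name) else "NOT FOUND, do not install"
    print(f"{name}: {status}")
```

Equivalent lookups against the npm registry cover the JavaScript side, and in either ecosystem a hit should prompt a review of the package's age, maintainers, and download history rather than an automatic install.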
Summarized by Navi