AI-Generated Code Hallucinations: A New Threat to Software Supply Chain Security

Recent research reveals that AI-generated code frequently contains references to non-existent packages, potentially opening the door for malicious actors to exploit these "hallucinations" in supply chain attacks.

AI-Generated Code Hallucinations: A Growing Concern

Recent research has uncovered a significant security risk in AI-generated code, potentially compromising the software supply chain. A study involving 16 widely used large language models (LLMs) revealed that AI-generated code frequently contains references to non-existent third-party libraries, a phenomenon dubbed "package hallucination" [1].

The Scale of the Problem

The study, which analyzed 576,000 generated code samples, found that roughly 440,000 of the package dependencies they contained were hallucinated, spanning 205,474 unique package names [2]. Open-source models were particularly prone to the problem, with 21% of their dependencies pointing to non-existent libraries [1].

Implications for Software Security

These hallucinated dependencies exacerbate the risk of "dependency confusion" or "package confusion" attacks. In such attacks, malicious actors can exploit these non-existent package references by publishing malware under the hallucinated names [3].
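
To make the mechanics concrete, here is a minimal sketch of the developer's side of such an attack, assuming Python and pip. The package name fastjson_utils is invented for illustration and is not a real library: today the import fails harmlessly, but it would silently start executing attacker-controlled code the moment someone registers that name on a public index.

```python
import importlib

# "fastjson_utils" is a hypothetical hallucinated name used purely for
# illustration -- it is not a real library.
SUGGESTED_PACKAGE = "fastjson_utils"

try:
    # An AI assistant suggested code that imports this package.
    importlib.import_module(SUGGESTED_PACKAGE)
    print(f"{SUGGESTED_PACKAGE} imported -- is it the library you think it is?")
except ModuleNotFoundError:
    # Today this branch runs: `pip install fastjson_utils` would fail
    # because no such project exists. Once an attacker registers the
    # name on PyPI, the install succeeds and the import above runs
    # attacker-controlled code instead of raising this error.
    print(f"{SUGGESTED_PACKAGE} is unclaimed -- exactly the kind of name "
          "an attacker can register")
```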

The Persistence of Hallucinations

Alarmingly, the hallucinations were persistent: when a hallucination-generating prompt was run 10 times, the same hallucinated package recurred more than once in 58% of cases, and in all 10 runs in 43% of cases [2]. This persistence makes the vulnerability easier to exploit, since an attacker who spots a hallucinated name can register it knowing the model is likely to suggest it again.
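
As a rough illustration of how such persistence can be measured, the sketch below performs the tally the researchers describe: collect the hallucinated names from ten repetitions of one prompt and count how many recur. All names and run data here are invented for the example.

```python
from collections import Counter

# Invented example data: hallucinated package names observed in 10
# repeated runs of the same code-generation prompt.
runs = [
    {"fastjson_utils", "pyauthx"},
    {"fastjson_utils"},
    {"fastjson_utils", "quickhash3"},
    set(),
    {"fastjson_utils"},
    {"pyauthx"},
    {"fastjson_utils"},
    set(),
    {"fastjson_utils", "pyauthx"},
    {"fastjson_utils"},
]

# Count the number of runs in which each hallucinated name appeared.
counts = Counter(name for run in runs for name in run)
repeated = {name: n for name, n in counts.items() if n > 1}

print(f"{len(repeated) / len(counts):.0%} of hallucinated names recurred "
      f"across {len(runs)} runs")
print(repeated)  # {'fastjson_utils': 7, 'pyauthx': 3}
```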

Variations Across Models and Languages

The study revealed disparities in hallucination rates:

  1. Open-source models averaged a hallucination rate of nearly 22%, compared to about 5% for commercial models [1].
  2. JavaScript code resulted in more hallucinations (21%) than Python code (16%) [1].
  3. Models like CodeLlama, Mistral 7B, and OpenChat 7B were among the worst offenders [3].

The Concept of "Slopsquatting"

This vulnerability has given rise to a new attack vector called "slopsquatting," where attackers create malware packages with names matching the AI-hallucinated ones [4]. The technique builds on the existing threat of typosquatting, where attackers exploit common misspellings of legitimate package names.

Implications for Blockchain and Cryptocurrency

The blockchain and cryptocurrency sectors are particularly exposed to these threats. Recent incidents have involved malicious Python packages mimicking legitimate libraries such as bitcoinlib in order to target crypto wallets [4].

Mitigation Strategies

To combat these risks, experts recommend:

  1. Improved fine-tuning of AI models
  2. Development of package verification tools (a minimal sketch follows this list)
  3. Increased vigilance among developers
  4. Strict validation processes
  5. Use of precise prompts when working with AI coding assistants
  6. Reliance on verified and audited libraries [4]
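
One building block for such verification, sketched below for the Python ecosystem, is to check every AI-suggested dependency against PyPI's public JSON API, which returns HTTP 404 for unregistered project names. The package name fastjson_utils is again an invented example.

```python
import json
import urllib.error
import urllib.request

def exists_on_pypi(name: str) -> bool:
    """Return True if `name` is a registered project on PyPI."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            json.load(resp)  # parses the project's metadata
        return True
    except urllib.error.HTTPError as err:
        if err.code == 404:  # name is unclaimed on the index
            return False
        raise  # other errors: fail loudly rather than guess

# "requests" is a real project; "fastjson_utils" is invented here.
for candidate in ["requests", "fastjson_utils"]:
    verdict = "registered" if exists_on_pypi(candidate) else "UNREGISTERED"
    print(f"{candidate}: {verdict}")
```

Existence alone is not proof of safety: a slopsquatted package would pass this check as soon as it is published, so the result is best combined with metadata review (release history, maintainers, download counts) and the verified-library policies recommended above.
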
As AI continues to play a larger role in software development, addressing these security concerns becomes crucial to maintaining the integrity of the software supply chain.
