AI Hallucinations Infiltrate Research Papers as 146,000 Fake References Enter Scientific Record

Reviewed by Nidhi Govil



A crisis in academic integrity is unfolding as AI hallucinations contaminate scientific literature at an alarming rate. Over 146,000 fabricated references generated by AI tools entered research papers in 2025 alone, with the vast majority bypassing peer review. The surge has sparked heated debate after arXiv announced bans for authors who fail to verify AI-generated citations, prompting backlash from academics who argue the policy is too strict.

AI Hallucinations Threaten Academic Integrity at Unprecedented Scale

Artificial intelligence tools have become deeply embedded in academic workflows, but their widespread adoption is creating a mounting crisis in scientific publishing. Over 146,000 AI-hallucinated references infiltrated research papers in 2025 alone, according to a large-scale study analyzing 111 million citations across 2.5 million papers published between 2020 and 2025 on platforms including arXiv, bioRxiv, SSRN, and PubMed Central [3]. The research, conducted by teams from Cornell University, UCLA, and UC Berkeley, reveals that fabricated citations are not only entering the permanent record but are doing so at rates that have surged more than 12-fold since 2023.

Source: ET


The contamination extends far beyond preprint servers. A separate audit published in The Lancet examined 2.5 million biomedical papers and 97 million citations, identifying more than 4,000 fabricated references buried across nearly 3,000 papers [2]. The rate of fabricated references in biomedical literature climbed dramatically from one in 2,828 papers in 2023 to one in 458 by 2025, and further to one in 277 papers by early 2026. In one striking example, a 2025 paper in an open-access oncology journal on ureteroileal surgical techniques contained 18 fabricated references among the 30 citations checked, a 60% contamination rate [3].
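The scale of that climb is easier to see as a ratio. The short Python sketch below only reworks the figures quoted above; the names `papers_per_case` and `fold_increase` are illustrative and do not come from either study.

```python
from fractions import Fraction

# Figures quoted in the article: one case per N biomedical papers.
papers_per_case = {"2023": 2828, "2025": 458, "early 2026": 277}


def fold_increase(baseline_n: int, later_n: int) -> float:
    """How many times more frequent 'one in later_n' is than 'one in baseline_n'."""
    return float(Fraction(1, later_n) / Fraction(1, baseline_n))


baseline = papers_per_case["2023"]
for period, n in papers_per_case.items():
    ratio = fold_increase(baseline, n)
    print(f"{period}: 1 in {n} papers, {ratio:.1f}x the 2023 rate")

# The oncology example: 18 fabricated references among 30 checked.
contamination = 18 / 30
print(f"contamination rate: {contamination:.0%}")
```

By this arithmetic the 2025 rate is roughly 6.2 times the 2023 rate, and the early 2026 rate roughly 10.2 times.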

Existing Safeguards Fail to Stop AI-Generated Inaccuracies

The integrity of scientific records is under direct threat as existing peer review mechanisms prove inadequate against AI in research. About 78.8% of fake citations passed arXiv moderation, while 85.3% of AI-hallucinated references in bioRxiv preprints remained in the final published versions that appeared in PubMed Central-indexed journals [3]. This failure of peer review to catch fabricated references in biomedical papers raises urgent questions about quality control in scientific literature.

Source: Futurism


Columbia University associate professor Topaz, who leads a team developing AI applications in healthcare, discovered the problem firsthand when an AI tool silently inserted a fabricated source into his work. "I felt deeply embarrassed," Topaz told Fortune. "I'm an AI researcher. I know about hallucinations. If this is happening to me, an AI expert, what happens to other people?" [2] His investigation revealed that the steepest increase in hallucinated citations began around mid-2024, roughly 18 months after ChatGPT's public release, as AI tools evolved into citation-generation engines.

ArXiv Policy Sparks Academic Meltdown Over Responsibility

The open-access research repository arXiv announced it would ban authors for up to a year if hallucinated references appear in their work, triggering fierce backlash from researchers [1]. Computer science chair Thomas Dietterich explained the rationale: "if a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper" [1].

The policy clarifies that authors remain ultimately responsible for work published under their names, but numerous researchers immediately went on the offensive. Smith College economics professor James Miller questioned whether authors should verify every citation, particularly those in unfamiliar languages or technical areas. Luca Ambrogioni, assistant professor in AI at the Donders Institute, called the policy "way too strict," arguing that "errors can slip in when using any tools" [1]. Former Stanford neuroscientist Neal Amin characterized the move as an "overreaction" and "gatekeeping."

Clinical Guidelines and Treatment Decisions at Risk

The stakes extend beyond academic reputation into patient care. Medicine builds on itself through an evidence chain: clinical trials cite earlier studies, systematic reviews aggregate those trials, and clinical guidelines cite those reviews. Doctors and nurses rely on these guidelines when deciding how to treat patients. "If you put the fictional study at the bottom of the stack, the whole structure inherits it," Topaz explained [2]. The contamination has already reached systematic reviews informing clinical guidelines, compromising evidence-based treatment decisions.

Source: Fortune


Research shows the problem disproportionately affects certain groups. Authors linked to AI-hallucinated references tend to be less experienced, yet by 2025 their publication output had grown 3.13 times faster on SSRN and more than twice as fast on bioRxiv compared with matched peers [3]. Solo researchers and smaller teams are overrepresented among those publishing fabricated citations.

Self-Reinforcing Cycle Threatens Future AI Tools in Academic Settings

Experts warn the contamination could become self-perpetuating. As fabricated references embed themselves in open-access repositories and citation databases, future AI models trained on those datasets may absorb and reproduce the same hallucinations [3]. The phenomenon isn't limited to obscure papers: when hallucinated references point to real scientists, they favor prominent scholars with 68.8% more prior publications and 58.3% more citations than average.

The crisis extends beyond academia. Author Steven Rosenbaum made headlines after The New York Times identified numerous inaccurate quotes in his book "The Future of Truth: How AI Reshapes Reality," apparently generated by AI tools he disclosed using [2]. Surveys indicate over 80% of physicians now use AI professionally, a share that has more than doubled since 2023, while one study found 36% of papers in an American medical journal contained at least some AI-generated text.

Researchers are calling for automated reference verification systems to be implemented before papers are accepted for publication. Nearly 98% of affected papers had faced no publisher action at the time of The Lancet audit [3]. By August 2025, hallucinated citation rates had climbed to nearly 2% in SSRN papers, 0.4% in arXiv, 0.3% in PubMed Central, and 0.2% in bioRxiv, with monthly fake citation estimates reaching 8,140 in PubMed Central alone.
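What the automated reference verification researchers are calling for might look like in miniature can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any system named in the studies: `KNOWN_TITLES` is a hypothetical stand-in for a trusted bibliographic index, every title in it is a placeholder rather than a real paper, and the 0.9 similarity threshold is arbitrary.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stand-in for a trusted bibliographic index. A real system would
# query a database of published works; these titles are placeholders.
KNOWN_TITLES = [
    "Placeholder study of treatment A for condition B",
    "Placeholder review of method C in population D",
]


def normalize(title: str) -> str:
    """Lowercase and collapse punctuation and spacing into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def best_match(title: str, index: list[str]) -> float:
    """Highest similarity (0.0 to 1.0) between the title and any indexed title."""
    target = normalize(title)
    scores = (SequenceMatcher(None, target, normalize(k)).ratio() for k in index)
    return max(scores, default=0.0)


def flag_unverified(references: list[str], index: list[str],
                    threshold: float = 0.9) -> list[str]:
    """Return the references with no sufficiently close match in the index."""
    return [r for r in references if best_match(r, index) < threshold]


submitted = [
    "Placeholder Study of Treatment A for Condition B.",  # matches once normalized
    "Entirely made-up outcomes of procedure E",           # no counterpart in the index
]
print(flag_unverified(submitted, KNOWN_TITLES))
```

Fuzzy title matching tolerates formatting differences but cannot confirm that a matched work actually supports the claim it is cited for, so a check like this would flag candidates for human review rather than replace it.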


© 2026 TheOutpost.AI All rights reserved