4 Sources
[1]
Can researchers stop AI making up citations?
Artificial intelligence (AI) models are known to confidently conjure up fake citations. When the company OpenAI released GPT-5, a suite of large language models (LLMs), last month, it said it had reduced the frequency of fake citations and other kinds of 'hallucination', as well as 'deceptions', whereby an AI claims to have performed a task it hasn't.

With GPT-5, OpenAI, based in San Francisco, California, is bucking an industry-wide trend, because newer AI models designed to mimic human reasoning tend to generate more hallucinations than do their predecessors. On a benchmark that tests a model's ability to produce citation-based responses, GPT-5 beat its predecessors. But hallucinations remain inevitable, because of the way LLMs function.

"For most cases of hallucination, the rate has dropped to a level" that seems to be "acceptable to users", says Tianyang Xu, an AI researcher at Purdue University in West Lafayette, Indiana. But in particularly technical fields, such as law and mathematics, GPT-5 is still likely to struggle, she says. And despite the improvements in hallucination rate, users quickly found that the model errs in basic tasks, such as creating an illustrated timeline of US presidents.

OpenAI is making "small steps that are good, but I don't think we're anywhere near where we need to be", says Mark Steyvers, a cognitive science and AI researcher at the University of California, Irvine. "It's not frequent enough that GPT says 'I don't know'."

Hallucinations are a result of the fundamental way in which LLMs work. As statistical machines, the models make predictions by generalizing on the basis of learnt associations, which leads them to produce answers that are plausible but sometimes wrong. Another issue is that, much as a student scores points for guessing on a multiple-choice exam, LLMs are rewarded during training for having a go rather than for acknowledging their uncertainty, according to a preprint published by OpenAI on 4 September.

Improvements have come from scaling up LLMs, in terms of both the richness of their internal associations and the amount of data they are trained on, says Xu. But hallucinations are particularly prevalent on topics for which the model has scant training data or its underlying information is wrong, she says. Hallucinations can also happen when an AI tries to summarize or analyse papers that are too long for the model to process.

Eliminating hallucinations entirely is likely to prove impossible, says Mushtaq Bilal, a researcher at Silvi, a Copenhagen-based firm that makes an AI app to aid the creation of systematic reviews in science. "I think if it was possible, AI labs would have done it already." But reducing errors and getting a model to admit that it doesn't know an answer have been "a pretty heavy focus" for OpenAI, says Saachi Jain, who manages the firm's AI safety team.

According to technical documents released with GPT-5, OpenAI concentrated on "training our models to browse effectively for up-to-date information", as well as on cutting hallucinations. The firm focused on reducing hallucinations in lengthy, open-ended responses to queries, because this best represents real-life use of ChatGPT, says Jain. In one literature-review benchmark known as ScholarQA-CS, GPT-5 "performs well" when it is allowed to access the web, says Akari Asai, an AI researcher at the Allen Institute for Artificial Intelligence, based in Seattle, Washington, who ran the tests for Nature.
In producing answers to open-ended computer-science questions, for example, the model performed marginally better than human experts, with a correctness score of 55% (based on measures such as how well its statements are supported by citations) compared with 54% for scientists, but just behind a version of the institute's own LLM-based system for literature review, OpenScholar, which achieved 57%.

However, GPT-5 suffered when the model was unable to get online, says Asai. The ability to cross-check with academic databases is a key feature of most AI-powered systems designed to help with literature reviews. Without internet access, GPT-5 fabricated or muddled half as many citations as one of its predecessors, GPT-4o, did, but it still got them wrong 39% of the time, she says.

On the LongFact benchmark, which tests accuracy in long-form responses to prompts, OpenAI reported that GPT-5 hallucinated 0.8% of claims in responses about people or places when it was allowed to browse the web, compared with 5.1% for OpenAI's reasoning model o3. Performance dropped when browsing was not permitted, with GPT-5's error rate climbing to 1.4%, compared with 7.9% for o3. Both models performed worse than did the non-reasoning model GPT-4o, which had an error rate of 1.1% when offline. On other independent evaluations, such as the Hughes Hallucination Evaluation Model, which is run by the AI platform Vectara in Palo Alto, California, and looks at how often an LLM makes false claims when summarizing a document, rival models such as Google's Gemini 2.0 slightly outperformed GPT-5, although both erred less than 1.5% of the time.

OpenAI also reported that the model was more honest in its responses than the company's previous models were. When given a coding task that was impossible to complete, for example owing to a lack of access to necessary hardware, GPT-5 claimed to have done the task 17% of the time, compared with 47% for o3. Although Jain wouldn't give details of the firm's methods, she hinted that, in later stages of the model's training, OpenAI worked on rewarding it for answering honestly. This same stage of training might previously have pushed in the opposite direction, increasing model dishonesty: OpenAI said that models can learn to be "overconfident" and to "cheat" during training using a common technique that incentivizes models to respond in ways that please human assessors by appearing to be helpful.

Researchers are exploring ways to get LLMs to reveal how confident a model is that an answer is factually correct, but it is not yet clear whether models can accurately gauge this statistical probability. Such efforts are "a very active workstream" for OpenAI, says Jain.

Realistic tests of how humans interact with models, and make decisions on the basis of the information given, are important metrics that are missing from OpenAI's evaluations, says Steyvers. People often take AI outputs at face value, despite warnings not to do so, because LLMs are made to produce confident, lengthy responses, hallmarks that humans associate with real expertise, he says. "Even I am persuaded by it, and I know about these biases," he adds.

Other researchers are focusing more on the part individuals play in managing their use of LLMs. There is a trade-off between hallucinations and the models' "enormous" labour-saving potential when used appropriately, says Bilal. "This is an issue of building new types of intuitions for an AI age," he says.
[2]
Why AI Keeps Making Stuff Up -- And How to Fix It - Decrypt
Users can fight back. Ask for sources, frame prompts tightly, and use factuality settings to cut down on false answers.

Why does GPT sometimes hallucinate like a tech bro on an ayahuasca bender? According to a new OpenAI research paper, Why Language Models Hallucinate, the root of hallucinations isn't a mysterious glitch but a structural feature of how these systems are optimized. Simply put, LLMs would rather lie than admit they don't know an answer.

LLMs learn by predicting the most likely next word, given mountains of training text. In most settings, that means sounding fluent matters more than being right. The benchmarks we use to measure progress often reward confident guessing more than honest refusal. In other words: the system has been shaped to produce polished answers, even if they're wrong.

Think of it like an exam graded on partial credit. If you can't leave a question blank without losing points, you'll guess, even wildly, just to stay in the game. LLMs operate under the same logic. A "sorry, I don't know" gets punished by the math of optimization, while an incorrect but confident answer can still score high.

That statistical bias, the OpenAI researchers note, makes hallucinations provably unavoidable in general-purpose systems. No finite training set can capture the entire truth of the world, so the model will always face gaps. And when it does, it fills them with plausible-sounding invention. That's why hallucinations persist across versions, providers, and training methods. The problem isn't that models are failing at their job. The problem is that their job, as currently defined, rewards a kind of fluent dishonesty.

OpenAI's researchers argue the fix doesn't require reinventing the architecture; it just means changing the rules of the game. Their proposed tweak is blunt but potentially powerful: give your chatbot permission to admit it doesn't know the answer. Since models are trained to maximize points for plausible answers, the idea is to impose a new rule: only answer if you're at least 90% confident; otherwise say "I don't know." Theoretically, that shifts the math, making the model's safest play to admit uncertainty rather than bluff.

But there's a catch: current LLMs don't have an internal "confidence meter" calibrated in percentages. So when you say "90% confident," the model treats it as a stylistic instruction to be cautious, not a real statistical threshold. It may refuse more often, but it's not actually measuring probability. Still, you could get better results.

The researchers offered a more formal version: "One could append a statement like the following to each question: Answer only if you are > t confident, since mistakes are penalized t/(1 - t) points, while correct answers receive 1 point, and an answer of 'I don't know' receives 0 points. There are several natural values of t including t = 0.5 (penalty 1), t = 0.75 (penalty 2), and t = 0.9 (penalty 9). A threshold of t = 0 corresponds to binary grading and could be described by, e.g., 'Make your best guess even if you are unsure, as if you were taking an exam.'"

For users, the takeaway is straightforward: when you have the option, turn on settings that encourage refusals or uncertainty. Some systems already let you adjust "temperature" (controlling creativity) or enable "strict factuality" modes. The closer we get to models actually being trained under these rules, the more you'll see AI confidently stop short instead of confidently lying.
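To see why that scoring rule discourages bluffing, it helps to run the arithmetic. The short Python sketch below is our own illustration, not OpenAI's code: under a penalty of t/(1 - t) points per mistake, answering only beats "I don't know" in expectation when the model's chance of being right exceeds t.

```python
# Scoring rule quoted above: a correct answer earns 1 point, "I don't know"
# earns 0, and a mistake costs t / (1 - t) points.

def expected_score(p_correct: float, t: float) -> float:
    """Expected score for answering, given probability p_correct of being right."""
    penalty = t / (1 - t)
    return p_correct * 1.0 - (1 - p_correct) * penalty

for t in (0.5, 0.75, 0.9):
    for p in (0.3, 0.6, 0.8, 0.95):
        decision = "answer" if expected_score(p, t) > 0 else "say 'I don't know'"
        print(f"t={t:.2f}, confidence={p:.2f}: "
              f"expected score {expected_score(p, t):+.2f} -> {decision}")
```

At t = 0.9, for instance, an answer the model is only 80% sure of loses a full point on average, so refusing becomes the rational move.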
Until training catches up, the burden often falls on users. Here are five ways to tame hallucinations right now:

1. Ask for sources every time. Don't take a model's word at face value; demand citations or links. If it can't provide them, or they don't check out, assume the answer's shaky. Think of it like Wikipedia: useful, but only if you follow the footnotes.

2. Frame your questions tightly. Models wander when prompts are vague. If you need facts, specify the scope ("list three peer-reviewed studies published after 2020 on X") rather than asking open-endedly ("tell me about X"). Guardrails in your question translate to guardrails in the answer (see the sketch after this list).

3. Cross-check with another system. Run the same question through a different model or search engine. If three tools agree, you're safer. If one spits out an outlier, that's likely a hallucination.

4. Watch for overconfidence. The telltale sign of a hallucination isn't hedging; it's swagger. If an answer reads too polished, with fabricated detail and zero uncertainty, double-check it. A model that sounds more certain than your tax accountant is probably bluffing.

5. Trust, but verify. Don't cut and paste model output straight into code, contracts, or medical notes. Treat it as a draft or starting point, not gospel. The safest users are the skeptical ones, the ones who never forget the model's first job is fluency, not truth.
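As a rough illustration of tips 2 and 3, here is a minimal Python sketch; the helper names, the prompt template, and the placeholder answers are all invented for illustration rather than drawn from any real API. It builds a narrowly scoped prompt and flags a result as unverified unless a majority of the tools consulted agree.

```python
from collections import Counter

def tight_prompt(topic: str, n_studies: int = 3, after_year: int = 2020) -> str:
    """Build a narrowly scoped prompt instead of an open-ended 'tell me about X'."""
    return (
        f"List {n_studies} peer-reviewed studies published after {after_year} on {topic}. "
        "For each, give the title, authors, year, and a link. "
        "If you are not confident a study exists, say 'I don't know' instead of guessing."
    )

def cross_check(answers: dict) -> str:
    """Treat an answer as trustworthy only if a strict majority of tools agree."""
    counts = Counter(a.strip().lower() for a in answers.values())
    top_answer, top_count = counts.most_common(1)[0]
    if top_count >= (len(answers) // 2) + 1:
        return f"Majority answer ({top_count}/{len(answers)} tools): {top_answer}"
    return "No majority: treat every answer as unverified and check the sources yourself."

# Placeholder responses standing in for real model or search-engine outputs.
print(tight_prompt("transformer model hallucination rates"))
print(cross_check({"model_a": "Smith et al. 2022",
                   "model_b": "Smith et al. 2022",
                   "search":  "No such study found"}))
```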
[3]
AI Hallucinates: Why Your AI Assistant Might Be Lying & How to Stop It
What if the AI assistant you rely on for critical information suddenly gave you a confidently wrong answer? Imagine asking it for the latest medical guidelines or legal advice, only to receive a fabricated response delivered with unwavering certainty. This unsettling phenomenon, known as AI hallucination, isn't just a rare glitch; it's a systemic issue baked into how AI models are trained and evaluated. Despite their impressive capabilities, these systems often prioritize sounding confident over being accurate, leaving users vulnerable to misinformation. The good news? Understanding why AI hallucinates is the first step toward fixing it.

In this how-to, Prompt Engineering explores the root causes of AI hallucinations and uncovers practical strategies to minimize them. You'll learn how the design of training datasets, evaluation metrics, and reward systems inadvertently encourages models to guess rather than admit uncertainty. More importantly, we'll discuss actionable solutions, such as fostering uncertainty-aware responses and rethinking how we measure AI performance. Whether you're an AI developer, a curious tech enthusiast, or someone who simply wants more reliable tools, this guide will equip you with insights to navigate, and perhaps even reshape, the future of AI. After all, building trustworthy systems isn't just about fixing errors; it's about redefining what we expect from intelligent machines.

AI hallucinations occur when a language model produces outputs that are factually incorrect but delivered with high confidence. This phenomenon is deeply rooted in the training process. Language models are designed to predict the next word or phrase based on patterns in large datasets. However, this predictive approach often encourages confident guessing, even in the absence of adequate information. For example, when faced with an unanswerable question, a model might fabricate an answer rather than admit uncertainty. This behavior is reinforced by evaluation systems that reward accuracy without sufficiently penalizing confident errors. As a result, the model learns to prioritize appearing correct over being cautious or transparent about its limitations.

The training of language models relies on vast datasets that include both accurate and inaccurate information. During this process, the model's success is measured by how closely its predictions align with expected outputs. However, this approach has significant flaws. Current reward functions often fail to differentiate between confident errors and honest expressions of uncertainty, inadvertently encouraging the former. To address this, training reward functions must evolve. Penalizing confident errors more heavily while rewarding models for abstaining when uncertain can foster a more nuanced understanding of their limitations. For instance, a model that responds with "I don't know" when faced with ambiguous input should be rewarded for its honesty rather than penalized for not guessing.

Accuracy remains the dominant metric for evaluating language models, but it has notable shortcomings. While straightforward, accuracy-based evaluations fail to consider the context in which answers are generated. This creates an incentive for models to guess, even when the correct answer is uncertain or unknowable. Scoreboards and benchmarks, which rank models based on accuracy, further exacerbate this issue.
To reduce hallucinations, evaluation systems must prioritize uncertainty-aware responses. Metrics that reward abstention or penalize confident guessing can encourage models to adopt a more cautious and reliable approach.

Research from leading organizations like OpenAI highlights that hallucinations are not random glitches but predictable outcomes of current training and evaluation practices. Interestingly, smaller models often demonstrate better awareness of their limitations than larger models, which tend to exhibit overconfidence. This finding suggests that simply increasing model size is not a viable solution to the hallucination problem. Moreover, achieving perfect accuracy is unrealistic. Certain questions, such as those about future events or speculative scenarios, are inherently unanswerable. Recognizing these limitations and designing systems that acknowledge uncertainty is essential for reducing hallucinations and improving the reliability of AI outputs.

Several strategies can help address AI hallucinations. By shifting the focus from accuracy-driven metrics to uncertainty-aware evaluations (see the sketch below), developers can encourage models to produce more reliable outputs. For example, a model that admits uncertainty about a complex scientific question demonstrates greater reliability than one that fabricates an answer with unwarranted confidence.

Despite the potential of these strategies, challenges persist. Accuracy-based metrics continue to dominate the field, making it difficult to implement widespread changes. Additionally, while hallucinations can be reduced, they cannot be entirely eliminated. Some level of error is inevitable, owing to the complexity of language and the limitations of current AI technologies. Adopting new evaluation metrics and training paradigms also requires collaboration across the AI research community. Without broad consensus, progress in reducing hallucinations may be slow. Furthermore, balancing cautious responses against user satisfaction remains a complex issue: users often expect AI systems to provide definitive answers, even when uncertainty is unavoidable.

AI hallucinations are a direct consequence of how language models are trained and evaluated. To mitigate these errors, the AI community must move beyond accuracy-driven evaluations and adopt mechanisms that reward acknowledgment of uncertainty and discourage confident guessing. By rethinking training reward functions and updating evaluation benchmarks, developers can create models that are not only more accurate but also more transparent about their limitations. While challenges remain, these changes represent a critical step toward building trustworthy AI systems. As the field evolves, fostering collaboration and innovation will be essential to ensure that AI technologies continue to improve in reliability and utility.
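To make uncertainty-aware scoring concrete, the minimal Python sketch below (our own example with made-up answer sets, not an existing benchmark) grades two hypothetical models under an accuracy-only rubric and under one that gives no credit for "I don't know" but subtracts a point for each confident error. The bluffing model wins on raw accuracy yet falls behind once confident errors are penalized.

```python
# Hypothetical per-question outcomes: "correct", "wrong", or "idk" (abstained).
bluffing_model = ["correct"] * 7 + ["wrong"] * 3   # always answers, sometimes wrongly
honest_model   = ["correct"] * 6 + ["idk"] * 4     # abstains when unsure

def accuracy_only(outcomes):
    """Today's dominant rubric: fraction of questions answered correctly."""
    return outcomes.count("correct") / len(outcomes)

def uncertainty_aware(outcomes, wrong_penalty=1.0):
    """+1 for a correct answer, 0 for 'I don't know', -penalty for a confident error."""
    return outcomes.count("correct") - wrong_penalty * outcomes.count("wrong")

for name, outcomes in [("bluffing", bluffing_model), ("honest", honest_model)]:
    print(f"{name:9s} accuracy={accuracy_only(outcomes):.2f} "
          f"uncertainty-aware score={uncertainty_aware(outcomes):+.1f}")
# The accuracy-only leaderboard favours the bluffer; the penalty-aware one does not.
```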
[4]
Hallucinations in AI: OpenAI study blames wrong model measurements
Redesigning scoreboards to reward humility could reduce confident AI errors.

When I wrote about AI hallucinations back in July 2024, the story was about inevitability. Back then, GenAI was busy dazzling the world with its creativity, but equally embarrassing itself with fanciful citations, biased imagery, or gymnasts bending like boneless cartoons. At the time I argued that hallucinations were as unavoidable as human "brainfarts": entertaining, often problematic, and always a reminder that these AI systems weren't perfect.

A year later, OpenAI has published a new research study that reframes the hallucination debate in strikingly practical terms. According to its latest blog post, the AI hallucination problem isn't just the models. It's also the way we measure them. And unless we change how we score AI performance, we'll continue encouraging AI models to guess when they should really just say, "I don't know."

In their latest research study on AI hallucination, OpenAI researchers equate the issue to a multiple-choice test. A student who guesses randomly will sometimes get lucky, but if the test only rewards accuracy, that student looks better than one who leaves blanks when uncertain. Current AI evaluations work in much the same way: models that guess when uncertain and happen to be right are rewarded more than those that refuse to answer, an important distinction.

This isn't a light-hearted matter, especially for training a GenAI LLM. It shapes the behaviour of every major language model out there, the OpenAI researchers argue. They demonstrate how even careful systems like GPT-5 can confidently give the wrong birthday for one of the paper's authors. This is because the evaluation systems tell the models that a confident wrong answer is better than no answer at all.

Back in 2024, I cited GitHub leaderboards measuring hallucination rates across models like GPT-4 Turbo and Intel's Neural Chat 7B. Those efforts assumed hallucinations were byproducts of weak data coverage or rushed product rollouts. OpenAI now argues that the real structural fault lies in how we grade models in the first place.

The OpenAI research paper goes further, tracing hallucinations back to the foundations of pretraining. Models learn by predicting the next word in massive datasets, without exposure to examples labeled as "false". It's easy to learn consistent structures like grammar or spelling, but predicting arbitrary facts, like birthdays or niche cultural references, is a statistical minefield for GenAI LLMs. OpenAI insists hallucinations are expected artifacts of next-word prediction. What makes them persist is not ignorance, but incentive structures that reward polished guesses over calibrated restraint.

In fact, the study highlights that smaller models sometimes outperform larger ones in humility. A small model that knows it doesn't understand Māori can simply admit ignorance. A bigger model with partial knowledge risks bluffing. Calibration, knowing what you don't know, is not a brute-force problem solvable only with trillion-parameter giants.

OpenAI's prescription for reducing AI hallucination is deceptively simple: redesign the evaluation scoreboards. Rather than treating accuracy as the sole measure of performance, penalize confident errors more heavily than an inability to respond, and give partial credit for uncertainty. In other words, reward models for honesty, not just hollow bravado.
It's an idea familiar to anyone who has sat a standardized test with negative marking. Guessing blindly should be discouraged. But in AI, we've done the opposite. Accuracy-only leaderboards have locked developers into building models that bluff, because bluffing "wins" under the current rules.

This reframing resonates with my July 2024 piece, where I noted that hallucinations were often the price of speed: companies rushing half-baked models to market. But OpenAI's work shows that the deeper problem isn't haste, but misaligned incentives baked into the very fabric of AI evaluation.

Remember that AI hallucinations aren't disappearing overnight. As OpenAI admits, accuracy will never hit 100 percent, because some questions are inherently unanswerable, and a chatbot's polished tone is no guarantee of truth. But progress is possible if we stop grading models in ways that punish caution and reward fabrication.

If OpenAI and others succeed in redesigning evaluations to reward humility, we should expect models to say "I don't know" more often. That will feel jarring at first, perhaps even frustrating. But in high-stakes contexts like healthcare or legal advice, a model that admits uncertainty is far safer than one that invents answers.

Last year, I framed hallucinations as both a curse and a creative spark. That duality remains. Hallucinations can still inspire surreal art or imaginative leaps. But in day-to-day knowledge-based work, they remain landmines. As users, journalists, or policymakers, we must internalize this lesson. AI systems are powerful, but only when grounded in truth or transparent about uncertainty. Until then, treat your model like a clever but overconfident friend: insightful at times, unreliable at others, and always in need of a fact-check when it says something that feels too good to be true.
Recent studies reveal the root causes of AI hallucinations and propose solutions to improve model reliability. OpenAI's GPT-5 shows progress in reducing false information, but the problem remains inherent to large language models.
Artificial intelligence (AI) models, particularly large language models (LLMs), continue to grapple with 'hallucinations': confidently producing false or made-up information. Despite advances in the field, the problem remains a significant concern for researchers and users alike [1].

OpenAI claims its recently released GPT-5 suite of models has a reduced frequency of hallucinations and other types of 'deceptions' [1]. On benchmarks testing citation-based responses, GPT-5 outperformed its predecessors. However, the model still struggles in technical fields such as law and mathematics, and users have found errors in basic tasks such as creating timelines of US presidents [1].

Recent research from OpenAI sheds light on why LLMs hallucinate. The problem stems from the fundamental way these models work: as statistical machines, they make predictions based on learned associations [2]. During training, LLMs are rewarded for producing plausible answers rather than for acknowledging uncertainty, much as a student is rewarded for guessing on a multiple-choice exam [1][2].

Researchers and developers are exploring several approaches to reducing AI hallucinations:

- Improved training methods: OpenAI focused on training models to browse effectively for up-to-date information and on reducing hallucinations in lengthy, open-ended responses [1].
- Uncertainty-aware responses: encouraging models to admit when they don't know an answer rather than guessing [2][3].
- Redesigned evaluation metrics: shifting the focus from accuracy-driven metrics to ones that reward uncertainty-aware responses and penalize confident errors [4].

While these systemic changes are being developed, users can take steps to mitigate the risk of AI hallucinations, such as asking for sources, framing prompts tightly, and cross-checking answers across different systems [2].

Experts agree that completely eliminating hallucinations is likely impossible because of the statistical nature of LLMs [1]. However, ongoing research and development aim to significantly reduce their occurrence and improve model reliability. As the field progresses, we can expect more uncertainty-aware AI systems that prioritize truthfulness over confident guessing [4].
Summarized by Navi