Curated by THEOUTPOST
On Sat, 28 Sept, 4:01 PM UTC
3 Sources
[1]
Sophisticated AI models are more likely to lie
Human feedback training may incentivize providing any answer -- even wrong ones.

When a research team led by Amrit Kirpalani, a medical educator at Western University in Ontario, Canada, evaluated ChatGPT's performance in diagnosing medical cases back in August 2024, one of the things that surprised them was the AI's propensity to give well-structured, eloquent, but blatantly wrong answers. Now, in a study recently published in Nature, a different group of researchers tried to explain why ChatGPT and other large language models tend to do this.

"To speak confidently about things we do not know is a problem of humanity in a lot of ways. And large language models are imitations of humans," says Wout Schellaert, an AI researcher at the University of Valencia, Spain, and co-author of the paper.

Early large language models like GPT-3 had a hard time answering simple questions about geography or science. They even struggled with simple math such as "how much is 20 + 183." But in most cases where they couldn't identify the correct answer, they did what an honest human being would do: they avoided answering the question.

The problem with the non-answers is that large language models were intended to be question-answering machines. For commercial companies like OpenAI or Meta that were developing advanced LLMs, a question-answering machine that answered "I don't know" more than half the time was simply a bad product. So they got busy solving this problem.

The first thing they did was scale the models up. "Scaling up refers to two aspects of model development. One is increasing the size of the training data set, usually a collection of text from websites and books. The other is increasing the number of language parameters," says Schellaert. When you think of an LLM as a neural network, the number of parameters can be compared to the number of synapses connecting its neurons. LLMs like GPT-3 used absurd amounts of text data, exceeding 45 terabytes, for training. The number of parameters used by GPT-3 was north of 175 billion.
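To put the "175 billion parameters" figure in context, here is a minimal back-of-the-envelope sketch in Python. It assumes the publicly reported GPT-3 configuration (96 decoder layers, a hidden size of 12,288, a 50,257-token vocabulary) and uses the common 12·L·d² approximation for transformer blocks; it is an illustration of what "parameter count" means, not an exact accounting of the model.

```python
# Rough transformer parameter count, using the common approximation that each
# decoder block contributes about 12 * d_model^2 parameters (attention + MLP),
# plus the token-embedding matrix. Config values are the publicly reported
# GPT-3 numbers; treat this as an illustration, not an exact breakdown.

def approx_transformer_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    per_block = 12 * d_model * d_model   # attention (~4*d^2) + MLP (~8*d^2)
    embeddings = vocab_size * d_model    # token embedding matrix
    return n_layers * per_block + embeddings

if __name__ == "__main__":
    total = approx_transformer_params(n_layers=96, d_model=12_288, vocab_size=50_257)
    print(f"~{total / 1e9:.0f}B parameters")  # prints roughly 175B
```

Running the sketch lands within rounding distance of the headline figure, which is why "scaling up" in practice means growing both the layer stack and the hidden size, alongside the training corpus.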
[2]
The Bigger the AI Model, the More It Will Lie - Decrypt
Researchers have found evidence that artificial intelligence models would rather lie than admit the shame of not knowing something. This behavior seems to become more apparent as they grow in size and complexity.

A new study published in Nature found that the bigger LLMs get, the less reliable they become for specific tasks. It's not exactly lying in the way we usually understand the word, but they tend to reply with confidence even when the answer is not factually correct, because they are trained to believe it is.

This phenomenon, which researchers dubbed "ultra-crepidarian" -- a 19th-century word that basically means expressing an opinion about something you know nothing about -- describes LLMs venturing far beyond their knowledge base to provide responses. "[LLMs are] failing proportionally more when they do not know, yet still answering," the study noted. In other words, the models are unaware of their own ignorance.

The study, which examined the performance of several LLM families, including OpenAI's GPT series, Meta's LLaMA models, and the BLOOM suite from BigScience, highlights a disconnect between increasing model capabilities and reliable real-world performance. While larger LLMs generally demonstrate improved performance on complex tasks, this improvement doesn't necessarily translate to consistent accuracy, especially on simpler tasks.

This "difficulty discordance" -- the phenomenon of LLMs failing on tasks that humans perceive as easy -- undermines the idea of a reliable operating area for these models. Even with increasingly sophisticated training methods, including scaling up model size and data volume and shaping up models with human feedback, researchers have yet to find a guaranteed way to eliminate this discordance.

The study's findings fly in the face of conventional wisdom about AI development. Traditionally, it was thought that increasing a model's size, data volume, and computational power would lead to more accurate and trustworthy outputs. However, the research suggests that scaling up may actually exacerbate reliability issues.

Larger models demonstrate a marked decrease in task avoidance, meaning they're less likely to shy away from difficult questions. While this might seem like a positive development at first glance, it comes with a significant downside: these models are also more prone to giving incorrect answers. In the study's charts, it is easy to see how models increasingly return incorrect results (shown in red) instead of avoiding the task (light blue), with correct answers in dark blue.

"Scaling and shaping currently exchange avoidance for more incorrectness," the researchers noted, but fixing this issue is not as easy as training a model to be more cautious. "Avoidance is clearly much lower for shaped-up models, but incorrectness is much higher," the researchers stated. However, a model that is trained to avoid tasks may end up becoming lazier or nerfed -- as users have noted in top-rated LLMs like ChatGPT or Claude.

Researchers found that this is not because bigger LLMs are incapable of excelling at simple tasks, but because they are trained to be more proficient at complex ones. It's like a person who is used to eating only gourmet meals suddenly struggling to make a home barbecue or a traditional cake. AI models trained on vast, complex datasets are more likely to miss fundamental skills. The issue is compounded by the models' apparent confidence.
Users often find it challenging to discern when an AI is providing accurate information versus when it's confidently spouting misinformation. This overconfidence can lead to dangerous over-reliance on AI outputs, particularly in critical fields like healthcare or legal advice.

Researchers also noted that the reliability of scaled-up models fluctuates across different domains. While performance might improve in one area, it could simultaneously degrade in another, creating a whack-a-mole effect that makes it difficult to establish any "safe" areas of operation. "The percentage of avoidant answers rarely rises quicker than the percentage of incorrect ones. The reading is clear: errors still become more frequent. This represents an involution in reliability," the researchers wrote.

The study highlights the limitations of current AI training methods. Techniques like reinforcement learning with human feedback (RLHF), intended to shape AI behavior, may actually be exacerbating the problem. These approaches appear to reduce the models' tendency to avoid tasks they're not equipped to handle -- remember the infamous "As an AI language model, I cannot..."? -- inadvertently encouraging more frequent errors.

Prompt engineering, the art of crafting effective queries for AI systems, seems to be a key skill for countering these issues. Even highly advanced models like GPT-4 exhibit sensitivity to how questions are phrased, with slight variations potentially leading to drastically different outputs. This is easier to notice when comparing different LLM families: Claude 3.5 Sonnet, for example, requires a very different prompting style than OpenAI o1 to achieve the best results. Poorly chosen prompts can make a model more or less prone to hallucinating.

Human oversight, long considered a safeguard against AI mistakes, may not be sufficient to address these issues. The study found that users often struggle to correct incorrect model outputs, even in relatively simple domains, so relying on human judgment as a fail-safe is unlikely to solve the problem on its own. "Users can recognize these high-difficulty instances but still make frequent incorrect-to-correct supervision errors," the researchers observed.

The study's findings call into question the current trajectory of AI development. While the push for larger, more capable models continues, this research suggests that bigger isn't always better when it comes to AI reliability. Right now, companies are focusing on data quality over quantity: Meta's latest Llama 3.2 models, for example, achieve better results than previous generations trained with more parameters. With luck, that will also make them a little less human in this respect: willing to admit defeat when asked the most basic question instead of bluffing to avoid looking dumb.
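The correct / avoidant / incorrect breakdown that the study tracks is easy to reproduce on one's own evaluation data. Below is a minimal sketch; the string heuristics for spotting an avoidant reply are assumptions made for illustration, not the Nature study's actual grading protocol.

```python
# Tally graded model answers into the three buckets the study uses:
# correct, incorrect, and avoidant. The avoidance heuristic below is a crude
# illustration; the Nature study used its own, more careful grading.
from collections import Counter

AVOIDANCE_MARKERS = ("i don't know", "i do not know", "i cannot answer",
                     "as an ai language model")

def classify(answer: str, reference: str) -> str:
    text = answer.strip().lower()
    if any(marker in text for marker in AVOIDANCE_MARKERS):
        return "avoidant"
    return "correct" if reference.strip().lower() in text else "incorrect"

def tally(results: list[tuple[str, str]]) -> Counter:
    """results is a list of (model_answer, reference_answer) pairs."""
    return Counter(classify(answer, reference) for answer, reference in results)

# Toy example: two right, one wrong, one refusal.
sample = [("203", "203"), ("The capital is Paris.", "Paris"),
          ("It is 210.", "203"), ("I don't know.", "203")]
print(tally(sample))  # Counter({'correct': 2, 'incorrect': 1, 'avoidant': 1})
```

Tracking how the "avoidant" share shrinks while the "incorrect" share grows across model generations is essentially the trade-off the researchers describe as exchanging avoidance for incorrectness.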
[3]
The Most Sophisticated AIs Are Most Likely to Lie, Worrying Research Finds
Beware the smart ones: they seem to have all the answers, but they can also weave the most convincing lies. That logic, it seems, also applies to large language models, which are becoming more powerful with each iteration. New research suggests that this smarter crop of AI chatbots is actually becoming less trustworthy, because the models are more likely to make up facts than to avoid or turn down questions they can't answer.

The study, published in the journal Nature, examined some of the leading commercial LLMs in the industry: OpenAI's GPT and Meta's LLaMA, along with an open-source model called BLOOM created by the research group BigScience. While it found that their responses are in many cases becoming more accurate, the models were less reliable across the board, giving a higher proportion of wrong answers than older models did.

"They are answering almost everything these days. And that means more correct, but also more incorrect [answers]," study coauthor José Hernández-Orallo, a researcher at the Valencian Research Institute for Artificial Intelligence in Spain, told Nature. Mike Hicks, a philosopher of science and technology at the University of Glasgow, had a harsher assessment. "That looks to me like what we would call bullshitting," Hicks, who was not involved in the study, told Nature. "It's getting better at pretending to be knowledgeable."

The models were quizzed on topics ranging from math to geography, and were also asked to perform tasks like listing information in a specified order. The bigger, more powerful models gave the most accurate responses overall, but faltered on harder questions, where their accuracy dropped. According to the researchers, some of the biggest BS-ers were OpenAI's GPT-4 and o1, which would answer almost any question thrown at them. But all of the studied LLMs appear to be trending this way, and in the LLaMA family, none of the models reached 60 percent accuracy even on the easiest questions, the study said.

In sum, the bigger the AI models got -- in terms of parameters, training data, and other factors -- the bigger the percentage of wrong answers they gave. Still, AI models are getting better at answering more complex questions. The problem, other than their propensity for BS-ing, is that they still mess up the easy ones. In theory, these errors should be a bigger red flag, but because we're impressed by how well large language models handle sophisticated problems, we may be overlooking their obvious flaws, the researchers suggest.

As such, the work has some sobering implications for how humans perceive AI responses. When asked to judge whether the chatbots' answers were accurate or inaccurate, a select group of participants got it wrong between 10 and 40 percent of the time.

The simplest way to combat the issue, according to the researchers, is to program the LLMs to be less eager to answer everything. "You can put a threshold, and when the question is challenging, [get the chatbot to] say, 'no, I don't know,'" Hernández-Orallo told Nature. But honesty may not be in the best interests of AI companies looking to woo the public with their fancy tech. If chatbots were reined in to answer only things they know about, it might expose the limits of the technology.
Recent studies reveal that as AI language models grow in size and sophistication, they become more likely to provide incorrect information confidently, raising concerns about reliability and the need for improved training methods.
Recent research published in Nature has uncovered a concerning trend in the development of large language models (LLMs): as these AI systems grow in size and complexity, they become increasingly prone to providing incorrect information with high confidence. This phenomenon, dubbed "ultra-crepidarian" behavior, describes the tendency of advanced AI models to venture beyond their knowledge base, often resulting in eloquent but factually incorrect responses [1].
Early LLMs like GPT-3 often avoided answering questions they couldn't confidently address. However, as AI companies sought to improve their products, they focused on scaling up models by increasing training data and parameters. This approach, while enhancing performance on complex tasks, has led to an unexpected consequence: a decrease in task avoidance coupled with an increase in incorrect answers [2].
One of the most troubling aspects of this development is the apparent confidence with which larger models provide incorrect information. This overconfidence can lead to dangerous over-reliance on AI outputs, particularly in critical fields such as healthcare or legal advice. The study found that even highly advanced models like GPT-4 and o1 would answer almost any question, regardless of their actual knowledge of the subject [3].
The findings challenge the conventional wisdom that increasing model size and data volume necessarily leads to more accurate and trustworthy outputs. Instead, researchers observed a "difficulty discordance" where LLMs fail on tasks that humans perceive as easy, undermining the idea of a reliable operating area for these models [2].
The study also revealed limitations in human ability to discern AI errors. Participants tasked with judging the accuracy of AI responses were wrong 10 to 40 percent of the time, highlighting the challenges of relying on human oversight as a safeguard against AI mistakes [3].
Researchers suggest that one approach to mitigate these issues could be to program LLMs to be less eager to answer everything, implementing thresholds that prompt the AI to admit when it doesn't know something. However, this solution may conflict with the commercial interests of AI companies seeking to showcase their technology's capabilities [3].
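The threshold idea Hernández-Orallo describes can be sketched at the application layer. The snippet below assumes a hypothetical `answer_with_confidence` helper that returns a reply plus some confidence score (for instance, derived from token log-probabilities or self-reported certainty); the 0.7 cutoff is likewise an arbitrary placeholder, not a value from the study.

```python
# Sketch of threshold-based abstention: refuse to answer when the model's own
# confidence falls below a cutoff. Both answer_with_confidence and the 0.7
# threshold are hypothetical placeholders, not part of the Nature study.
from typing import Callable, Tuple

def guarded_answer(
    question: str,
    answer_with_confidence: Callable[[str], Tuple[str, float]],
    threshold: float = 0.7,
) -> str:
    answer, confidence = answer_with_confidence(question)
    if confidence < threshold:
        return "I don't know."  # prefer avoidance over a confident guess
    return answer

# Toy stand-in for a real model call, so the sketch runs end to end.
def fake_model(question: str) -> Tuple[str, float]:
    return ("Paris", 0.95) if "France" in question else ("42", 0.30)

print(guarded_answer("What is the capital of France?", fake_model))          # Paris
print(guarded_answer("What is the airspeed of a laden swallow?", fake_model))  # I don't know.
```

Tuning the threshold is exactly the trade-off the study documents: raising it restores avoidance at the cost of fewer answers, while lowering it yields the eager-but-error-prone behavior seen in the largest models.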
As the AI field continues to evolve, addressing the balance between model performance and reliability remains a critical challenge. The study's findings underscore the need for new approaches in AI development that prioritize accuracy and trustworthiness alongside raw capabilities.