4 Sources
[1]
Anthropic CEO claims AI models hallucinate less than humans | TechCrunch
Anthropic CEO Dario Amodei believes today's AI models hallucinate, or make things up and present them as if they're true, at a lower rate than humans do, he said during a press briefing at Anthropic's first developer event, Code with Claude, in San Francisco on Thursday.

Amodei made the claim in the midst of a larger point: that AI hallucinations are not a limitation on Anthropic's path to AGI -- AI systems with human-level intelligence or better. "It really depends how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways," Amodei said, responding to TechCrunch's question.

Anthropic's CEO is one of the most bullish leaders in the industry on the prospect of AI models achieving AGI. In a widely circulated paper he wrote last year, Amodei said he believed AGI could arrive as soon as 2026. During Thursday's press briefing, the Anthropic CEO said he was seeing steady progress to that end, noting that "the water is rising everywhere." "Everyone's always looking for these hard blocks on what [AI] can do," said Amodei. "They're nowhere to be seen. There's no such thing."

Other AI leaders believe hallucination presents a large obstacle to achieving AGI. Earlier this week, Google DeepMind CEO Demis Hassabis said today's AI models have too many "holes" and get too many obvious questions wrong. For example, earlier this month, a lawyer representing Anthropic was forced to apologize in court after they used Claude to create citations in a court filing, and the AI chatbot hallucinated and got names and titles wrong.

It's difficult to verify Amodei's claim, largely because most hallucination benchmarks pit AI models against each other; they don't compare models to humans. Certain techniques seem to help lower hallucination rates, such as giving AI models access to web search. Separately, some AI models, such as OpenAI's GPT-4.5, have notably lower hallucination rates on benchmarks compared to early generations of systems. However, there's also evidence to suggest hallucinations are actually getting worse in advanced reasoning AI models: OpenAI's o3 and o4-mini models have higher hallucination rates than OpenAI's previous-generation reasoning models, and the company doesn't really understand why.

Later in the press briefing, Amodei pointed out that TV broadcasters, politicians, and humans in all types of professions make mistakes all the time. The fact that AI makes mistakes too is not a knock on its intelligence, according to Amodei. However, Anthropic's CEO acknowledged that the confidence with which AI models present untrue things as facts might be a problem.

In fact, Anthropic has done a fair amount of research on the tendency of AI models to deceive humans, a problem that seemed especially prevalent in the company's recently launched Claude Opus 4. Apollo Research, a safety institute given early access to test the model, found that an early version of Claude Opus 4 exhibited a high tendency to scheme against humans and deceive them. Apollo went as far as to suggest Anthropic shouldn't have released that early model. Anthropic said it came up with mitigations that appeared to address the issues Apollo raised.

Amodei's comments suggest that Anthropic may consider an AI model to be AGI, or equal to human-level intelligence, even if it still hallucinates. An AI that hallucinates may fall short of AGI by many people's definition, though.
[2]
AI might be hallucinating less than humans do
Anthropic CEO Dario Amodei stated that current AI models hallucinate at a lower rate than humans do, according to TechCrunch. He made this claim during a press briefing at Anthropic's inaugural developer event, Code with Claude, held in San Francisco on Thursday, amid a broader discussion about the path towards Artificial General Intelligence (AGI).

Amodei's comments came in response to a question from TechCrunch regarding AI hallucinations, which refer to instances where AI models generate incorrect or fabricated information and present it as factual. He acknowledged that AI models do hallucinate and that "it really depends how you measure it," but speculated that they likely do so less frequently than humans. He further elaborated that AI hallucinations manifest in "surprising ways," and that addressing this issue is not necessarily a barrier to achieving AGI -- systems possessing human-level intelligence or beyond.

Amodei pointed to a paper he wrote last year expressing his belief that AGI could arrive as early as 2026, and said that while progress has been steady, there are always unexpected challenges. "We're seeing a lot of progress," Amodei stated, suggesting that the company is focused on addressing the limitations of AI without viewing hallucinations as a fundamental impediment. His comments are part of a wider discussion on the challenges and possibilities surrounding AI development and the pursuit of AGI.
[3]
Anthropic CEO Believes AI Models Hallucinate Less Than Humans
Anthropic has released several papers on ways AI models can be grounded.

Anthropic CEO Dario Amodei reportedly said that artificial intelligence (AI) models hallucinate less than humans. As per the report, the statement was made by the CEO at the company's inaugural Code With Claude event on Thursday. During the event, the San Francisco-based AI firm released two new Claude 4 models, as well as multiple new capabilities, including improved memory and tool use. Amodei reportedly also suggested that while critics are trying to find roadblocks for AI, "they are nowhere to be seen."

TechCrunch reports that Amodei made the comment during a press briefing, while he was explaining how hallucinations are not a limitation for AI to reach artificial general intelligence (AGI). Answering a question from the publication, the CEO reportedly said, "It really depends how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways."

Amodei further added that TV broadcasters, politicians, and humans in other professions make mistakes regularly, so AI making mistakes does not take away from its intelligence, as per the report. However, the CEO reportedly acknowledged that AI models confidently presenting untrue responses is a problem.

Earlier this month, Anthropic's lawyer was forced to apologise in a courtroom after its Claude chatbot added an incorrect citation to a filing, according to a Bloomberg report. The incident occurred during the AI firm's ongoing legal battle with music publishers over alleged copyright infringement involving the lyrics of at least 500 songs.

In an October 2024 paper, Amodei claimed that Anthropic might achieve AGI as soon as 2026. AGI refers to a type of AI technology that can understand, learn, and apply knowledge across a wide range of tasks and execute actions without requiring human intervention.

As part of this vision, Anthropic released Claude Opus 4 and Claude Sonnet 4 during the developer conference. These models bring major improvements in coding, tool use, and writing. Claude Sonnet 4 scored 72.7 percent on the SWE-Bench benchmark, achieving state-of-the-art (SOTA) distinction in code writing.
[4]
AI models may hallucinate less than humans in factual tasks, says Anthropic CEO: Report
At two prominent tech events, VivaTech 2025 in Paris and Anthropic's Code With Claude developer day, Anthropic chief executive officer Dario Amodei made a provocative claim: artificial intelligence models may now hallucinate less frequently than humans in well-defined factual scenarios.

Speaking at both events, Amodei said recent internal tests showed that the company's latest Claude 3.5 model had outperformed humans on structured factual quizzes. This challenges a long-held criticism of generative AI: that models often "hallucinate," or generate incorrect information with undue confidence. "If you define hallucination as confidently saying something that's wrong, humans do that a lot," Amodei said at VivaTech. He added that Claude models had consistently provided more accurate answers than human participants in verifiable question formats.

At Code With Claude, where the company also launched its new Claude Opus 4 and Claude Sonnet 4 models, Amodei reiterated his view. According to a TechCrunch report, he told attendees, "It really depends on how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways."

The new Claude 4 series represents a step forward in Anthropic's pursuit of artificial general intelligence (AGI). The company said the upgrades include improved long-term memory, better code generation, enhanced tool use, and stronger writing capabilities. Claude Sonnet 4 achieved a 72.7% score on the SWE-Bench benchmark, which evaluates AI coding agents on their ability to solve real-world software engineering problems, setting a new performance record for AI systems in this domain.

Despite these gains, Amodei acknowledged that hallucinations have not been eliminated. He highlighted the importance of prompt phrasing and use-case design, especially in high-risk domains such as legal or healthcare applications. The remarks follow a recent courtroom episode in which Anthropic's Claude chatbot generated a false citation in a legal filing involving music publishers. The company's legal team later issued an apology, reinforcing the need for improved accuracy in sensitive settings.

Amodei also called for the development of standardised metrics across the industry to evaluate hallucination rates. "You can't fix what you don't measure precisely," he said.
Anthropic's CEO Dario Amodei claims AI models may hallucinate less than humans, challenging common perceptions about AI limitations and reigniting discussions on the path to Artificial General Intelligence (AGI).
Dario Amodei, CEO of Anthropic, has stirred controversy in the AI community by claiming that current AI models may hallucinate less frequently than humans, particularly in well-defined factual scenarios. This assertion was made during Anthropic's inaugural developer event, Code with Claude, in San Francisco and at VivaTech 2025 in Paris [1][4].
Amodei stated, "It really depends how you measure it, but I suspect that AI models probably hallucinate less than humans, but they hallucinate in more surprising ways" 1. He further elaborated that addressing hallucinations is not necessarily a barrier to achieving Artificial General Intelligence (AGI) 2.
AI hallucinations refer to instances where AI models generate incorrect or fabricated information and present it as factual. This has been a significant concern in the AI community, with many viewing it as a major obstacle to achieving AGI 1.
Amodei's comments come in the wake of a recent incident where Anthropic's AI chatbot, Claude, generated a false citation in a legal filing, leading to an apology from the company's legal team [3]. This incident highlights the ongoing challenges in ensuring AI accuracy, especially in sensitive domains like law and healthcare.
During the Code with Claude event, Anthropic unveiled two new models: Claude Opus 4 and Claude Sonnet 4. These models represent significant advancements in the company's AI capabilities, including improved coding, tool use, long-term memory, and writing [3][4].
Notably, Claude Sonnet 4 achieved a 72.7% score on the SWE-Bench benchmark, setting a new performance record for AI systems in solving real-world software engineering problems [4].
Amodei's claims have reignited discussions about the path to AGI and the current limitations of AI systems. While some AI leaders, like Google DeepMind CEO Demis Hassabis, believe that hallucinations present a significant obstacle to achieving AGI, Amodei sees steady progress towards this goal [1].
The Anthropic CEO has previously stated his belief that AGI could arrive as early as 2026, and he maintains that there are no insurmountable blocks to AI capabilities [1]. However, this optimistic view is not universally shared within the AI community.
Despite the claimed improvements in AI accuracy, Amodei acknowledges that hallucinations have not been eliminated entirely. He emphasizes the importance of prompt phrasing and use-case design, particularly in high-risk domains [4].
Amodei has also called for the development of standardized metrics across the industry to evaluate hallucination rates, stating, "You can't fix what you don't measure precisely" [4]. This highlights the need for more robust and consistent evaluation methods in AI research and development.
As the debate continues, the AI community remains divided on the true extent of AI hallucinations and their implications for the development of AGI. Anthropic's bold claims and rapid advancements in AI capabilities are sure to fuel further discussion and research in this critical area of AI development.