3 Sources
[1]
AI no better than other methods for patients seeking medical advice, study shows
LONDON, Feb 9 (Reuters) - Asking AI about medical symptoms does not help patients make better decisions about their health than other methods, such as a standard internet search, according to a new study published in Nature Medicine. The authors said the study was important because people were increasingly turning to AI and chatbots for advice on their health, without evidence that this was necessarily the best and safest approach.

Researchers led by the University of Oxford's Internet Institute worked alongside a group of doctors to draw up 10 different medical scenarios, ranging from a common cold to a life-threatening haemorrhage causing bleeding on the brain. When tested without human participants, three large language models - OpenAI's GPT-4o, Meta's Llama 3 and Cohere's Command R+ - identified the conditions in 94.9% of cases, and chose the correct course of action, such as calling an ambulance or going to the doctor, in an average of 56.3% of cases. The companies did not respond to requests for comment.

'HUGE GAP' BETWEEN AI'S POTENTIAL AND ACTUAL PERFORMANCE

The researchers then recruited 1,298 participants in Britain to use either AI or their usual resources, such as an internet search, their own experience, or the National Health Service website, to investigate the symptoms and decide their next step. When the participants did this, relevant conditions were identified in less than 34.5% of cases, and the right course of action was chosen in less than 44.2%, no better than the control group using more traditional tools.

Adam Mahdi, co-author of the paper and associate professor at Oxford, said the study showed the "huge gap" between the potential of AI and the pitfalls when it was used by people. "The knowledge may be in those bots; however, this knowledge doesn't always translate when interacting with humans," he said, adding that more work was needed to identify why this was happening.

HUMANS OFTEN GIVING INCOMPLETE INFORMATION

The team studied around 30 of the interactions in detail and concluded that humans were often providing incomplete or wrong information, but the LLMs were also sometimes generating misleading or incorrect responses. For example, one patient reporting the symptoms of a subarachnoid haemorrhage - a life-threatening condition causing bleeding on the brain - was correctly told by the AI to go to hospital after describing a stiff neck, light sensitivity and the "worst headache ever". Another patient described the same symptoms but a "terrible" headache, and was told to lie down in a darkened room.

The team now plans a similar study in different countries and languages, and over time, to test whether that affects AI's performance. The study was supported by the data company Prolific, the German non-profit Dieter Schwarz Stiftung, and the UK and U.S. governments.
[2]
AI chatbots give inaccurate medical advice, says Oxford University study
AI chatbots give inaccurate and inconsistent medical advice that could present risks to users, according to a study from the University of Oxford. The research found people using AI for healthcare advice were given a mix of good and bad responses, making it hard to identify which advice they should trust. In November 2025, polling by Mental Health UK found more than one in three UK residents now use AI to support their mental health or wellbeing.

Dr Rebecca Payne, lead medical practitioner on the study, said it could be "dangerous" for people to ask chatbots about their symptoms.

Researchers gave 1,300 people a scenario, such as having a severe headache or being a new mother who felt constantly exhausted. They were split into two groups, with one using AI to help them figure out what they might have and decide what to do next. The researchers then evaluated whether people correctly identified what might be wrong, and whether they should see a GP or go to A&E.

They said the people who used AI often did not know what to ask, and were given a variety of different answers depending on how they worded their question. The chatbot responded with a mixture of information, and people found it hard to distinguish between what was useful and what was not.

Dr Adam Mahdi, senior author on the study, told the BBC that while AI was able to give medical information, people "struggle to get useful advice from it". "People share information gradually," he said. "They leave things out, they don't mention everything. So, in our study, when the AI listed three possible conditions, people were left to guess which of those can fit. This is exactly when things would fall apart."

Lead author Andrew Bean said the analysis illustrated how interacting with humans poses a challenge "even for top" AI models. "We hope this work will contribute to the development of safer and more useful AI systems," he said.

Meanwhile Dr Bertalan Meskó, editor of The Medical Futurist, which predicts tech trends in healthcare, said there were developments coming in the space. He said two major AI developers, OpenAI and Anthropic, had recently released health-dedicated versions of their general chatbots, which he believed would "definitely yield different results in a similar study". He said the goal should be "to keep on improving" the tech, especially "health-related versions, with clear national regulations, regulatory guardrails and medical guidelines".
[3]
AI chatbots give bad health advice, research finds - The Korea Times
PARIS -- Next time you're considering consulting Dr. ChatGPT, perhaps think again. Despite now being able to ace most medical licensing exams, artificial intelligence chatbots do not give humans better health advice than they can find using more traditional methods, according to a study published on Monday.

"Despite all the hype, AI just isn't ready to take on the role of the physician," study co-author Rebecca Payne from Oxford University said. "Patients need to be aware that asking a large language model about their symptoms can be dangerous, giving wrong diagnoses and failing to recognise when urgent help is needed," she added in a statement.

The British-led team of researchers wanted to find out how successful humans are when they use chatbots to identify their health problems and to decide whether they need to see a doctor or go to hospital. The team presented nearly 1,300 UK-based participants with 10 different scenarios, such as a headache after a night out drinking, a new mother feeling exhausted or what having gallstones feels like. The researchers then randomly assigned the participants one of three chatbots: OpenAI's GPT-4o, Meta's Llama 3 or Command R+. There was also a control group that used internet search engines.

People using the AI chatbots were only able to identify their health problem around a third of the time, while only around 45 percent figured out the right course of action. This was no better than the control group, according to the study, published in the Nature Medicine journal.

Communication breakdown

The researchers pointed out the disparity between these disappointing results and how AI chatbots score extremely highly on medical benchmarks and exams, blaming the gap on a communication breakdown. Unlike the simulated patient interactions often used to test AI, the real humans often did not give the chatbots all the relevant information. And sometimes the humans struggled to interpret the options offered by the chatbot, or misunderstood or simply ignored its advice.

One out of every six U.S. adults asks AI chatbots about health information at least once a month, the researchers said, with that number expected to increase as more people adopt the new technology.

"This is a very important study as it highlights the real medical risks posed to the public by chatbots," David Shaw, a bioethicist at Maastricht University in the Netherlands who was not involved in the research, told AFP. He advised people to only trust medical information from reliable sources, such as the UK's National Health Service.
A University of Oxford study published in Nature Medicine found that AI chatbots like GPT-4o and Llama 3 provide medical advice no better than traditional internet searches. When 1,298 participants used AI for health guidance, they identified relevant conditions in fewer than 34.5% of cases and chose the right course of action in fewer than 44.2%, revealing a significant gap between AI's potential and its real-world performance.
A groundbreaking University of Oxford AI study published in Nature Medicine has revealed that AI chatbots provide medical advice no better than conventional internet searches, challenging the growing trend of patients turning to artificial intelligence for health guidance [1]. The research, led by Oxford's Internet Institute alongside medical practitioners, tested three prominent large language models in healthcare: OpenAI's GPT-4o, Meta's Llama 3, and Cohere's Command R+ [1]. When tested in isolation, these large language models (LLMs) demonstrated impressive diagnostic accuracy, identifying medical conditions in 94.9% of cases [1]. However, the real-world application painted a starkly different picture.
The study involved 1,298 human participants in Britain who were presented with 10 different medical scenarios, ranging from common ailments like headaches after drinking to life-threatening conditions such as a subarachnoid haemorrhage causing bleeding on the brain [1][3]. When the participants used AI for medical advice, relevant conditions were identified in less than 34.5% of cases, and the correct course of action was determined in less than 44.2% of instances, no better than the control group using traditional resources like internet search engines or the National Health Service website. Dr. Adam Mahdi, co-author and associate professor at Oxford, emphasized the "huge gap" between AI's potential and its practical performance, noting that "the knowledge may be in those bots; however, this knowledge doesn't always translate when interacting with humans" [1].

The research found that the chatbots' inaccurate medical advice stemmed from both human error and flawed AI responses. Detailed analysis of approximately 30 patient interactions revealed that humans often provided incomplete information, while the AI systems generated misleading responses in critical situations [1]. In one alarming example involving symptoms of a subarachnoid haemorrhage, a patient describing the "worst headache ever" received correct advice to go to the hospital, while another patient with identical symptoms who described a "terrible" headache was told to simply lie down in a darkened room [1]. Dr. Rebecca Payne, lead medical practitioner on the study, warned that asking chatbots about symptoms "can be dangerous, giving wrong diagnoses and failing to recognise when urgent help is needed" [3].
The study identified specific patterns in how human interaction with AI breaks down during medical consultations. Dr. Mahdi explained that "people share information gradually" and "leave things out," creating confusion when the AI listed multiple possible conditions [2]. Participants struggled to distinguish useful health information from irrelevant details, and the quality of responses varied dramatically based on how questions were worded [2]. This communication breakdown represents a critical challenge for large language models in healthcare, even as these systems continue to ace medical licensing exams [3].

The medical risks extend beyond individual cases: one in six U.S. adults now consults AI chatbots about health information at least once a month, with adoption rates expected to climb [3]. In the UK, polling by Mental Health UK found more than one in three residents use AI to support their mental health or wellbeing [2]. The research team plans to conduct similar studies across different countries and languages to determine whether cultural and linguistic factors affect AI's performance, while experts call for clear national regulations and medical guidelines to govern the use of AI for health queries [1][2].
Summarized by Navi
[1] Policy and Regulation
[2] Technology
[3] Technology