AI Chatbots Give Misleading Health Advice 50% of Time

AI Chatbots Deliver Problematic Health Information in Half of Responses

A comprehensive audit of five leading AI chatbots has uncovered alarming rates of misinformation in responses to common health questions. The study, published in BMJ Open, found that 49.6% of answers provided by ChatGPT, Gemini, Grok, Meta AI, and DeepSeek were problematic—with 30% deemed somewhat problematic and 20% highly problematic 1

. The findings arrive at a critical moment, as one in four Americans now use AI chatbots for health advice, and approximately 14 million have skipped seeing healthcare providers based on chatbot recommendations 3

Source: News-Medical

Researchers from the Lundquist Institute for Biomedical Innovation tested these platforms across five misinformation-prone categories: vaccines, cancer treatments, stem cells, nutrition, and athletic performance. Using 50 adversarial prompts designed to push models toward questionable advice, the team evaluated accuracy, reference completeness, and readability. The risks of using AI chatbots became evident when models recommended unproven alternative therapies alongside evidence-based treatments, creating what researchers termed a "false balance" that presents scientific and unscientific claims on equal footing 2

Cancer and Vaccine Questions Show Best Performance, But Still Fall Short

While chatbot reliability for medical advice varied by topic, even the best-performing categories raised concerns. Questions about vaccines and cancer returned approximately 75% non-problematic responses—the highest proportion among tested categories. However, this still means a 25% chance of receiving potentially harmful information 4

. Stem cell queries received the most problematic content, with problematic responses exceeding non-problematic ones in nutrition and athletic performance categories 1

When asked which alternative therapies are better than chemotherapy to treat cancer, the chatbots warned that alternative treatments are unproven, but still gave acupuncture, herbal medicine, and "cancer-fighting diets" the same consideration as chemotherapy 2

. Dr. Michael Foote, an assistant attending professor at Memorial Sloan Kettering Cancer Center, emphasized the danger: "Some of these medicines aren't evaluated by the FDA, can hurt your liver, hurt your metabolism and some of them hurt you by patients relying on them and not doing conventional treatments" 5

Performance Gaps and the Hallucination Problem

Grok consistently produced the most highly problematic responses at 58%, compared to Gemini's 40%, though all models showed fundamental flaws 1

. The study revealed that prompt type significantly influenced response quality. Open-ended questions—which mirror how people actually search for health information—yielded 32% highly problematic answers, compared to just 7% for closed-ended prompts 4

The issue of hallucination proved particularly troubling. When researchers requested 10 scientific references from each chatbot, the median completeness score was merely 40%. No chatbot managed a single fully accurate reference list across 25 attempts, with errors ranging from wrong authors and broken links to entirely fabricated papers 1

. This matters because citations appear as proof, giving lay readers little reason to doubt the content.

Training Data and Fundamental Limitations Drive Inaccurate Medical Guidance

The root cause of these public health risks lies in how language models function. AI chatbots are trained on vast volumes of public data, including peer-reviewed medical research, but also Reddit threads, wellness blogs, and social media arguments. These models predict statistically likely responses rather than weighing evidence or making informed judgments 4

. Lead author Nicholas Tiller explained that the adversarial prompts used in testing reflect real-world usage: "If somebody believes that raw milk is going to be beneficial, then the search terms are already going to be primed with that kind of language" 2

Source: Silicon Republic

Chatbots also exhibit sycophancy, prioritizing agreement over factual correctness, which can result in answers that align with user expectations rather than scientific consensus. Dr. Ashwin Ramaswamy, an instructor of urology at Mount Sinai Hospital, emphasized the technology gap: "The technology that's needed, the methodology that's needed for the FDA, for people, for doctors, to understand how it works and to have trust in the system is not there yet" .

Patient Safety Concerns Mount as AI in Healthcare Expands

The implications extend beyond individual misinformation incidents. Dr. Foote noted that AI can create unnecessary fear and emotional distress: "I've encountered where patients come in crying, really upset because the AI chatbot told them they have six to 12 months to live, which, of course, is totally ridiculous" . A February 2026 study in Nature Medicine revealed another dimension to the problem: while chatbots themselves could provide correct medical answers almost 95% of the time, real people using those same chatbots only got the right answer less than 35% of the time—no better than people who didn't use them at all 4

Source: Futurism

The study's authors emphasize that misinformation constitutes a serious public health threat, spreading farther and deeper than truth across all information categories. As AI companies like OpenAI launch healthcare-specific products such as ChatGPT Health, which encourages users to upload medical records, the need for systematic oversight becomes more pressing . With diagnostic accuracy varying widely based on available information and the technology still unable to reliably distinguish between evidence-based and non-evidence-based claims, experts warn that AI chatbots are not yet ready for widespread use in delivering health advice without significant improvements in accuracy and transparency.

AI Chatbots Give Misleading Health Advice Nearly Half the Time, Major Study Reveals

AI Chatbots Deliver Problematic Health Information in Half of Responses

Cancer and Vaccine Questions Show Best Performance, But Still Fall Short

Performance Gaps and the Hallucination Problem

Training Data and Fundamental Limitations Drive Inaccurate Medical Guidance

Patient Safety Concerns Mount as AI in Healthcare Expands

References

AI chatbots give misleading health advice nearly half the time

AI Chatbots Telling Cancer Patients to Try Useless Woo-Woo Treatments Instead of Chemotherapy

New Studies Raise Red Flags About Using AI Chatbots for Medical Advice | Newswise

Can you rely on AI chatbots for medical advice?

Study Finds AI Chatbots Can Give Misleading Health Advice

Related Stories

AI Chatbots Give Misleading Medical Advice Half the Time, Multiple Studies Reveal

AI chatbots fail to improve medical advice for patients, Oxford study reveals

AI chatbots miss most medical diagnoses as millions seek health advice from ChatGPT

Recent Highlights

Pope Leo XIV releases major AI encyclical calling for 'disarmament' of artificial intelligence

Apple's Siri overhaul for iOS 27 brings Gemini integration and standalone app to compete with ChatGPT

Nvidia unveils RTX Spark chip to chase $200B CPU market with AI agent PCs from Microsoft, Dell, and HP

Recent Highlights

Today's Top Stories

Florida sues OpenAI and Sam Altman over ChatGPT safety, citing murders and child harm

Meta AI chatbot security exploit let hackers hijack Instagram accounts worth millions

NVIDIA Cosmos 3 brings vision reasoning and action prediction to physical AI systems worldwide

Microsoft Surface Laptop Ultra debuts with Nvidia RTX Spark chip and 128GB unified memory