AI Chatbots Vulnerable to Medical Misinformation, Study Reveals

A study by Mount Sinai researchers finds that AI chatbots are prone to repeating and elaborating on false medical information, highlighting the need for stronger safeguards in healthcare AI applications.

AI Chatbots Vulnerable to Medical Misinformation

A groundbreaking study conducted by researchers at the Icahn School of Medicine at Mount Sinai has revealed a critical vulnerability in widely used AI chatbots when handling medical information. The study, published in the August 2 online issue of Communications Medicine, found that these AI tools are highly susceptible to repeating and elaborating on false medical information, raising significant concerns about their reliability in healthcare settings [1].

Study Methodology and Findings

Source: Medical Xpress

The research team, led by Dr. Mahmud Omar, created fictional patient scenarios containing fabricated medical terms such as made-up diseases, symptoms, or tests. These scenarios were then submitted to leading large language models for analysis [2].
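
To make the methodology concrete, the sketch below shows how a fake-term probe of this kind could be run against a chat model. The vignette wording, the invented condition name, and the choice of the OpenAI Python client are illustrative assumptions; the study's actual scenarios, models, and tooling are not reproduced here.

```python
# Minimal sketch of a "fake-term" probe: embed a fabricated medical term in an
# otherwise plausible vignette and check whether the model elaborates on it.
# "Velmara syndrome" is invented purely for illustration and is not from the study.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

FAKE_TERM = "Velmara syndrome"  # deliberately non-existent condition
vignette = (
    "A 45-year-old man presents with fatigue and joint pain. "
    f"His previous physician suspected {FAKE_TERM}. "
    "What diagnostic work-up and treatment would you recommend?"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": vignette}],
)
print(response.choices[0].message.content)

# A cautious model should flag the unrecognized term rather than confidently
# describing a work-up for a disease that does not exist.
```

In the study's terms, a response that treats the fabricated condition as real counts as a hallucination.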

The results were alarming:

  1. Without any additional guidance, the chatbots routinely elaborated on fake medical details, confidently generating explanations about non-existent conditions and treatments.
  2. Hallucination rates (false information generation) ranged from 50% to 82% across six different AI chatbots [3] (a simple way to tally this rate is sketched below).
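
The hallucination rate reported above is simply the share of probe responses that treated the fabricated term as real. The tallying function below is a minimal illustration of that arithmetic; how each response is labeled is assumed to be done by a human reviewer and is not the paper's exact grading protocol.

```python
# Illustrative tally: hallucination rate = responses that treated the fake
# term as real, divided by all probe responses. Labels are assumed to come
# from manual review; the study's grading criteria are not detailed here.
def hallucination_rate(treated_as_real: list[bool]) -> float:
    return sum(treated_as_real) / len(treated_as_real) if treated_as_real else 0.0

# Example: 41 of 50 vignettes answered as if the fake disease existed -> 82%,
# matching the upper end of the range reported above.
print(f"{hallucination_rate([True] * 41 + [False] * 9):.0%}")
```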

The Impact of a Simple Warning Prompt

In a second round of testing, the researchers added a one-line caution to the prompt, reminding the AI that the information provided might be inaccurate. This simple addition yielded promising results:

  1. The warning prompt significantly reduced hallucination rates, cutting errors nearly in half.
  2. For the best-performing AI, ChatGPT-4o, the hallucination rate dropped from around 50% to less than 25% when the caution was added [3] (a sketch of this with-and-without comparison appears below).
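
A rough sketch of this two-condition comparison is shown below, reusing the same OpenAI client as in the earlier example. The wording of the one-line caution is an assumption for illustration; the article does not quote the exact reminder used in the study.

```python
# Sketch of the mitigation described above: optionally prepend a one-line
# caution so the model treats unfamiliar clinical terms with suspicion.
# The caution text below is illustrative, not the study's actual prompt.
from openai import OpenAI

client = OpenAI()

CAUTION = (
    "Note: the clinical details provided may contain inaccurate or fabricated "
    "information. Flag any term you cannot verify instead of elaborating on it."
)

def ask(model: str, vignette: str, with_caution: bool) -> str:
    messages = []
    if with_caution:
        messages.append({"role": "system", "content": CAUTION})
    messages.append({"role": "user", "content": vignette})
    reply = client.chat.completions.create(model=model, messages=messages)
    return reply.choices[0].message.content

# Running the same vignette with and without the caution mirrors the study's
# two rounds of testing and lets the hallucination rates be compared directly.
```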

Implications for AI in Healthcare

Dr. Eyal Klang, Chief of Generative AI at Mount Sinai, emphasized the significance of these findings: "Even a single made-up term could trigger a detailed, decisive response based entirely on fiction. But we also found that the simple, well-timed safety reminder built into the prompt made an important difference" [1].

The study underscores the critical need for stronger safeguards before AI tools can be trusted in healthcare. Dr. Girish N. Nadkarni, Chief AI Officer for the Mount Sinai Health System, stated, "The solution isn't to abandon AI in medicine, but to engineer tools that can spot dubious input, respond with caution, and ensure human oversight remains central" [2].

Future Research and Applications

The research team plans to extend their study by:

  1. Applying the same approach to real, de-identified patient records.
  2. Testing more advanced safety prompts and retrieval tools.
  3. Using their "fake-term" method as a simple yet powerful tool for hospitals, tech developers, and regulators to stress-test AI systems before clinical use [1].

This study serves as a crucial reminder of the challenges and opportunities in integrating AI into healthcare. While the potential benefits are significant, ensuring the safety and reliability of these tools remains paramount as the technology continues to evolve rapidly.
