ChatGPT Health missed half of medical emergencies in first independent safety evaluation

Reviewed by Nidhi Govil


A Mount Sinai study found ChatGPT Health failed to recognize 52% of genuine medical emergencies, telling patients with life-threatening conditions like respiratory failure to wait rather than seek immediate care. The AI tool remains on a waitlist as OpenAI works to improve safety features, even as over 40 million people already use ChatGPT for health advice.

ChatGPT Health Fails Critical Safety Evaluation

Researchers at the Icahn School of Medicine at Mount Sinai have conducted the first independent safety evaluation of OpenAI's ChatGPT Health assistant since its January 2026 launch, uncovering troubling gaps in the AI's ability to assess medical emergencies [1]. The study, published in Nature Medicine, tested the chatbot across 60 real medical scenarios and found that ChatGPT Health "under-triaged" 52% of emergency cases, directing patients with life-threatening conditions like diabetic ketoacidosis and respiratory failure to schedule appointments within 24-48 hours rather than urging them to visit the emergency department [2].

Source: Digit


"We wanted to answer a very basic but critical question: if someone is experiencing a real medical emergency and turns to ChatGPT Health for help, will it clearly tell them to go to the emergency room?" lead author and urologist Ashwin Ramaswamy explained

1

. The answer, most of the time, was no. In one respiratory failure case, the AI clearly identified the symptoms as an early warning sign but reassured the patient to wait and monitor instead of seeking emergency help

1

.

AI Healthcare Tools Struggle With Clinical Judgment

While the system correctly handled "textbook emergencies" like stroke and anaphylaxis, it faltered in nuanced situations where clinical judgment matters most. Dr. Ramaswamy told NBC News that the chatbot missed emergencies that "any doctor, and any person who's gone through any degree of training, would say that patient needs to go to the emergency department" [2]. The inconsistency proved particularly unsettling: stroke symptoms were flagged as emergencies every time, yet a patient with a three-day sore throat received more urgent guidance than someone experiencing early respiratory failure [2].

Source: Gizmodo


The triage process, which assesses the severity of a patient's condition, represents a critical first step in medical care. The Mount Sinai study found that ChatGPT Health's alerts were inverted relative to clinical risk, proving more reliable for lower-risk scenarios than for genuinely dangerous situations [2].

Concerning Patterns in Mental Health Crisis Detection

The study uncovered another alarming issue: suicide-risk alerts appeared inconsistently across the platform. "The system's alerts were inverted relative to clinical risk, appearing more reliably for lower-risk scenarios than for cases when someone shared how they intended to hurt themselves," explained Mount Sinai Health System's chief AI officer Girish Nadkarni [1]. "In real life, when someone talks about exactly how they would harm themselves, that's a sign of more immediate and serious danger, not less. This was a particularly surprising and concerning finding." [1]

These patient safety concerns come as OpenAI faces ongoing scrutiny over how its chatbots handle mental health episodes, with grieving families filing lawsuits alleging negligent behavior and insufficient safety guardrails that aided suicidal ideation in relatives [1].

Growing Demand for AI Health Advice Raises the Stakes

OpenAI launched ChatGPT Health after reporting that more than 40 million people worldwide were already using the company's chatbot daily for health advice [1]. The company's research found that seven in ten healthcare-related conversations occurred outside normal clinic hours, and that an average of more than 580,000 healthcare inquiries in the U.S. originated from "hospital deserts" (areas more than a 30-minute drive from a general medical or children's hospital) [1].

Despite this massive uptake, ChatGPT Health remains on a waitlist with limited access as OpenAI continues development [1][2]. An OpenAI spokesperson characterized the tool as a work in progress, with safety updates and improvements still forthcoming to enhance how the chatbot handles sensitive situations [1]. The company argued that the study evaluates immediate triage decisions in a controlled setting, whereas real-world scenarios typically involve follow-up questions that can change risk assessment [1].

Regulation and Real-World Implications

As users increasingly turn to AI healthcare tools for medical guidance, the technology continues expanding into the healthcare industry amid a friendly regulatory environment. AI tools can now renew prescriptions in Utah, and FDA Commissioner Marty Makary has stated that some devices and software can provide health information without FDA regulation [1]. However, this regulatory flexibility raises questions about oversight as AI systems demonstrate significant gaps in recognizing critical symptoms.

"Before you roll something like this out to make life-affecting decisions, you need to rigorously test it," Dr. John Mafi of UCLA Health emphasized

2

. While AI healthcare has potential for people living far from medical facilities or facing long appointment wait times, researchers remain unambiguous: in a real emergency, call a professional for help rather than consulting a chatbot first

2

. The study underscores that AI systems designed to provide health advice must undergo rigorous testing before deployment, particularly when life-threatening conditions hang in the balance.

TheOutpost.ai

© 2026 Triveous Technologies Private Limited