3 Sources
[1]
OpenAI improves health responses for free ChatGPT users
OpenAI announced that GPT-5.5 Instant, the default model for free ChatGPT users, now performs comparably to its frontier Thinking models on health-related questions, according to the company's evaluations. This update arrives amid growing scrutiny of AI-generated health answers, as highlighted by a recent Guardian investigation revealing inaccuracies in some Google AI Overviews, which led to the removal of certain Google AI features. OpenAI claims GPT-5.5 Instant outperforms its predecessor, GPT-5.3 Instant, in evaluations conducted by its internal benchmarks, HealthBench and HealthBench Professional. The company reported a 71% decline in the rate of health responses flagged for factuality issues over two months, based on monitoring of live traffic. In a separate comparison, OpenAI had physicians write responses to representative health queries. A distinct panel of doctors rated GPT-5.5 Instant's responses higher than those crafted by physicians in measures of accuracy, communication, and completeness across a pool of 3,500 reviewed interactions. The company stated that the new model displayed fewer failure modes than previous versions and physicians, with reduced instances of missing important red flags or failing to request additional user context. HealthBench, the benchmark used by OpenAI, was developed with input from over 260 physicians across 60 countries, who have assessed over 700,000 example responses. The figure of 260 physicians has been consistent since the launch of ChatGPT Health in January; however, results from the evaluations have not been made available for external review. OpenAI indicated that more than 230 million users inquire about health and wellness topics through ChatGPT weekly, highlighting this use case as one of the most significant for the chatbot. Health topics are also prioritized in OpenAI's policies, which prohibit running advertisements in health-related discussions. According to an Ahrefs analysis, medical queries receive the highest exposure rate for AI-generated answers, indicating a potential shift in demand towards ChatGPT's free tier. OpenAI's claims about accuracy are based on in-house evaluations, creating challenges for external validation. The future implications of these developments, particularly regarding how they may affect citations and the responsibility of practitioners for verifying AI responses, remain unclear.
[2]
OpenAI Boosts ChatGPT's Medical Knowledge to Deliver Better Health Answers
OpenAI has improved ChatGPT's ability to answer health questions with clearer and more reliable information. The update could change how people search for medical advice online and prepare for doctor visits. OpenAI has made ChatGPT better at answering health-related questions. The company mentioned that in recent times, many users have been asking the AI tool health-related questions. However, the previous version of the AI tool wasn't that effective for medical queries, so the company is now bringing an advanced version to provide clearer and more useful responses when users ask about symptoms, illnesses, treatments, and everyday health concerns. OpenAI hopes the changes will help people understand medical topics more easily before speaking with a healthcare professional. The AI giant has also mentioned that ChatGPT is not meant to replace doctors. Instead, it is designed to help users learn more about health issues and make informed decisions.
[3]
OpenAI improves health intelligence in ChatGPT, here is how
OpenAI noted that GPT-5.5 Instant now performs at a level comparable to some of its most advanced reasoning models on difficult health evaluations. OpenAI has announced major improvements to ChatGPT's health intelligence. The company says more than 230 million people use ChatGPT every week for health-related topics, including understanding symptoms, checking lab reports, preparing for doctor visits, managing insurance and building healthier habits. According to OpenAI, the GPT-5.5 Instant model shows big progress in handling health conversations. The company says the model is now better at identifying situations that may require urgent medical attention, asking follow-up questions when more information is needed, and explaining complex medical topics in simpler language. OpenAI noted that GPT-5.5 Instant now performs at a level comparable to some of its most advanced reasoning models on difficult health evaluations. Since the model is available to free ChatGPT users, more people can benefit from the improvements. Also read: Anthropic CEO Dario Amodei takes aim at OpenAI, says we'll see who wins The company says good health responses are not only about providing accurate information. They also require good judgment. This means knowing when there's not enough information, communicating uncertainty clearly and helping users understand when they need to seek professional medical care. To improve these abilities, OpenAI works with a large network of doctors from around the world. More than 260 physicians across 60 countries, 49 languages, and 26 medical specialties help review ChatGPT's responses and identify areas where it can improve. "Their feedback becomes rubrics and evaluation criteria that help researchers measure whether responses are accurate, safe, clear, complete, appropriately cautious, and useful in real-world health situations. This gives us a clearer way to see where models are getting better and where they still need work," OpenAI explained. Also read: Will your next iPhone cost more? Here is what Tim Cook has to say about it The company also compared ChatGPT's responses with answers written by physicians. In many cases, GPT-5.5 Instant showed fewer common mistakes than older AI models. The AI firm stated, "GPT‑5.5 Instant had fewer instances of not tailoring to local healthcare context, missing red flags or referral to care, or failing to seek additional context from the user when needed than both older models and physicians." "Improving human health will be one of the most personal, tangible impacts of AGI," OpenAI said, adding that the company's goal is to make ChatGPT "more accurate, more useful, and more impactful."
Share
Copy Link
OpenAI announced that GPT-5.5 Instant now delivers health responses comparable to its advanced reasoning models, with a 71% drop in factuality issues. The update comes as 230 million users turn to ChatGPT weekly for health-related queries, from understanding symptoms to preparing for doctor visits, raising questions about AI's growing role in medical information access.
OpenAI announced that GPT-5.5 Instant, the default AI model for health queries available to free ChatGPT users, now performs at levels comparable to its frontier Thinking models on health-related questions
1
. The update addresses mounting concerns about AI-generated health information accuracy, particularly following a Guardian investigation that exposed factual inaccuracies in Google AI Overviews and prompted removal of certain features1
. More than 230 million users now turn to ChatGPT weekly for health and wellness topics, making it one of the chatbot's most significant use cases1
3
. Users rely on the platform for understanding symptoms, checking lab reports, preparing for doctor visits, managing insurance, and building healthier habits3
.
Source: Analytics Insight
OpenAI claims GPT-5.5 Instant outperforms its predecessor, GPT-5.3 Instant, based on internal benchmarks including HealthBench and HealthBench Professional
1
. The company reported a 71% decline in the rate of health responses flagged for factuality issues over two months, based on monitoring of live traffic1
. The model demonstrates improved health intelligence capabilities across multiple dimensions, including better identification of urgent medical situations, asking follow-up questions when additional information is needed, and explaining complex medical topics in simpler language3
. OpenAI emphasized that delivering reliable responses to health-related questions requires not just accurate information but good judgment—knowing when there's insufficient information, communicating uncertainty clearly, and helping users understand when to seek professional medical care3
.In a separate physician-led comparison, OpenAI had doctors write responses to representative health queries, then had a distinct panel of healthcare professionals rate the outputs
1
. The evaluation revealed that GPT-5.5 Instant's responses scored higher than those crafted by physicians across measures of accuracy, communication, and completeness in a pool of 3,500 reviewed interactions1
. The model displayed fewer failure modes than previous versions and physicians, with reduced instances of missing important red flags or failing to request additional user context1
. OpenAI also noted that GPT-5.5 Instant had fewer instances of not tailoring to local healthcare context compared to both older models and physicians3
.Related Stories
HealthBench, the benchmark used by OpenAI, was developed with input from more than 260 physicians across 60 countries, 49 languages, and 26 medical specialties who have assessed over 700,000 example responses
1
3
. Their feedback becomes rubrics and evaluation criteria that help researchers measure whether responses are accurate, safe, clear, complete, appropriately cautious, and useful in real-world health situations3
. However, the figure of 260 physicians has remained consistent since ChatGPT Health launched in January, and results from the evaluations have not been made available for external review1
. This creates challenges for external validation of OpenAI's accuracy claims, which remain based on in-house evaluations1
.OpenAI clarified that ChatGPT is not meant to replace doctors but is designed to help users learn more about health issues and make informed decisions before speaking with a healthcare professional
2
. According to an Ahrefs analysis, medical queries receive the highest exposure rate for AI-generated answers, indicating a potential shift in demand towards ChatGPT's free tier1
. Health topics are prioritized in OpenAI's policies, which prohibit running advertisements in health-related discussions1
. The company stated that "improving human health will be one of the most personal, tangible impacts of AGI," with the goal of making ChatGPT "more accurate, more useful, and more impactful"3
. The future implications regarding how these developments may affect citations and the responsibility of practitioners for verifying AI responses remain unclear1
.Summarized by
Navi
[1]
[2]
05 Aug 2025•Technology

05 Jan 2026•Technology

23 Apr 2026•Technology

1
Policy and Regulation

2
Policy and Regulation

3
Business and Economy
