11 Sources
[1]
Dr. Google" had its issues. Can ChatGPT Health do better?
For the past two decades, there's been a clear first step for anyone who starts experiencing new medical symptoms: Look them up online. The practice was so common that it gained the pejorative moniker "Dr. Google." But times are changing, and many medical-information seekers are now using LLMs. According to OpenAI, 230 million people ask ChatGPT health-related queries each week. That's the context around the launch of OpenAI's new ChatGPT Health product, which debuted earlier this month. It landed at an inauspicious time: Two days earlier, the news website SFGate had broken the story of Sam Nelson, a teenager who died of an overdose last year after extensive conversations with ChatGPT about how best to combine various drugs. In the wake of both pieces of news, multiple journalists questioned the wisdom of relying for medical advice on a tool that could cause such extreme harm. Though ChatGPT Health lives in a separate sidebar tab from the rest of ChatGPT, it isn't a new model. It's more like a wrapper that provides one of OpenAI's preexisting models with guidance and tools it can use to provide health advice -- including some that allow it to access a user's electronic medical records and fitness app data, if granted permission. There's no doubt that ChatGPT and other large language models can make medical mistakes, and OpenAI emphasizes that ChatGPT Health is intended as an additional support, rather than a replacement for one's doctor. But when doctors are unavailable or unable to help, people will turn to alternatives. Some doctors see LLMs as a boon for medical literacy. The average patient might struggle to navigate the vast landscape of online medical information -- and, in particular, to distinguish high-quality sources from polished but factually dubious websites -- but LLMs can do that job for them, at least in theory. Treating patients who had searched for their symptoms on Google required "a lot of attacking patient anxiety [and] reducing misinformation," says Marc Succi, an associate professor at Harvard Medical School and a practicing radiologist. But now, he says, "you see patients with a college education, a high school education, asking questions at the level of something an early med student might ask." The release of ChatGPT Health, and Anthropic's subsequent announcement of new health integrations for Claude, indicate that the AI giants are increasingly willing to acknowledge and encourage health-related uses of their models. Such uses certainly come with risks, given LLMs' well-documented tendencies to agree with users and make up information rather than admit ignorance. But those risks also have to be weighed against potential benefits. There's an analogy here to autonomous vehicles: When policymakers consider whether to allow Waymo in their city, the key metric is not whether its cars are ever involved in accidents but whether they cause less harm than the status quo of relying on human drivers. If Dr. ChatGPT is an improvement over Dr. Google -- and early evidence suggests it may be -- it could potentially lessen the enormous burden of medical misinformation and unnecessary health anxiety that the internet has created. Pinning down the effectiveness of a chatbot such as ChatGPT or Claude for consumer health, however, is tricky. "It's exceedingly difficult to evaluate an open-ended chatbot," says Danielle Bitterman, the clinical lead for data science and AI at the Mass General Brigham health-care system. 
Large language models score well on medical licensing examinations, but those exams use multiple-choice questions that don't reflect how people use chatbots to look up medical information. Sirisha Rambhatla, an assistant professor of management science and engineering at the University of Waterloo, attempted to close that gap by evaluating how GPT-4o responded to licensing exam questions when it did not have access to a list of possible answers. Medical experts who evaluated the responses scored only about half of them as entirely correct. But multiple-choice exam questions are designed to be tricky enough that the answer options don't give them entirely away, and they're still a pretty distant approximation for the sort of thing that a user would type into ChatGPT. A different study, which tested GPT-4o on more realistic prompts submitted by human volunteers, found that it answered medical questions correctly about 85% of the time. When I spoke with Amulya Yadav, an associate professor at Pennsylvania State University who runs the Responsible AI for Social Emancipation Lab and led the study, he made it clear that he wasn't personally a fan of patient-facing medical LLMs. But he freely admits that, technically speaking, they seem up to the task -- after all, he says, human doctors misdiagnose patients 10% to 15% of the time. "If I look at it dispassionately, it seems that the world is gonna change, whether I like it or not," he says. For people seeking medical information online, Yadav says, LLMs do seem to be a better choice than Google. Succi, the radiologist, also concluded that LLMs can be a better alternative to web search when he compared GPT-4's responses to questions about common chronic medical conditions with the information presented in Google's knowledge panel, the information box that sometimes appears on the right side of the search results. Since Yadav's and Succi's studies appeared online, in the first half of 2025, OpenAI has released multiple new versions of GPT, and it's reasonable to expect that GPT-5.2 would perform even better than its predecessors. But the studies do have important limitations: They focus on straightforward, factual questions, and they examine only brief interactions between users and chatbots or web search tools. Some of the weaknesses of LLMs -- most notably their sycophancy and tendency to hallucinate -- might be more likely to rear their heads in more extensive conversations and with people who are dealing with more complex problems. Reeva Lederman, a professor at the University of Melbourne who studies technology and health, notes that patients who don't like the diagnosis or treatment recommendations that they receive from a doctor might seek out another opinion from an LLM -- and the LLM, if it's sycophantic, might encourage them to reject their doctor's advice. Some studies have found that LLMs will hallucinate and exhibit sycophancy in response to health-related prompts. For example, one study showed that GPT-4 and GPT-4o will happily accept and run with incorrect drug information included in a user's question. In another, GPT-4o frequently concocted definitions for fake syndromes and lab tests mentioned in the user's prompt. Given the abundance of medically dubious diagnoses and treatments floating around the internet, these patterns of LLM behavior could contribute to the spread of medical misinformation, particularly if people see LLMs as trustworthy. 
OpenAI has reported that the GPT-5 series of models is markedly less sycophantic and prone to hallucination than its predecessors, so the results of these studies might not apply to ChatGPT Health. The company also evaluated the model that powers ChatGPT Health on its responses to health-specific questions, using its publicly available HealthBench benchmark. HealthBench rewards models that express uncertainty when appropriate, recommend that users seek medical attention when necessary, and refrain from causing users unnecessary stress by telling them their condition is more serious than it truly is. It's reasonable to assume that the model underlying ChatGPT Health exhibited those behaviors in testing, though Bitterman notes that some of the prompts in HealthBench were generated by LLMs, not users, which could limit how well the benchmark translates into the real world. An LLM that avoids alarmism seems like a clear improvement over systems that have people convincing themselves they have cancer after a few minutes of browsing. And as large language models, and the products built around them, continue to develop, whatever advantage Dr. ChatGPT has over Dr. Google will likely grow. The introduction of ChatGPT Health is certainly a move in that direction: By looking through your medical records, ChatGPT can potentially gain far more context about your specific health situation than could be included in any Google search, although numerous experts have cautioned against giving ChatGPT that access for privacy reasons. Even if ChatGPT Health and other new tools do represent a meaningful improvement over Google searches, they could still conceivably have a negative effect on health overall. Much as automated vehicles, even if they are safer than human-driven cars, might still prove a net negative if they encourage people to use public transit less, LLMs could undermine users' health if they induce people to rely on the internet instead of human doctors, even if they do increase the quality of health information available online. Lederman says that this outcome is plausible. In her research, she has found that members of online communities centered on health tend to put their trust in users who express themselves well, regardless of the validity of the information they are sharing. Because ChatGPT communicates like an articulate person, some people might trust it too much, potentially to the exclusion of their doctor. But LLMs are certainly no replacement for a human doctor -- at least not yet.
[2]
Giving your healthcare info to a chatbot is, unsurprisingly, a terrible idea
Every week, more than 230 million people ask ChatGPT for health and wellness advice, according to OpenAI. The company says that many see the chatbot as an "ally" to help navigate the maze of insurance, file paperwork, and become better self-advocates. In exchange, it hopes you will trust its chatbot with details about your diagnoses, medications, test results, and other private medical information. But while talking to a chatbot may be starting to feel a bit like the doctor's office, it isn't one. Tech companies aren't bound by the same obligations as medical providers. Experts tell The Verge it would be wise to carefully consider whether you want to hand over your records. Health and wellness is swiftly emerging as a key battleground for AI labs and a major test of how willing users are to welcome these systems into their lives. This month, two of the industry's biggest players made overt pushes into medicine. OpenAI released ChatGPT Health, a dedicated tab within ChatGPT designed for users to ask health-related questions in what it says is a more secure and personalized environment. Anthropic introduced Claude for Healthcare, a "HIPAA-ready" product it says can be used by hospitals, health providers, and consumers. (Notably absent is Google, whose Gemini chatbot is one of the world's most competent and widely used AI tools, though the company did announce an update to its MedGemma medical AI model for developers.) OpenAI actively encourages users to share sensitive information like medical records, lab results, and health and wellness data from apps like Apple Health, Peloton, Weight Watchers, and MyFitnessPal with ChatGPT Health in exchange for deeper insights. It explicitly states that users' health data will be kept confidential and won't be used to train AI models, and that steps have been taken to keep data secure and private. OpenAI says ChatGPT Health conversations will also be held in a separate part of the app, with users able to view or delete Health "memories" at any time. OpenAI's assurances that it will keep users' sensitive data safe have been helped in no small way by the company launching an identical-sounding product with tighter security protocols at almost the same time as ChatGPT Health. The tool, called ChatGPT for Healthcare, is part of a broader range of products sold to support businesses, hospitals, and clinicians working with patients directly. OpenAI's suggested uses include streamlining administrative work like drafting clinical letters and discharge summaries and helping physicians collate the latest medical evidence to improve patient care. Similar to other enterprise-grade products sold by the company, there are greater protections in place than those offered to general consumers, especially free users, and OpenAI says the products are designed to comply with the privacy obligations required of the medical sector. Given the similar names and launch dates -- ChatGPT for Healthcare was announced the day after ChatGPT Health -- it is all too easy to confuse the two and presume the consumer-facing product has the same level of protection as the more clinically oriented one. Numerous people I spoke to when reporting this story made exactly that mistake. Whichever assurance we take at face value, however, it is far from watertight. Users of tools like ChatGPT Health often have little safeguarding against breaches or unauthorized use beyond what's in the terms of use and privacy policies, experts tell The Verge.
As most states haven't enacted comprehensive privacy laws -- and there isn't a comprehensive federal privacy law -- data protection for AI tools like ChatGPT Health "largely depends on what companies promise in their privacy policies and terms of use," says Sara Gerke, a law professor at the University of Illinois Urbana-Champaign. Even if you trust a company's vow to safeguard your data -- OpenAI says it encrypts Health data by default -- it might just change its mind. "While ChatGPT does state in their current terms of use that they will keep this data confidential and not use them to train their models, you are not protected by law, and it is allowed to change terms of use over time," explains Hannah van Kolfschooten, a researcher in digital health law at the University of Basel in Switzerland. "You will have to trust that ChatGPT does not do so." Carmel Shachar, an assistant clinical professor of law at Harvard Law School, concurs: "There's very limited protection. Some of it is their word, but they could always go back and change their privacy practices." Assurances that a product is compliant with data protection laws governing the healthcare sector, like the Health Insurance Portability and Accountability Act, or HIPAA, shouldn't offer much comfort either, Shachar says. While great as a guide, there's little at stake if a company that voluntarily complies fails to do so, she explains. Voluntarily complying isn't the same as being bound. "The value of HIPAA is that if you mess up, there's enforcement." It's more than just privacy. There's a reason why medicine is a heavily regulated field -- errors can be dangerous, even lethal. There is no shortage of examples showing chatbots confidently spouting false or misleading health information, such as when a man developed a rare condition after he asked ChatGPT about removing salt from his diet and the chatbot suggested he replace salt with sodium bromide, which was historically used as a sedative. Or when Google's AI Overviews wrongly advised people with pancreatic cancer to avoid high-fat foods -- the exact opposite of what they should be doing. To address this, OpenAI explicitly states that its consumer-facing tool is designed to be used in close collaboration with physicians and is not intended for diagnosis and treatment. Tools designed for diagnosis and treatment are designated as medical devices and are subject to much stricter regulations, such as clinical trials to prove they work and safety monitoring once deployed. Although OpenAI is fully and openly aware that one of the major use cases of ChatGPT is supporting users' health and well-being -- recall the 230 million people asking for advice each week -- the company's assertion that it is not intended as a medical device carries a lot of weight with regulators, Gerke explains. "The manufacturer's stated intended use is a key factor in the medical device classification," she says, meaning companies that say tools aren't for medical use will largely escape oversight even if products are being used for medical purposes. It underscores the regulatory challenges that technologies like chatbots are posing. For now, at least, this disclaimer keeps ChatGPT Health out of the purview of regulators like the Food and Drug Administration, but van Kolfschooten says it's perfectly reasonable to ask whether or not tools like this should really be classified as medical devices and regulated as such.
It's important to look at how it's being used, as well as what the company is saying, she explains. When announcing the product, OpenAI suggested people could use ChatGPT Health to interpret lab results, track health behavior, or help them reason through treatment decisions. If a product is doing this, one could reasonably argue it might fall under the US definition of a medical device, she says, suggesting that Europe's stronger regulatory framework may be the reason why it's not available in the region yet. Despite claiming ChatGPT is not to be used for diagnosis or treatment, OpenAI has gone to great lengths to prove that ChatGPT is a pretty capable medic and to encourage users to tap it for health queries. The company highlighted health as a major use case when launching GPT-5, and CEO Sam Altman even invited a cancer patient and her husband on stage to discuss how the tool helped her make sense of the diagnosis. The company says it assesses ChatGPT's medical prowess against HealthBench, a benchmark it developed itself with more than 260 physicians across dozens of specialties that "tests how well AI models perform in realistic health scenarios," though critics note it is not very transparent. Other studies -- often small, limited, or run by the company itself -- hint at ChatGPT's medical potential too, showing that in some cases it can pass medical licensing exams, communicate better with patients, and outperform doctors at diagnosing illness, as well as help doctors make fewer mistakes when used as a tool. OpenAI's efforts to present ChatGPT Health as an authoritative source of health information could also undermine any disclaimers it includes telling users not to utilize it for medical purposes, van Kolfschooten says. "When a system feels personalized and has this aura of authority, medical disclaimers will not necessarily challenge people's trust in the system." Companies like OpenAI and Anthropic are hoping they have that trust as they jostle for prominence in what they see as the next big market for AI. The figures showing how many people are already using AI chatbots for health suggest they may be onto something, and given the stark health inequalities and difficulties many face in accessing even basic care, this could be a good thing. At least, it could be, if that trust is well-placed. We trust healthcare providers with our private information because the profession has earned that trust. It's not yet clear whether an industry with a reputation for moving fast and breaking things has earned the same.
[3]
This might be the best way to get medical advice from ChatGPT - and it's free to use
ZDNET's key takeaways
* OpenAI announced a Health mode for ChatGPT.
* You can connect health apps and upload your personal files.
* The advice isn't intended to replace actual medical care.
OpenAI wants to make sure that any medical information you get from ChatGPT is as accurate as possible. Approximately 40 million people a day rely on ChatGPT for medical questions. In response, OpenAI announced ChatGPT Health, a "dedicated experience" in ChatGPT that's centered around health and wellness. The feature will enable you to combine your medical records and wearable data with the AI's intelligence, "to ground conversations in your own health information," according to OpenAI. You can use it to help you prepare for your next appointment, plan questions to ask your doctor, receive customized diet plans or workout routines, and more. OpenAI notes that the feature is not intended to replace medical care, nor is it designed for diagnosis or treatment. Instead, the goal is for you to ask everyday questions and understand patterns related to your whole medical profile.
What ChatGPT can see about your health
According to OpenAI, one of the biggest challenges people face when seeking medical guidance online is that information is scattered across provider-specific portals, wearable apps, and personal notes. As a result, it can be hard to get a good overview. This leads people to turn to ChatGPT, which often only gets a partial view of the picture, too. However, now you can provide ChatGPT with data from your personal medical records and from any health-tracking apps you use. OpenAI said that you can connect a variety of sources:
* Medical Records for lab results, visit summaries, and clinical history
* Apple Health for health and fitness data, including movement, sleep, and activity patterns
* Function for lab test insights, nutrition ideas, and taking action on your health
* MyFitnessPal for nutrition advice, macros, and recipes
* Weight Watchers for GLP-1 personalized meal ideas, recipes, and food guidance
* AllTrails to help you find your next hike
* Instacart to turn meal plans into shoppable lists
* Peloton for suggested workout classes or guided meditations
Keeping your info secure
OpenAI acknowledged that health information is "deeply personal," so it's adding extra protections to ChatGPT Health. The company said, "Conversations and files across ChatGPT are encrypted by default at rest and in transit as part of our core security architecture. . . . Health builds on this foundation with additional, layered protections -- including purpose-built encryption and isolation -- to keep health conversations protected and compartmentalized." The company also said that Health conversations will not be used for its foundation model training. Health is its own memory and its own space in ChatGPT, and your Health conversations, connected apps, and files won't mix with your other chats. However, some information from your regular ChatGPT sessions may surface in your Health space when applicable.
How to try ChatGPT Health
To try out ChatGPT Health, you'll need to join the waitlist. OpenAI said it's providing access to a small group of early users to "continue refining the experience."
As it makes improvements, the company will expand the feature to all users across the web and iOS over the next few weeks. When you have access, select Health from the sidebar menu in ChatGPT to get started.
[4]
ChatGPT Health promises to personalise health information. It comes with many risks
Many of us already use generative artificial intelligence (AI) tools such as ChatGPT for health advice. They give quick, confident and personalised answers, and the experience can feel more private than speaking to a human. Now, several AI companies have unveiled dedicated "health and wellness" tools. The most prominent is ChatGPT Health, launched by OpenAI earlier this month. ChatGPT Health promises to generate more personalised answers, by allowing users to link medical records and wellness apps, upload diagnostic imaging and interpret test results. But how does it really work? And is it safe? Most of what we know about this new tool comes from the company that launched it, and questions remain about how ChatGPT Health would work in Australia. Currently, users in Australia can sign up for a waitlist to request access. Let's take a look.
AI health advice is booming
Data from 2024 shows 46% of Australians had recently used an AI tool. Health queries are popular. According to OpenAI, one in four regular ChatGPT users worldwide submit a health-related prompt each week. Our 2024 study estimated almost one in ten Australians had asked ChatGPT a health query in the previous six months. This was more common for groups that face challenges finding accessible health information, including:
* people born in a non-English speaking country
* those who spoke another language at home
* people with limited health literacy.
Among those who hadn't recently used ChatGPT for health, 39% were considering using it soon.
How accurate is the advice?
Independent research consistently shows generative AI tools do sometimes give unsafe health advice, even when they have access to a medical record. There are several high-profile examples of AI tools giving unsafe health advice, including when ChatGPT allegedly encouraged suicidal thoughts. Recently, Google removed several AI Overviews on health topics - summaries which appear at the top of search results - after a Guardian investigation found the advice about blood test results was dangerous and misleading. This was just one health prompt they studied. There could be much more advice the AI is getting wrong that we don't know about yet.
So, what's new about ChatGPT Health?
The AI tool has several new features aimed at personalising its answers. According to OpenAI, users will be able to connect their ChatGPT Health account with medical records and smartphone apps such as MyFitnessPal. This would allow the tool to use personal data about diagnoses, blood tests, and monitoring, as well as relevant context from the user's general ChatGPT conversations. OpenAI emphasises information doesn't flow the other way: conversations in ChatGPT Health are kept separate from general ChatGPT, with stronger security and privacy. The company also says ChatGPT Health data won't be used to train foundation models. OpenAI says it has worked with more than 260 clinicians in 60 countries (including Australia), to give feedback on and improve the quality of ChatGPT Health outputs. In theory, all of this means ChatGPT Health could give more personalised answers compared to general ChatGPT, with greater privacy.
But are there still risks?
Yes. OpenAI openly states ChatGPT Health is not designed to replace medical care and is not intended for diagnosis or treatment. It can still make mistakes. Even if ChatGPT Health has access to your health data, there is very little information about how accurate and safe the tool is, and how well it has summarised the sources it has used.
The tool has not been independently tested. It's also unclear whether ChatGPT Health would be considered a medical device and regulated as one in Australia. The tool's responses may not reflect Australian clinical guidelines, our health systems and services, and may not meet the needs of our priority populations. These include First Nations people, those from culturally and linguistically diverse backgrounds, people with disability and chronic conditions, and older adults. We don't know yet if ChatGPT Health will meet data privacy and security standards we typically expect for medical records in Australia. Currently, many Australians' medical records are incomplete due to patchy uptake of MyHealthRecord, meaning even if you upload your medical record, the AI may not have the full picture of your medical history. For now, OpenAI says medical record and some app integrations are only available in the United States.
So, what's the best way to use ChatGPT for health questions?
In our research, we have worked with community members to create short educational materials that help people think about the risks that come with relying on AI for health advice, and to consider other options.
Higher risk
Health questions that would usually require clinical expertise to answer carry more risk of serious consequences. This could include:
* finding out what symptoms mean
* asking for advice about treatment
* interpreting test results.
AI responses can often seem sensible - and increasingly personalised - but that doesn't necessarily mean they are correct or safe. So, for these higher-risk questions, the best option is always to speak with a health professional.
Lower risk
Other health questions are less risky. These tend to be more general, such as:
* learning about a health condition or treatment option
* understanding medical terms
* brainstorming what questions to ask during a medical appointment.
Ideally, AI is just one of the information sources you use.
Where else can I get free advice?
In Australia we have a free 24/7 national phone service, where anyone can speak with a registered nurse about their symptoms: 1800 MEDICARE (1800 633 422). Symptom Checker, operated by healthdirect, is another publicly funded, evidence-based tool that will help you understand your next steps and connect you with local services.
AI tools are here to stay
For now, we need clear, reliable, independent, and publicly available information about how well the current tools work and the limits of what they can do. This information must be kept up-to-date as the tools evolve. Purpose-built AI health tools could transform how people gain knowledge, skills and confidence to manage their health. But these need to be designed with communities and clinicians, and prioritise accuracy, equity and transparency. It is also essential to equip our diverse communities with the knowledge and skills to navigate this new technology safely.
[5]
Column | I let ChatGPT analyze a decade of my Apple Watch data. Then I called my doctor.
ChatGPT now says it can answer personal questions about your health using data from your fitness tracker and medical records. The new ChatGPT Health claims that it can help you "understand patterns over time -- not just moments of illness -- so you can feel more informed." Like many people who strap on an Apple Watch every day, I've long wondered what a decade of that data might reveal about me. So I joined a brief wait list and gave ChatGPT access to the 29 million steps and 6 million heartbeat measurements stored in my Apple Health app. Then I asked the bot to grade my cardiac health. It gave me an F. I freaked out and went for a run. Then I sent ChatGPT's report to my actual doctor. Am I an F? "No," my doctor said. In fact, I'm at such low risk for a heart attack that my insurance probably wouldn't even pay for an extra cardio fitness test to prove the artificial intelligence wrong. I also showed the results to cardiologist Eric Topol of the Scripps Research Institute, an expert on both longevity and the potential of AI in medicine. "It's baseless," he said. "This is not ready for any medical advice." AI has huge potential to unlock medical insights and widen access to care. But when it comes to your fitness tracker and some health records, the new Dr. ChatGPT seems to be winging it. That fits a disturbing trend: AI companies launching products that are broken, fail to deliver or are even dangerous. It should go without saying that people's health actually matters. Any product -- even one labeled "beta" -- that claims to provide personal health insights shouldn't be this clueless. A few days after ChatGPT Health arrived, AI rival Anthropic launched Claude for Healthcare, which similarly promises to help people "detect patterns across fitness and health metrics." Anyone with a paid account can import Apple Health and Android Health Connect data into the chatbot. Claude graded my cardiac health a C, relying on some of the same analysis that Topol found questionable. OpenAI and Anthropic say their bots can't replace doctors or provide diagnosis and include contextual disclaimers. Yet both products willingly provided detailed analysis of my cardiac health. (The Washington Post has a content partnership with OpenAI.) The companies also say their health bots are in early testing phases, but didn't specify how they planned to improve their ability to provide personal analysis of body data. Apple says it didn't work directly with either AI company on these products. The more I used ChatGPT Health, the worse things got.
How ChatGPT misread my charts
The whole premise of ChatGPT Health involves a leap of faith: handing over your most intimate health information to a data-gobbling AI company. People are right to be cautious about AI privacy. OpenAI says its Health mode takes extra steps to protect your privacy: it won't use your data to train its AI or mingle it with other chats, and it encrypts the data. Still, it's essentially a pinkie promise: ChatGPT isn't a health care provider, so it isn't covered by the federal health privacy law known as HIPAA. After I connected ChatGPT to Apple Health, I asked the bot to share what health recommendations it had for me, and to call out any red flags.
It said I should talk with my doctor about a "notable increase" in my resting heart rate, but congratulated me on my sleep and activity levels. Then I asked the bot to "give me simple scores for heart health and longevity." That's when it handed me an F. I asked ChatGPT: "You really think my heart health and longevity deserve failing grades?" It responded: "Short answer: no -- I don't think you're a lost cause, and I'm sorry the raw letter grades felt harsh. They were a blunt, heuristic readout from device-only data." It said it could give me a more useful score if I also connected the medical records stored by my doctor's office. So I did that, and asked it again to "give me a simple score (A-F) for my cardiovascular health over the last decade." The grade ticked up to D. Topol was appalled when I showed him the bot's analysis. Despite having access to my weight, blood pressure and cholesterol, ChatGPT based much of its negative assessment on an Apple Watch measurement known as VO2 max, the maximum amount of oxygen your body can consume during exercise. Apple says it collects an "estimate" of VO2 max, but the real thing requires a treadmill and a mask. Apple says its cardio fitness measures have been validated, but independent researchers have found those estimates can run low -- by an average of 13 percent. ChatGPT's evaluation also emphasized an Apple Watch metric called heart-rate variability, which Topol said has lots of fuzziness. "You sure don't want to go with that as your main driver," he said. When I asked ChatGPT to chart my heart rate over the decade, I spotted another problem: There were big swings in my resting heart rate whenever I got a new Apple Watch, suggesting the devices may not have been tracking the same way. (Apple says it keeps making improvements to those measurements.) But once again, ChatGPT treated a fuzzy data point like a clear health signal. Claude's C grade for me was less panic-inducing, but it also wasn't sufficiently critical about the VO2 max data (which it graded a D+). Anthropic says there's no separate health-tuned version of Claude, and it can only provide general context for health data, not personalized clinical analysis. My real doctor said that to do a deep dive on my cardiac health, we should check back in on my lipids, so he ordered another blood test that included Lipoprotein (a), a risk factor for heart disease. Neither ChatGPT Health nor Claude brought up the idea of doing that test.
An erratic analysis
Both AI companies say their health products are not designed to provide clinical assessments. Rather, they're to help you prepare for a visit to a doctor or get advice on how to approach your workout routine. I didn't ask their bots if I have heart disease. I asked them a pretty obvious question after uploading that much personal health data: How am I doing? What's more, if ChatGPT and Claude can't accurately grade your heart health, then why didn't the bots say, "Sorry, I can't do that?" The bots did decline to estimate at what age I might die. There was another problem I discovered over time: When I tried asking the same heart longevity-grade question again, suddenly my score went up to a C. I asked again and again, watching the score swing between an F and a B. Across conversations, ChatGPT kept forgetting important information about me, including my gender, age and some recent vital signs. It had access to my recent blood tests, but sometimes didn't use them in its analysis. That kind of randomness is "totally unacceptable," Topol said.
"People that do this are going to get really spooked about their health. It could also go the other way and give people who are unhealthy a false sense that everything they're doing is great." OpenAI says it couldn't replicate the wild swings I saw. It says ChatGPT might weigh different connected data sources slightly differently from one conversation to the next as it interprets large health datasets. It also says it's working to make responses more stable before ChatGPT Health becomes available beyond its wait list. "Launching ChatGPT Health with waitlisted access allows us to learn and improve the experience before making it widely available," OpenAI vice president Ashley Alexander said in a statement. When I repeated the same query on Claude, my score varied between a C and B-. Anthropic said chatbots have inherent variation in outputs. Should you trust a bot with your health? I liked using ChatGPT Health to make plots of my Apple Watch data, and to ask more narrow questions such as how my activity level changed after I had kids. OpenAI says more than 230 million users already ask ChatGPT health and wellness questions every week. For those people, a more private way to import information and have chats about their bodies is a welcome improvement. But the question is: Should we be turning to this bot for those answers? OpenAI says it has worked with physicians to improve its health answers. When I've previously tested the quality of ChatGPT's responses to real medical questions with a leading doctor, the results ranged from excellent to potentially dangerous. The problem is ChatGPT typically answers with such confidence it's hard to tell the good results from the bad ones. Chatbot companies might be overselling their ability to answer personalized health questions, but there's little stopping them. Earlier this month, Food and Drug Administration Commissioner Marty Makary said the agency's job is to "get out of the way as a regulator" to promote AI innovation. He drew a red line at AI making "medical or clinical claims" without FDA review, but both ChatGPT and Claude insist they're just providing information. Scientists have worked for years to analyze long-term body data to predict disease. (In 2020, I participated in one such study with the Oura Ring.) What makes this kind of AI work so difficult, Topol told me, is that you have to account for noise and weaknesses in the data and also link it up to people's ultimate health outcomes. To do it right, you need a dedicated AI model that can connect all these layers of data. OpenAI's Alexander said ChatGPT Health was built with custom code that helps it organize and contextualize personal health data. But that's not the same as being trained to extract accurate and useful personal analysis from the complex data stored in Apple Watches and medical charts. Topol expected more. "You'd think they would come up with something much more sophisticated, aligned with practice of medicine and the knowledge base in medicine," Topol said. "Not something like this. This is very disappointing."
[6]
Apple Watch user gave ChatGPT Health his data, with troubling results
Earlier this month, ChatGPT Health launched with integration for Apple Health and several other data providers. But when The Washington Post's technology columnist gave ChatGPT access to his Apple Watch data, it didn't go very well.
ChatGPT Health gave inconsistent, inaccurate interpretations of Apple Health data
Geoffrey A. Fowler writes at The Washington Post: Like many people who strap on an Apple Watch every day, I've long wondered what a decade of that data might reveal about me. So I joined a brief wait list and gave ChatGPT access to the 29 million steps and 6 million heartbeat measurements stored in my Apple Health app. Then I asked the bot to grade my cardiac health. It gave me an F. I freaked out and went for a run. Then I sent ChatGPT's report to my actual doctor. Am I an F? "No," my doctor said. In fact, I'm at such low risk for a heart attack that my insurance probably wouldn't even pay for an extra cardio fitness test to prove the artificial intelligence wrong.
Fowler explains that ChatGPT's grade seemed tied to several misinterpretations of his Apple Watch health data. For example, he says that ChatGPT Health "based much of its negative assessment on...VO2 max" despite Apple saying that its VO2 max readings are simply estimates. They're helpful for tracking trends, but getting truly precise data requires separate equipment. Additionally, resting heart rate changes that happened when Fowler got a new Apple Watch weren't true changes, but rather the result of updates to sensors and measurement tools. That didn't seem factored into ChatGPT's evaluation, however. Inconsistency in responses was another issue: When I tried asking the same heart longevity-grade question again, suddenly my score went up to a C. I asked again and again, watching the score swing between an F and a B. Across conversations, ChatGPT kept forgetting important information about me, including my gender, age and some recent vital signs. It had access to my recent blood tests but sometimes didn't use them in its analysis. For anyone who has used an AI chatbot, these issues probably won't be surprising. But for a product intended to be a source of health knowledge, it's especially concerning.
9to5Mac's Take
Apple is rumored to be working on an AI-powered 'Health+' service for later this year. And this early report on ChatGPT Health makes two things clear:
[7]
ChatGPT's Apple Health Integration Flaws Exposed in New Report
A reporter for The Washington Post has put ChatGPT's new optional Apple Health integration feature to the test by feeding it ten years of their Apple Watch data. The results were not encouraging, to say the least. Earlier this month, OpenAI announced the launch of ChatGPT Health, a dedicated section of ChatGPT where users can ask health-related questions completely separated from their main ChatGPT experience. For more personalized responses, users can connect various health data services such as Apple Health, Function, MyFitnessPal, Weight Watchers, AllTrails, Instacart, and Peloton. ChatGPT Health can also integrate with your medical records, allowing it to analyze your lab results and other aspects of your medical history to inform its answers to your health-related questions. With this in mind, reporter Geoffrey Fowler gave ChatGPT Health access to 29 million steps and 6 million heartbeat measurements from his Apple Health app, and asked the bot to grade his cardiac health. It gave him an F. Feeling understandably alarmed, Fowler asked his actual doctor, who in no uncertain terms dismissed the AI's assessment entirely. His physician said Fowler was at such low risk for heart problems that his insurance likely wouldn't even cover additional testing to disprove the chatbot's findings. Cardiologist Eric Topol of the Scripps Research Institute was likewise unimpressed with the large language model's assessment. He called ChatGPT's analysis "baseless" and said people should ignore its medical advice, as it's clearly not ready for prime time. Perhaps the most troubling finding, though, was ChatGPT's inconsistency. When Fowler asked the same question several times, his score swung wildly between an F and a B. ChatGPT also kept forgetting basic information about him, including his gender and age, despite it having full access to his records. Anthropic's Claude chatbot fared slightly better - though not by much. The LLM graded Fowler's cardiac health a C, but it also failed to properly account for limitations in the Apple Watch data. Both companies say their health tools aren't meant to replace doctors or provide diagnoses. Topol rightly argued that if these bots can't accurately assess health data, then they shouldn't be offering grades at all. Yet nothing appears to be stopping them. The U.S. Food and Drug Administration earlier this month said the agency's job is to "get out of the way as a regulator" to promote innovation. An agency commissioner drew a red line at AI making "medical or clinical claims" without FDA review, but ChatGPT and Claude argue they are just providing information. "People that do this are going to get really spooked about their health," Topol said. "It could also go the other way and give people who are unhealthy a false sense that everything they're doing is great." ChatGPT's Apple Health integration is currently limited to a group of beta users. Responding to the report, OpenAI said it was working to improve the consistency of the chatbot's responses. "Launching ChatGPT Health with waitlisted access allows us to learn and improve the experience before making it widely available," OpenAI VP Ashley Alexander told the publication in a statement.
[8]
Early tests suggest ChatGPT Health's assessment of your fitness data may cause unnecessary panic
Experts say the tool isn't ready to provide reliable personal health insights. Earlier this month, OpenAI introduced a new health-focused space within ChatGPT, pitching it as a safer way for users to ask questions about sensitive topics like medical data, illnesses, and fitness. One of the headline features highlighted at launch was ChatGPT Health's ability to analyze data from apps like Apple Health, MyFitnessPal, and Peloton to surface long-term trends and deliver personalized results. However, a new report suggests OpenAI may have overstated how effective the feature is at drawing reliable insights from that data. According to early tests conducted by The Washington Post's Geoffrey A. Fowler, when ChatGPT Health was given access to a decade's worth of Apple Health data, the chatbot graded the reporter's cardiac health an F. However, after reviewing the assessment, a cardiologist called it "baseless" and said the reporter's actual risk of heart disease was extremely low. Dr. Eric Topol from the Scripps Research Institute offered a blunt assessment of ChatGPT Health's capabilities, saying the tool is not ready to offer medical advice and relied too heavily on unreliable smartwatch metrics. ChatGPT's grade leaned heavily on Apple Watch estimates of VO2 max and heart rate variability, both of which have known limitations and can vary significantly between devices and software builds. Independent research has found Apple Watch VO2 max estimates often run low, yet ChatGPT still treated them as clear indicators of poor health.
ChatGPT Health gave different grades for the same data
The problems did not stop there. When the reporter asked ChatGPT Health to repeat the same grading exercise, the score fluctuated between an F and a B across conversations, with the chatbot sometimes ignoring recent blood test reports it had access to and occasionally forgetting basic details like the reporter's age and gender. Anthropic's Claude for Healthcare, which also debuted earlier this month, showed similar inconsistencies, assigning grades that shifted between a C and a B minus. Both OpenAI and Anthropic have stressed that their tools are not meant to replace doctors and only provide general context. Still, both chatbots delivered confident, highly personalized evaluations of cardiovascular health. This combination of authority and inconsistency could scare healthy users or falsely reassure unhealthy ones. While AI may eventually unlock valuable insights from long-term health data, early testing suggests that feeding years of fitness tracking data into these tools currently creates more confusion than clarity.
[9]
AI chatbot doctors are here. That's a problem
A year ago, Alex P., a writer in his mid-40s, had a calcium score that put him in the "moderate risk" category for heart disease. His doctors prescribed statins and moved on. But something nagged at him: The test appeared to show nearly all the buildup concentrated in one artery -- the left anterior descending, nicknamed the "widowmaker" because blockages there are so often fatal. His doctors told him not to read too much into it. That's not how the test works, they said. So he did what hundreds of millions of people do every week. He asked ChatGPT. The chatbot disagreed with his doctors. A high concentration of calcification in the LAD at his age could indicate serious risk, it told him. Take it literally. After months of pushing multiple physicians, Alex finally got a CT scan. It revealed a 95% blockage, exactly where the original test suggested. He got a stent days later. His doctors called it a fluke. A doctor friend of his told him ChatGPT got lucky. "I might have been saved by a hallucination," said Alex, who asked that his last name be withheld because he hasn't disclosed his cardiac history to everyone in his life. Alex has no idea what the truth is. Either way, he's grateful to be alive to debate the point. OpenAI says more than 230 million people already ask ChatGPT health questions every week. The new product adds guardrails -- conversations won't train the company's models, and health data stays siloed from regular chats -- while expanding what the AI can do with your information. The timing isn't coincidental. Anthropic, OpenAI's closest competitor, announced Claude for Healthcare a few days later, targeting both consumers and the insurance industry's paperwork. OpenAI also revealed it acquired Torch, a startup building "unified medical memory" for AI, for $60 million. The healthcare land grab is on. Both companies built their products with physician input and emphasize that AI is meant to support and not replace professional care. OpenAI says it has worked with more than 260 doctors across 60 countries. Anthropic has added connectors to medical databases to help insurers speed up prior authorization, the bureaucratic back-and-forth that often delays treatment. So far, these tools have helped people like Alex. They've also caused real harm. The same week OpenAI launched ChatGPT Health, Google and Character.AI agreed to settle multiple lawsuits from families whose teenagers died by suicide after forming relationships with AI chatbots. One 14-year-old was messaging with a bot that urged him to "come home" in the moments before he killed himself. OpenAI faces similar litigation. Both companies warn users that chatbots can hallucinate and shouldn't replace professional care -- then build products that can do exactly that. That's the tension at the heart of this product. Chatbots hallucinate. They form inappropriate attachments with vulnerable users. Their creators openly worry they could spiral out of control. And now they want these tools to be your health advisor. For the 25 million Americans without health insurance, a ChatGPT subscription might still be the closest thing to a second opinion they can afford. ChatGPT doesn't get tired. It doesn't rush through appointments or dismiss concerns to stay on schedule. It has, as Alex put it, "unlimited patience and unlimited time."
In a system where the average primary care visit lasts 18 minutes, an AI that answers questions at 2 a.m. fills a genuine gap. But giving people better tools to navigate a broken system doesn't fix the system. ChatGPT can help you prepare questions for a doctor you can't afford to see. It can explain lab results from tests your insurance won't cover. A growing cohort of patients have started treating physicians as gatekeepers to regulated hardware. They snap photos of screens, grab the printouts, then take the real appointment home to their AI of choice. Alex was one of them. He had insurance. He had doctors. What he didn't have was anyone who would take his concerns seriously until a chatbot gave him the confidence to push back. Still, trust only goes so far. Alex plans to keep using AI for health questions. He just won't be consolidating anything. He'll screenshot a blood test and ask Gemini, then rephrase the answer and run it by ChatGPT. He doesn't trust any of these companies to do what's right with his data, so he's not handing any one of them the full picture. "I don't want all my health data in one place," Alex said. "I don't want to create one treasure trove that, once hacked, belongs to the entire world."
[10]
It Was Notorious for Getting Things Wrong. Now It's Assisting Your Doctor.
There's a key thing to keep in mind if you ask a chatbot for medical advice. Asking a general-use chatbot for health help used to seem like a shot in the dark -- just two years ago, a study found that ChatGPT could diagnose only 2 in 10 pediatric cases correctly. Among Google Gemini's early recommendations were eating one small rock a day and using glue to help cheese stick to pizza. Last year, a nutritionist ended up hospitalized after taking ChatGPT's advice to replace salt in his diet with sodium bromide. Now A.I. companies have begun releasing health-specific chatbots for both consumers and health care professionals. This month, OpenAI announced ChatGPT Health, which allows regular people to connect their medical records and health data to A.I. for (theoretically) more accurate responses to their health queries. It also released ChatGPT for Healthcare, a service that is already in use by hospitals across the country. OpenAI isn't the only one -- Anthropic announced its own chatbot, Claude for Healthcare, designed to help doctors with day-to-day tasks like retrieving medical records and to help patients better communicate with their providers.
So how could these chatbots be an improvement over regular old chatbots? "When talking about something designed specifically for health care, it should be trained on health care data," says Torrey Creed, an associate professor of psychiatry researching A.I. at the University of Pennsylvania. This means that a chatbot shouldn't have the option to pull from unreliable sources like social media. The second difference, she says, is ensuring that users' private data isn't sold or used to train models. Chatbots created for the health care sector are required to be HIPAA compliant. Bots that prompt consumers to directly chat with them about symptoms are designed only to connect the dots, and protecting consumer data is a matter of having robust privacy settings.
I spoke to Raina Merchant, the executive director of the Center for Health Care Transformation and Innovation at UPenn, about what patients need to know as they navigate the changing A.I. medical landscape, and how doctors are already applying the tech. Merchant says A.I. has a lot of potential -- but that, for now, it should be used with caution.
How is the health care system currently using these chatbots and A.I.?
It's a really exciting area. At Penn, we have a program called Chart Hero, which can be thought of like a ChatGPT embedded into a patient's health record. It's an A.I. agent I can prompt with specific questions to help find information in a chart or make calculations for risk scores or guidance. Since it's all embedded, I don't have to go look at separate sources. Using it, I can spend more time really talking to patients and have more of that human connection -- because I'm spending less time doing chart digging or synthesizing information from different areas. It's been a real game changer. There's a lot of work in the ambient space, where A.I. can listen after patients have consented and help generate notes. Then there's also a lot of work in messaging interfaces. We have a portal where patients can send questions at any time using A.I. to help identify ways, still with a human in the loop, to be able to accurately answer information.
What does having a human in the loop look like?
Many hospital chatbots are intentionally supervised by humans. What might feel automated is often supported by people behind the scenes. Having a human makes sure that there are some checks and balances.
So a completely consumer-facing product like ChatGPT Health wouldn't have a human in the loop. You can just sit on the couch by yourself and have A.I. answer your health questions. What would you recommend that patients use ChatGPT Health for? What are the limitations?
I think of A.I. chatbots as tools. They are not clinicians. Their goal is to make care easier to access and navigate. They are good at guidance, but not so much judgment. They can help you understand next steps, but I wouldn't use them for making medical decisions. I really like the idea of using it to think through questions to ask your doctor. Going to a medical appointment, people can have certain emotions. Feeling like you're going in more prepared, that you thought of all the questions, can be good.
Let's say I have a low-grade fever. Is it a good idea to ask ChatGPT Health what to do?
If you are at the point of making a decision, that's when I would engage a physician. I see real value in using the chatbot as a tool for understanding next steps but not for making a decision.
So how reliable are these new health chatbots at diagnosing conditions?
They have a tremendous amount of information that can be informative for both patients and clinicians. What we don't know yet is when they hallucinate, or when they veer from guidelines or recommendations. It won't be clear when the bot is making something up. There's a couple things that I tell patients: Check for consistency, go to trusted sources to validate information, and trust their instincts. If something sounds too good to be true, have a certain amount of hesitancy making any decisions based on the bot's information.
What sources should patients be using to verify A.I.?
I rely on the big recognizable names, like information from the American Heart Association or other large medical associations that might have guidelines or recommendations. When it gets to the question "Should I trust the chatbot?," that's probably when it's valuable to work with your health care professional.
Is the data that patients put into health chatbots secure?
My recommendation for any patient would be to not share personal details, like your name, address, medical record number, or prescription IDs, because it's not the environment we use for protecting patient information -- in the same way that I wouldn't enter my Social Security number into a random website or Google interface.
Does this include health care chatbots provided through hospitals or health centers?
If a hospital is providing a chatbot and [is very clear and transparent] about how the information is being used, and health information is protected, then I would feel comfortable entering my information there. But for something that didn't have transparency around who owns the data, how it's used, etc., I would not share my personal details.
[11]
ChatGPT's AI health care push has a fatal flaw
OpenAI and Anthropic have both announced big plans to enter health care, with a consumer-focused tool called ChatGPT Health and a version of the chatbot Claude that can help clinicians figure out a diagnosis and write medical notes. Notably absent from this flurry of announcements is Google. Its Gemini chatbot is one of the most popular and capable, so why not jump into the lucrative health market too? Perhaps because Google knows from experience that such an effort can backfire spectacularly. Health advice is where generative artificial intelligence has some of its most exciting potential. But the newer AI companies, perhaps blinded by bravado and hype, face a fate similar to Google's if they're not more transparent about their technology's notorious hallucinations.
OpenAI launched ChatGPT Health, allowing users to connect medical records and fitness apps for AI-driven health advice. But independent testing reveals troubling inaccuracies—one journalist received an F grade for heart health that doctors called baseless. With 230 million people already seeking health guidance from ChatGPT weekly, experts question whether the benefits outweigh the risks of misinformation and data privacy concerns.
OpenAI has unveiled ChatGPT Health, a dedicated tab within its popular chatbot designed specifically for health and wellness inquiries. According to OpenAI, approximately 230 million people ask ChatGPT health-related questions each week, making health one of the platform's most common use cases [1]. The new feature allows users to connect medical records and integrate data from wellness apps including Apple Health, MyFitnessPal, Weight Watchers, Peloton, and Function to receive what the company describes as more personalized health information [3]. Almost simultaneously, Anthropic introduced Claude for Healthcare, a HIPAA-ready product targeting both consumers and healthcare providers, signaling that AI for health advice has become a key battleground among tech giants [2].
The timing of ChatGPT Health's launch proved inauspicious. Just two days before the announcement, news broke about Sam Nelson, a teenager who died of an overdose after extensive conversations with ChatGPT about combining various drugs [1]. The incident immediately raised questions about the wisdom of relying on AI tools that could potentially cause extreme harm, even as OpenAI emphasizes that ChatGPT Health is intended as additional support rather than a replacement for medical care.

ChatGPT Health operates as a separate experience within the ChatGPT interface, designed to help users navigate scattered health information across provider portals, wearable apps, and personal notes [3]. Users can upload lab results, visit summaries, and clinical history, and connect various health-tracking apps to provide the AI with a more complete picture. OpenAI suggests the tool can help people prepare for doctor appointments, plan questions, receive customized diet plans or workout routines, and understand patterns related to their medical profile [3].
The feature includes medical records integration capabilities, though these are currently available only in the United States [4]. OpenAI claims conversations and files are encrypted by default at rest and in transit, with ChatGPT Health building on this foundation with additional layered protections, including purpose-built encryption and isolation [3]. The company states that Health conversations exist in their own memory space and won't be used for foundation model training [3].

When put to the test, ChatGPT Health's ability to personalize health information showed significant flaws. A Washington Post columnist granted the chatbot access to a decade of Apple Watch data -- 29 million steps and 6 million heartbeat measurements -- and asked it to grade his cardiac health. ChatGPT Health assigned him an F grade [5]. After connecting his medical records with weight, blood pressure, and cholesterol data, the grade improved only to a D.
Cardiologist Eric Topol of the Scripps Research Institute, an expert on both longevity and AI in medicine, reviewed the analysis and called it "baseless" and "not ready for any medical advice" [5]. The journalist's actual doctor confirmed he was at such low risk for heart attack that insurance likely wouldn't cover additional cardio fitness testing. ChatGPT Health had based much of its negative assessment on Apple Watch's VO2 max estimate -- a measurement that independent researchers have found can run low by an average of 13 percent -- and heart-rate variability metrics that Topol described as having "lots of fuzziness" [5]. Similarly, Anthropic's Claude for Healthcare graded the same individual's cardiac health a C, relying on questionable analysis [5]. Independent research consistently shows generative AI tools sometimes give unsafe health advice, even when they have access to medical records [4].
Using ChatGPT Health requires handing over intimate health information to an AI company, which carries substantial privacy risks. While OpenAI encourages users to share sensitive data like medical records, lab results, and wellness app information in exchange for deeper insights [2], the protections are far from watertight. ChatGPT Health is not a healthcare provider, meaning it isn't covered by HIPAA, the federal health privacy law [5].

Sara Gerke, a law professor at the University of Illinois Urbana-Champaign, explains that data protection for AI tools like ChatGPT Health "largely depends on what companies promise in their privacy policies and terms of use" since most states haven't enacted comprehensive privacy laws [2]. Hannah van Kolfschooten, a researcher in digital health law at the University of Basel, notes that while ChatGPT states in its current terms that it will keep data confidential and not use it to train models, "you are not protected by law, and it is allowed to change terms of use over time" [2].

Carmel Shachar, an assistant clinical professor of law at Harvard Law School, emphasizes the limited protection: "There's very limited protection. Some of it is their word, but they could always go back and change their privacy practices" [2]. The confusion is compounded by OpenAI launching ChatGPT for Healthcare -- an enterprise product with stronger protections for hospitals and clinicians -- just one day after ChatGPT Health, leading many to mistakenly assume the consumer product has the same level of security [2].
Some medical professionals see health chatbots as potentially improving upon the "Dr. Google" era of medical information seeking. Marc Succi, an associate professor at Harvard Medical School and practicing radiologist, notes that treating patients who searched symptoms on Google required "a lot of attacking patient anxiety [and] reducing misinformation," but now he sees patients "asking questions at the level of something an early med student might ask" [1]. The key question is whether Dr. ChatGPT represents an improvement over Dr. Google in terms of reducing medical misinformation and unnecessary health anxiety.

However, evaluating effectiveness remains challenging. Danielle Bitterman, clinical lead for data science and AI at Mass General Brigham, states: "It's exceedingly difficult to evaluate an open-ended chatbot" [1]. While large language models score well on medical licensing examinations, those multiple-choice tests don't reflect how people actually use chatbots for health information. When Sirisha Rambhatla, an assistant professor at the University of Waterloo, evaluated GPT-4o on licensing exam questions without multiple-choice options, medical experts scored only about half the responses as entirely correct [1].

A different study testing GPT-4o on realistic prompts found it answered medical questions correctly about 85% of the time [1]. Amulya Yadav, who led the study at Pennsylvania State University, noted that human doctors misdiagnose patients 10% to 15% of the time, though he personally remains skeptical of patient-facing medical AI tools.

Research from 2024 estimated almost one in ten Australians had asked ChatGPT a health query in the previous six months, with usage more common among people born in non-English speaking countries, those who spoke another language at home, and people with limited health literacy [4]. Among those who hadn't recently used ChatGPT for health, 39% were considering using it soon [4].

OpenAI worked with more than 260 clinicians in 60 countries including Australia to provide feedback on ChatGPT Health outputs [4]. However, the tool has not been independently tested, and it remains unclear whether ChatGPT Health would be considered a medical device requiring regulation in Australia [4]. The tool's responses may not reflect Australian clinical guidelines or meet the needs of priority populations, including First Nations people, those from culturally and linguistically diverse backgrounds, people with disability and chronic conditions, and older adults [4].

Health questions requiring clinical expertise to answer carry more risk of serious consequences, including finding out what symptoms mean, asking for advice about treatment, and interpreting test results [4]. Even with access to consumer health data, AI tools demonstrate well-documented tendencies to agree with users and to hallucinate rather than admit ignorance [1]. When doctors are unavailable or unable to help, people will turn to alternatives, making the accuracy and safety of these tools a pressing concern for millions seeking diagnosis support and medical guidance online.