3 Sources
[1]
Patients trust AI's medical advice over doctors - even when it's wrong, study finds
There's a crisis due to a lack of doctors in the US. In the October issue of the prestigious New England Journal of Medicine, Harvard Medical School professor Isaac Kohane described how many large hospitals in Massachusetts, the state with the most doctors per capita, are refusing to admit new patients. The situation is only going to get worse, statistics suggest, Kohane wrote. As a result: "Whether out of desperation, frustration, or curiosity, large numbers of patients are already using AI to obtain medical advice, including second opinions -- sometimes with dramatic therapeutic consequences."

The medical community is both interested in and somewhat concerned about the growing tendency for people to seek medical advice from ChatGPT and other generative AI systems. And they ought to be concerned: it appears people are likely to trust a bot for medical advice more than they trust doctors, even when the bot's advice is of "low quality."

In a study published in June in The New England Journal of Medicine, titled "People Overtrust AI-Generated Medical Advice despite Low Accuracy," Shruthi Shekar and collaborators at MIT's Media Lab, Stanford University, Cornell University, Beth Israel Deaconess Medical Center in Boston, and IBM tested people's responses to medical advice from OpenAI's older GPT-3 model.

Shekar and team extracted 150 medical questions from an internet health site, HealthTap, and generated answers to them using GPT-3. A group of doctors was recruited to rate the AI answers for accuracy, assigning each a "yes," "no," or "maybe" for correctness. Shekar and team then curated three data sets of 30 question/answer pairs each: 30 with actual physicians' responses, 30 with "high-accuracy" AI responses, meaning those mostly rated correct by doctors, and 30 with "low-accuracy" AI responses, those mostly assigned "no" or "maybe" by doctors.

They conducted three experiments. In the first, a group of 100 subjects recruited online from the website Prolific was presented with 10 question/answer pairs randomly selected from the 90, without knowing whether they came from doctors or AI. The researchers asked each person to rate, on a scale of 1 to 5, how well they understood each question/response pair and how certain they were that the source of a given pair was a person or AI.

In a second experiment, a different group of 100 had to rate whether they thought the answers were "valid" and answer multiple-choice questions about whether they would be inclined to follow the medical advice given. This group, too, wasn't told anything about doctors versus AI.

In the third and final experiment, another group of 100 was given a random sample of 10 questions and asked the same questions. This time, however, they were informed at the beginning of the experiment that what they were about to examine came from AI, a doctor, or "a doctor assisted by AI." The labels were chosen at random, meaning that some answers written by AI might have been presented to subjects as written by doctors, or by doctors using AI. The authors then analyzed the subjects' performance in each experiment.
In experiment one, participants performed poorly when guessing whether a question/answer pair was human- or AI-sourced, little better than chance, in fact: when asked to determine the source of the medical response provided to them (doctor-written or AI-generated), subjects achieved an average source-determination accuracy of 50% for doctors' responses, 53% for high-accuracy AI responses, and 50% for low-accuracy AI responses.

People were also very confident even when they were wrong. Although subjects did poorly, Shekar and team reported a high degree of confidence that their determination of AI or human was accurate. "The level of confidence when participants guessed correctly and incorrectly was not significantly different," they noted.

In the second experiment, the subjects judged the AI-generated responses "to be significantly more valid than the doctors' responses," and even the "low-accuracy AI-generated responses performed very comparably with the doctors' responses." Remember, the low-accuracy AI responses were ones that doctors deemed wrong, or at least possibly inaccurate.

The same thing happened with trustworthiness: subjects said the AI responses were "significantly more trustworthy" than doctors' responses, and they also showed "a relatively equal tendency to follow the advice provided across all three response types," meaning high-quality AI, doctors, and low-quality AI.

In the third test, with random labels suggesting a response came from AI, a doctor, or a doctor assisted by AI, the label indicating a doctor as the source heavily influenced the subjects. "In the presence of the label 'This response to each medical question was given by a %(doctor),' participants tended to rate high-accuracy AI-generated responses as significantly more trustworthy" than when the responses were labeled as coming from AI.

Even doctors can be fooled, it turns out. In a follow-up test, Shekar and team asked doctors to evaluate the question/answer pairs, both with and without being told which were AI-generated. With labels indicating which was which, the doctors "evaluated the AI-generated responses as significantly lower in accuracy." When they didn't know the source, "there was no significant difference in their evaluation in terms of accuracy," which, the authors write, shows that doctors have their own biases.

In sum, people, even doctors, can't tell AI from a human when it comes to medical advice, and, on average, lay people are inclined to trust AI responses more than doctors' responses, even when the AI responses are of low quality, meaning even when the advice is wrong, and even more so if they are led to believe the response actually comes from a doctor. Shekar and team see a big concern in all this:

Participants' inability to differentiate between the quality of AI-generated responses and doctors' responses, regardless of accuracy, combined with their high evaluation of low-accuracy AI responses, which were deemed comparable with, if not superior to, doctors' responses, presents a concerning threat [...] a dangerous scenario where inaccurate AI medical advice might be deemed as trustworthy as a doctor's response.
When unaware of the response's source, participants are willing to trust, be satisfied, and even act upon advice provided in AI-generated responses, similarly to how they would respond to advice given by a doctor, even when the AI-generated response includes inaccurate information.

Shekar and team conclude that "expert oversight is crucial to maximize AI's unique capabilities while minimizing risks," including transparency about where advice is coming from. The results also mean that "integrating AI into medical information delivery requires a more nuanced approach than previously considered."

However, the conclusions are complicated by the fact that, ironically, people in the third experiment were less favorable toward a response if they thought it came from a doctor "assisted by AI," a fact that complicates "the ideal solution of combining AI's comprehensive responses with physician trust," they write.

To be sure, there is evidence that bots can be helpful in tasks such as diagnosis when used by doctors. A study published in the scholarly journal Nature Medicine in December, conducted by researchers at the Stanford Center for Biomedical Informatics Research at Stanford University and collaborating institutions, tested how physicians fared at diagnosing conditions in a simulated setting, meaning not with real patients, using either the help of GPT-4 or traditional physicians' resources. The study was very positive for AI: "Physicians using the LLM scored significantly higher compared to those using conventional resources," wrote lead author Ethan Goh and team.

Putting the research together, if people tend to trust AI, and if AI has been shown to help doctors in some cases, the next stage may be for the entire field of medicine to grapple with how AI can help or hurt in practice. As Harvard professor Kohane argues in his opinion piece, what is ultimately at stake is the quality of care and whether AI can or cannot help: "In the case of AI, shouldn't we be comparing health outcomes achieved with patients' use of these programs with outcomes in our current primary-care-doctor-depleted system?"
[2]
ChatGPT in Your Clinic: Who's the Expert Now
Patients arriving at appointments with researched information is nothing new, but artificial intelligence (AI) tools such as ChatGPT are changing the dynamics. Their confident presentation can leave physicians feeling that their expertise is being challenged. Kumara Raja Sundar, MD, a family medicine physician at Kaiser Permanente Burien Medical Center in Burien, Washington, highlighted this trend in a recent article published in JAMA.

A patient visited Sundar's clinic reporting dizziness and described her symptoms with unusual precision: "It's not vertigo, but more like a presyncope feeling." She then suggested that a tilt table test might be useful for diagnosis. Occasionally, patient questions reveal a subtle familiarity with medical jargon, which may indicate that they either have relevant training or have studied the subject extensively. Curious, Sundar asked if she worked in the healthcare sector. She replied that she had consulted ChatGPT, which recommended the tilt table test.

For years, patients have brought newspaper clippings, internet research, and advice from friends and relatives to consultations. Suggestions shared in WhatsApp groups have become a regular part of clinical discussions. Sundar noted that this particular encounter was different. The patient's tone and level of detail conveyed competence, and the confidence with which she presented the information subtly challenged his clinical judgment and treatment plans.

It is not surprising that large language models (LLMs) such as ChatGPT are appealing. Recent studies have confirmed their remarkable strengths in logical reasoning and interpersonal communication. However, a direct comparison between LLMs and physicians is unfair. Clinicians often face immense pressure, including constrained consultation times, overflowing inboxes, and a healthcare system that demands productivity and efficiency. Even skilled professionals struggle to perform optimally under adverse conditions. In contrast, generative AI is functionally limitless. This imbalance creates an unrealistic benchmark; however, this is today's reality.

Patients want clear answers and accurate information, but more importantly, they want to feel heard, understood, and reassured. "Unfortunately, under the weight of competing demands, that is what often slips for me, not just because of systemic constraints but also because I am merely human," Sundar wrote.

Despite the capabilities of generative AI, patients still visit doctors. Though these tools deliver confidently worded suggestions, they inevitably conclude: "Consult a healthcare professional." The ultimate responsibility for liability, diagnostics, prescriptions, and sick notes remains with physicians. In practice, this means dealing with requests such as a tilt table test for intermittent vertigo, a procedure that is not uncommon but often inappropriate.

"I find myself explaining concepts such as overdiagnosis, false positives, or other risks of unnecessary testing. At best, the patient understands the ideas, which may not resonate when one is experiencing symptoms. At worst, I sound dismissive. There is no function that tells ChatGPT that clinicians lack routine access to tilt-table testing or that echocardiogram appointments are delayed because of staff shortages. I have to carry those constraints into the examination room while still trying to preserve trust," Sundar emphasized in his article.
When speaking with medical students, Sundar notices a different kind of paternalism creeping in, one he has caught in his own inner monologue, even if he does not say it aloud. The old line, "They probably WebMD'd it and think they have cancer," has morphed into the newer, just-as-dismissive line, "They probably ChatGPT'd it and are going to tell us what to order." It often reflects defensiveness from clinicians rather than genuine engagement and carries an implicit message: we still know best.

"It is an attitude that risks eroding sacred and fragile trust between clinicians and patients. It reinforces the feeling that we are not 'in it' with our patients and are truly gatekeeping rather than partnering. Ironically, that is often why I hear patients turn to LLMs in the first place," Sundar concluded.

One patient said plainly, "This is how I can advocate for myself better." The word "advocate" struck Sundar, capturing the effort required to persuade someone with more authority. Although clinicians still control access to tests, referrals, and treatment plans, the term conveys a sense of preparing for a fight. When patients feel unheard, gathering knowledge becomes a strategy for being taken seriously. In such situations, the usual approach of explaining false-positive test results, overdiagnosis, and test characteristics is often ineffective. From the patient's perspective, it sounds more like, "I still know more than you, no matter what tool you used, and I'm going to overwhelm you with things you don't understand."

The role of physicians is constantly evolving, and the transition from physician-as-authority to physician-as-advisor is intensifying. Patients increasingly present with expectations shaped by non-evidence-based sources, often misaligned with clinical reality. As Sundar observed, "They arm themselves with knowledge to be heard." This creates a professional duty to respond with understanding rather than resistance. His approach centers on emotional acknowledgment before clinical discussion: "I say, 'We'll discuss diagnostic options together. But first, I want to express my condolences. I can hardly imagine how you feel. I want to tackle this with you and develop a plan.'" He emphasized, "This acknowledgment was the real door opener."

What began as a US trend observed by Sundar has now spread worldwide, with patients increasingly arriving at consultations armed with medical knowledge from tools like ChatGPT rather than just "Dr Google." Clinicians across health systems report that digitally informed patients now make up the majority. In a forum discussion, physicians from various disciplines shared their experiences, noting that patients who arrive already informed are now the norm. Inquiries often focus on specific laboratory values, particularly vitamin D or hormone tests. In gynecologic consultations, internet research on menstrual disorders has become a routine part of patient interactions, with an overwhelming range of answers available online. "Chanice," a Coliquio user who is a gynecologist, shared, "The answers range from 'It's normal; it can happen' to 'You won't live long.'" She added, "It's also common to Google medication side effects, and usually, women end up experiencing pretty much every side effect, even though they didn't have them before."

How should doctors respond to this trend? Opinions are clear: openness, education, and transparency are essential, ideally delivered in a structured manner. "Get the patients on board; educate them. In writing!
Each and every one of them. Once it's put into words, it's no longer a job. Invest time in educating patients to correct misleading promises made by health insurance companies and politicians," commented another user, Jörg Christian Nast, a specialist in gynecology and obstetrics.

The presence of digitally informed patients is increasingly seen not only as a challenge but also as an opportunity. Conversations with these patients can be constructive, but they can also generate unrealistic demands or heated debates. A professional, calm, and explanatory approach therefore remains crucial, and at times a dose of humor can help. Another user, a specialist in internal medicine, added, "The term 'online consultation' takes on a whole new meaning."
[3]
Patients are bringing AI diagnoses and prescriptions to clinics: What does it mean for doctors?
Artificial intelligence is changing healthcare. Patients now use AI for diagnoses, sometimes challenging doctors. This creates pressure and trust issues. Doctors must address patient concerns and avoid defensiveness. A recent case showed AI giving dangerous advice, leading to hospitalization. Experts call for transparency and patient education. AI offers information but lacks medical judgment.

For years, patients have walked into clinics carrying clippings from newspapers, advice from friends, or the latest findings from WhatsApp groups. Today, they arrive with something far more sophisticated: a neatly packaged diagnosis or even a prescription generated by artificial intelligence. According to a recent Medscape report, this trend is rapidly reshaping the dynamics of clinical practice.

Dr. Kumara Raja Sundar, a family physician at Kaiser Permanente Burien Medical Center in Washington, described one such case in JAMA. A patient presented with dizziness and, with striking medical precision, said, "It's not vertigo, but more like a presyncope feeling." She confidently suggested a tilt table test for diagnosis. Intrigued, Sundar asked if she worked in healthcare. Her reply: she had asked ChatGPT. What stood out was not just the information but the confidence with which it was delivered, subtly challenging the physician's role as the sole authority.

Large language models such as ChatGPT have demonstrated impressive reasoning and communication abilities, but comparing them to doctors is problematic. Physicians juggle limited consultation time, staff shortages, and systemic pressures; AI, by contrast, appears limitless. Sundar observed in his article that this imbalance creates unrealistic expectations: "Unfortunately, under the weight of competing demands, what often slips for me is not accuracy, but making patients feel heard."

The arrival of AI-informed patients brings practical challenges. Requests for advanced or unnecessary tests, such as tilt table examinations or hormone panels, often collide with real-world constraints like delayed appointments or limited access. Sundar wrote that explaining overdiagnosis and false positives can sometimes sound dismissive rather than collaborative, further straining trust. The shift, he warns, risks fostering a new kind of defensiveness among clinicians: the quiet thought that a patient has "ChatGPT'd it" before walking into the room. Such attitudes, he argued, risk eroding fragile doctor-patient trust.

For some patients, AI tools are more than information sources; they are instruments of advocacy. One patient told Sundar, "This is how I can advocate for myself better." The language of advocacy reflects the effort required to be taken seriously in clinical spaces. Doctors, he emphasized, must resist gatekeeping and instead acknowledge patients' concerns before moving to clinical reasoning. His preferred approach is to begin with empathy: "I want to express my condolences. I can hardly imagine how you feel. I want to tackle this with you and develop a plan."

What Sundar has seen in the United States is not unique. The Medscape report highlights that doctors worldwide now face AI-informed patients as the norm rather than the exception. In Germany, gynecologists report women consulting ChatGPT about menstrual disorders, often encountering contradictory or alarming answers. Specialists in internal medicine note that Googling side effects leads patients to experience nearly all of them, even when they had none before.
Clinicians responding in online forums have called for transparency, structured patient education, and even humor as tools for navigating this new reality. One remarked that "online consultation takes on a whole new meaning" when AI walks into the room with the patient.

The blurred line between helpful guidance and hazardous misinformation was recently illustrated by a striking case reported in the Annals of Internal Medicine in August 2025. A 60-year-old man who wanted to cut down on table salt turned to ChatGPT for alternatives. The chatbot recommended sodium bromide, a compound more familiar in swimming pool maintenance than in home kitchens. Trusting the advice, he used the substance for several months until he landed in the hospital with paranoia, hallucinations, and severe electrolyte imbalances. Doctors diagnosed bromism, a condition rarely seen since the early 20th century, when bromide salts were widely prescribed.

Physicians treating the man noted bromide levels more than 200 times the safe reference range, explaining his psychiatric and neurological decline. After intensive fluid therapy and correction of his electrolytes, he recovered, but only after a three-week hospital stay. The case is a reminder that medical judgment requires not just knowledge but also context and responsibility, qualities AI does not yet possess.
A study reveals patients' increasing reliance on AI for medical advice, with many trusting it over doctors. This trend is reshaping doctor-patient dynamics and raising concerns about AI's limitations in healthcare.
A recent study published in The New England Journal of Medicine has revealed a surprising trend: patients are increasingly trusting AI-generated medical advice over that of human doctors, even when the AI's advice is of low quality [1]. This shift is reshaping the landscape of healthcare and raising concerns among medical professionals.
Researchers from MIT's Media Lab, Stanford University, Cornell University, and other institutions conducted a series of experiments to test people's responses to medical advice from OpenAI's GPT-3 model compared with that from human doctors [1]. The results were striking: participants could not reliably tell AI-generated answers from doctors' answers, and they rated even low-accuracy AI responses as comparably valid and trustworthy.

The rise of AI in healthcare is also changing the dynamics of doctor-patient interactions. Dr. Kumara Raja Sundar, a family medicine physician, shared his experience in JAMA, describing how patients now arrive at appointments with AI-generated diagnoses and treatment suggestions [2]. This trend is creating new challenges for healthcare providers, from requests for unnecessary tests to strained trust in the consultation room.
While AI tools like ChatGPT offer patients access to vast amounts of medical information, they also present significant risks, as illustrated by a recent case in which a chatbot's dietary advice led to bromide poisoning and a three-week hospital stay [3]. Medical professionals are calling for new approaches to address this shift, including transparency, structured patient education, and empathy before clinical reasoning.
Dr. Sundar suggests acknowledging patients' concerns before moving to clinical reasoning: "I want to express my condolences. I can hardly imagine how you feel. I want to tackle this with you and develop a plan" [2].

As AI continues to play a larger role in healthcare, finding the right balance between leveraging its benefits and mitigating its risks will be crucial for both patients and healthcare providers. The challenge lies in harnessing AI's potential while maintaining the irreplaceable human elements of medical care: judgment, empathy, and responsibility.
Summarized by Navi