Curated by THEOUTPOST
On Fri, 21 Feb, 8:02 AM UTC
2 Sources
[1]
AI chatbots are mostly correct, but incomplete, on endometriosis | Newswise
DALLAS - Feb. 20, 2025 - Three of the leading chatbots can provide basic information about endometriosis, a painful gynecologic condition that affects up to 1 in 10 women, but their responses are not as comprehensive as the guidance from health care providers, according to a study by UT Southwestern Medical Center researchers. Their findings, published in the American Journal of Obstetrics and Gynecology, sound a cautionary note for patients who turn to generative artificial intelligence (AI) for medical information.

"We did this study because we wanted to know what patients are learning from these chatbots. Is it accurate? Is it reliable? Is it aligning with updated clinical recommendations and what we know from current research?" asked study leader Kimberly Kho, M.D., Professor of Obstetrics and Gynecology at UT Southwestern. "Our results affirm that responses from a chatbot cannot replace a proper evaluation and management by skilled experts for this and other diseases."

AI chatbots have attracted significant attention since OpenAI's release of ChatGPT in November 2022. Several other chatbots are built on similar large language models, including Claude (developed by Anthropic) and Gemini (developed by Google and formerly known as Bard). Each of these chatbots generates responses from a wealth of publicly available data. Over the last few years, they have permeated many industries, including medicine. Patients are increasingly turning to chatbots for medical information, either directly or through their incorporation into search engines such as Google. However, the quality of answers delivered by these sources has been unclear, Dr. Kho explained. Studies designed to evaluate their output have largely focused on information about cancer, she added, while benign gynecologic conditions haven't been well explored. These include endometriosis, a common disease in which tissue similar to the uterine lining grows outside the uterus, often causing pain, inflammation, and infertility.

To determine how well popular chatbots answer questions about endometriosis, Dr. Kho and her colleagues collected answers from ChatGPT-4, Claude, and Gemini after posing 10 questions patients often ask about the disease, such as "What is endometriosis?" "How common is endometriosis?" and "How is endometriosis treated?" They then asked nine board-certified gynecologists to rate the accuracy and completeness of the answers against current evidence-based guidelines. The experts found that answers generated by all three chatbots were mostly accurate, with more correct answers about symptoms and disease processes than about treatment or risk of recurrence. However, Dr. Kho said, the physicians determined that some answers were incomplete. This inadequacy might be due to several factors, she explained, including a lack of patient-specific context in the questions, chatbot training data that does not reflect the most recent advances in clinical practice, and a lack of consensus among experts in the field.

Among the three chatbots studied, ChatGPT delivered the most comprehensive and correct responses. Based on these results, Dr. Kho said, chatbots could serve as a useful starting point for medical information, but patients should still see their physicians to address questions and concerns. Medical experts need to be consulted and involved in the quality control process for health care-specific chatbots currently in development, she added.

Dr. Kho holds the Helen J. and Robert S. Strauss and Diana K. and Richard C. Strauss Chair in Women's Health. Other UTSW researchers who contributed to this study include first author Natalie D. Cohen, M.D., Assistant Instructor of Obstetrics and Gynecology; Donald McIntire, Ph.D., Professor of Obstetrics and Gynecology; Katherine Smith, M.D., Assistant Professor of Obstetrics and Gynecology; and Milan Ho, B.S., medical student.

UT Southwestern, one of the nation's premier academic medical centers, integrates pioneering biomedical research with exceptional clinical care and education. The institution's faculty members have received six Nobel Prizes and include 25 members of the National Academy of Sciences, 23 members of the National Academy of Medicine, and 14 Howard Hughes Medical Institute Investigators. The full-time faculty of more than 3,200 is responsible for groundbreaking medical advances and is committed to translating science-driven research quickly into new clinical treatments. UT Southwestern physicians provide care in more than 80 specialties to more than 120,000 hospitalized patients and more than 360,000 emergency room cases, and oversee nearly 5 million outpatient visits a year.
[2]
AI chatbots are mostly correct, but incomplete, on endometriosis
Three of the leading chatbots can provide basic information about endometriosis, a painful gynecologic condition that affects up to 1 in 10 women, but their responses are not as comprehensive as the guidance from health care providers, according to a study by UT Southwestern Medical Center researchers. Their findings, published in AJOG Global Reports, sound a cautionary note for patients who turn to generative artificial intelligence (AI) for medical information.

"We did this study because we wanted to know what patients are learning from these chatbots. Is it accurate? Is it reliable? Is it aligning with updated clinical recommendations and what we know from current research?" asked study leader Kimberly Kho, M.D., Professor of Obstetrics and Gynecology at UT Southwestern. "Our results affirm that responses from a chatbot cannot replace a proper evaluation and management by skilled experts for this and other diseases."

AI chatbots have attracted significant attention since OpenAI's release of ChatGPT in November 2022. Several other chatbots are built on similar large language models, including Claude (developed by Anthropic) and Gemini (developed by Google and formerly known as Bard). Each of these chatbots generates responses from a wealth of publicly available data. Over the last few years, they have permeated many industries, including medicine. Patients are increasingly turning to chatbots for medical information, either directly or through their incorporation into search engines such as Google. However, the quality of answers delivered by these sources has been unclear, Dr. Kho explained. Studies designed to evaluate their output have largely focused on information about cancer, she added, while benign gynecologic conditions haven't been well explored. These include endometriosis, a common disease in which tissue similar to the uterine lining grows outside the uterus, often causing pain, inflammation, and infertility.

To determine how well popular chatbots answer questions about endometriosis, Dr. Kho and her colleagues collected answers from ChatGPT-4, Claude, and Gemini after posing 10 questions patients often ask about the disease, such as "What is endometriosis?" "How common is endometriosis?" and "How is endometriosis treated?" They then asked nine board-certified gynecologists to rate the accuracy and completeness of the answers against current evidence-based guidelines. The experts found that answers generated by all three chatbots were mostly accurate, with more correct answers about symptoms and disease processes than about treatment or risk of recurrence. However, Dr. Kho said, the physicians determined that some answers were incomplete. This inadequacy might be due to several factors, she explained, including a lack of patient-specific context in the questions, chatbot training data that does not reflect the most recent advances in clinical practice, and a lack of consensus among experts in the field.

Among the three chatbots studied, ChatGPT delivered the most comprehensive and correct responses. Based on these results, Dr. Kho said chatbots could serve as a useful starting point for medical information, but patients should still see their physicians to address questions and concerns. Medical experts need to be consulted and involved in the quality control process for health care-specific chatbots currently in development, she added.
A UT Southwestern Medical Center study reveals that leading AI chatbots offer generally correct but not comprehensive information about endometriosis, highlighting the need for expert medical consultation.
Researchers at UT Southwestern Medical Center have conducted a study assessing the accuracy and completeness of information provided by leading AI chatbots on endometriosis, a common gynecologic condition affecting up to 1 in 10 women. The study, published in the American Journal of Obstetrics and Gynecology, aimed to understand what patients are learning from these AI tools and how that compares with guidance from healthcare providers [1].
The research team, led by Dr. Kimberly Kho, Professor of Obstetrics and Gynecology, evaluated responses from three prominent AI chatbots: ChatGPT-4, Claude, and Gemini. They posed 10 common questions about endometriosis to these chatbots and had nine board-certified gynecologists rate the accuracy and completeness of the answers based on current evidence-based guidelines [2].
Key findings of the study include:
- Answers from all three chatbots were rated mostly accurate by the panel of nine gynecologists.
- The chatbots answered questions about symptoms and disease processes more accurately than questions about treatment or risk of recurrence.
- Some answers were judged incomplete when measured against current evidence-based guidelines.
- Among the three chatbots, ChatGPT delivered the most comprehensive and correct responses.
While the AI chatbots demonstrated a capacity to provide generally accurate information, the study highlighted several factors that may explain the incomplete answers:
- A lack of patient-specific context in the questions posed.
- Training data that may not reflect the most recent advances in clinical practice.
- A lack of consensus among experts in the field.
Dr. Kho emphasized that "responses from a chatbot cannot replace a proper evaluation and management by skilled experts for this and other diseases" [1].
The study's findings have important implications for both patients and the medical community:
- Chatbots could serve as a useful starting point for medical information, but patients should still see their physicians to address questions and concerns.
- Medical experts should be consulted and involved in the quality control process for health care-specific chatbots currently in development.
As AI continues to permeate various industries, including medicine, the quality of AI-generated medical information remains a critical area of study. This research underscores the importance of ongoing evaluation and improvement of AI tools in healthcare, particularly as patients increasingly turn to these sources for medical information [2].
The study serves as a cautionary note, highlighting the need for a balanced approach that leverages the potential of AI while recognizing its current limitations in providing comprehensive medical guidance.
Reference
[1] Newswise | AI chatbots are mostly correct, but incomplete, on endometriosis
[2] Medical Xpress - Medical and Health News | AI chatbots are mostly correct, but incomplete, on endometriosis
© 2025 TheOutpost.AI All rights reserved