2 Sources
[1]
Can Chatbots Spot Mental Health Drug Side Effects? - Neuroscience News
Summary: As mental healthcare gaps persist, people increasingly turn to AI chatbots for help with psychiatric medication side effects. A new study evaluated how well large language models detect and respond to these complex, high-risk situations. While AI often mirrors a psychiatrist's tone, researchers found it struggles with accurately identifying adverse drug reactions and offering actionable advice. The study highlights the need for safer, more effective chatbots tailored to mental health needs.

Asking artificial intelligence for advice can be tempting. Powered by large language models (LLMs), AI chatbots are available 24/7, are often free to use, and draw on troves of data to answer questions. Now, people with mental health conditions are asking AI for advice when experiencing potential side effects of psychiatric medicines -- a decidedly higher-risk situation than asking it to summarize a report.

One question puzzling the AI research community is how AI performs when asked about mental health emergencies. Globally, including in the U.S., there is a significant gap in mental health treatment, with many individuals having limited to no access to mental healthcare. It's no surprise that people have started turning to AI chatbots with urgent health-related questions.

Now, researchers at the Georgia Institute of Technology have developed a new framework to evaluate how well AI chatbots can detect potential adverse drug reactions in chat conversations, and how closely their advice aligns with human experts.

The study was led by Munmun De Choudhury, J.Z. Liang Associate Professor in the School of Interactive Computing, and Mohit Chandra, a third-year computer science Ph.D. student. De Choudhury is also a faculty member in the Georgia Tech Institute for People and Technology.

"People use AI chatbots for anything and everything," said Chandra, the study's first author. "When people have limited access to healthcare providers, they are increasingly likely to turn to AI agents to make sense of what's happening to them and what they can do to address their problem.

"We were curious how these tools would fare, given that mental health scenarios can be very subjective and nuanced."

De Choudhury, Chandra, and their colleagues introduced their new framework at the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics on April 29, 2025.

Putting AI to the Test

Going into their research, De Choudhury and Chandra wanted to answer two main questions: First, can AI chatbots accurately detect whether someone is having side effects or adverse reactions to medication? Second, if they can accurately detect these scenarios, can AI agents then recommend good strategies or action plans to mitigate or reduce harm?

The researchers collaborated with a team of psychiatrists and psychiatry students to establish clinically accurate answers from a human perspective and used those to analyze AI responses. To build their dataset, they went to the internet's public square, Reddit, where many have gone for years to ask questions about medication and side effects.

They evaluated nine LLMs, including general-purpose models (such as GPT-4o and Llama 3.1) and specialized models trained on medical data. Using the evaluation criteria provided by the psychiatrists, they computed how precisely the LLMs detected adverse reactions and how correctly they categorized the types of adverse reactions caused by psychiatric medications.

Additionally, they prompted the LLMs to generate answers to queries posted on Reddit and compared the alignment of the LLM answers with those provided by the clinicians over four criteria: (1) emotion and tone expressed, (2) answer readability, (3) proposed harm-reduction strategies, and (4) actionability of the proposed strategies.

The research team found that LLMs stumble when comprehending the nuances of an adverse drug reaction and distinguishing different types of side effects. They also discovered that while LLMs sounded like human psychiatrists in their tones and emotions -- such as being helpful and polite -- they had difficulty providing truly actionable advice aligned with the experts.

Better Bots, Better Outcomes

The team's findings could help AI developers build safer, more effective chatbots. Chandra's ultimate goals are to inform policymakers of the importance of accurate chatbots and to help researchers and developers improve LLMs by making their advice more actionable and personalized.

Chandra notes that improving AI for psychiatric and mental health concerns would be particularly life-changing for communities that lack access to mental healthcare.

"When you look at populations with little or no access to mental healthcare, these models are incredible tools for people to use in their daily lives," Chandra said. "They are always available, they can explain complex things in your native language, and they become a great option to go to for your queries.

"When the AI gives you incorrect information by mistake, it could have serious implications on real life," Chandra added. "Studies like this are important, because they help reveal the shortcomings of LLMs and identify where we can improve."

Citation: Lived Experience Not Found: LLMs Struggle to Align with Experts on Addressing Adverse Drug Reactions from Psychiatric Medication Use (Chandra et al., NAACL 2025).

Funding: National Science Foundation (NSF), American Foundation for Suicide Prevention (AFSP), and the Microsoft Accelerate Foundation Models Research grant program. The findings, interpretations, and conclusions of this paper are those of the authors and do not represent the official views of NSF, AFSP, or Microsoft.

Author: Catherine Barzler
Source: Georgia Institute of Technology
Contact: Catherine Barzler - Georgia Institute of Technology
Image: The image is credited to Neuroscience News
Original Research: The findings will be presented at the 2025 Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics.
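The excerpt above describes computing how precisely the LLMs detected adverse drug reactions and categorized their types against expert annotations. A minimal, purely illustrative sketch of that kind of scoring is shown below; the toy labels, the ADR categories, and the use of scikit-learn are assumptions for illustration, not the study's actual pipeline.

```python
# Purely illustrative sketch (not the study's pipeline): scoring hypothetical LLM
# outputs against expert labels for adverse drug reaction (ADR) detection and
# ADR-type categorization. All labels, categories, and data below are made up.
from sklearn.metrics import precision_score

# Expert annotations: does the post describe an adverse drug reaction (1) or not (0)?
expert_detected = [1, 0, 1, 1, 0, 1]
llm_detected    = [1, 0, 0, 1, 1, 1]

# Hypothetical ADR-type labels assigned by experts vs. the model.
expert_types = ["sedation", "none", "akathisia", "weight_gain", "none", "sedation"]
llm_types    = ["sedation", "none", "sedation",  "weight_gain", "sedation", "sedation"]

# Precision of binary ADR detection and macro-averaged precision over ADR types.
detection_precision = precision_score(expert_detected, llm_detected)
type_precision = precision_score(expert_types, llm_types, average="macro", zero_division=0)

print(f"ADR detection precision:    {detection_precision:.2f}")
print(f"ADR type precision (macro): {type_precision:.2f}")
```

In the study itself, the reference labels came from psychiatrists' annotations of Reddit posts rather than toy lists like these.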
[2]
AI Chatbots Aren't Experts on Psych Med Reactions -- Yet | Newswise
A new study by Georgia Tech researchers reveals that AI chatbots struggle to accurately identify and provide actionable advice for psychiatric medication side effects, highlighting the need for improved AI models in mental healthcare.
Researchers at the Georgia Institute of Technology have conducted a groundbreaking study to assess the capabilities of AI chatbots in detecting and responding to potential side effects of psychiatric medications. As mental healthcare gaps persist globally, including in the United States, people are increasingly turning to AI for guidance on urgent health-related questions [1][2].
Source: Neuroscience News
The research, led by Munmun De Choudhury and Mohit Chandra, aimed to answer two critical questions: (1) Can AI chatbots accurately detect whether someone is having side effects or adverse reactions to medication? (2) If they can, can AI agents then recommend good strategies or action plans to mitigate or reduce harm?
To evaluate this, the team developed a new framework and collaborated with psychiatrists and psychiatry students to establish clinically accurate baselines. They analyzed nine large language models (LLMs), including general-purpose models like GPT-4o and Llama 3.1, as well as specialized medical models [1][2].
The study revealed several important insights:
Detection Accuracy: LLMs struggled to comprehend the nuances of adverse drug reactions and to distinguish between different types of side effects [1][2].
Tone and Emotion: AI chatbots successfully mirrored the helpful and polite tone of human psychiatrists [1][2].
Actionable Advice: Despite sounding professional, the AI models had difficulty providing truly actionable advice that aligned with expert recommendations [1][2].
Evaluation Criteria: The researchers assessed the AI responses based on four criteria: emotion and tone, answer readability, proposed harm-reduction strategies, and actionability of the proposed strategies [1][2] (a rough readability-scoring sketch follows this list).
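Of these four criteria, readability is the only one that can be approximated without expert raters. As a rough, hypothetical illustration (not the study's method), the sketch below compares an invented LLM-style answer with an invented clinician-style answer using the standard Flesch reading-ease formula; the example answers and the crude syllable counter are assumptions.

```python
# Illustrative readability comparison only. The study's other criteria (tone,
# harm-reduction strategies, actionability) required expert judgment and are
# not reproduced here. Example answers below are invented, not from the study.
import re

def count_syllables(word: str) -> int:
    """Crude vowel-group syllable estimate; adequate for a rough readability score."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Flesch reading ease: higher scores indicate easier-to-read text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))

llm_answer = "Dizziness can be a side effect. Stay hydrated and contact your prescriber if it persists."
clinician_answer = "Dizziness is a known side effect. If it continues, call your prescriber before changing the dose."

print(f"LLM answer readability:       {flesch_reading_ease(llm_answer):.1f}")
print(f"Clinician answer readability: {flesch_reading_ease(clinician_answer):.1f}")
```

The remaining three criteria were judged against clinician-written answers in the study and do not reduce to a simple formula like this.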
The findings of this study have significant implications for the development of AI in mental healthcare:
Improving AI Models: The research highlights the need for safer and more effective chatbots tailored to mental health needs [1].
Addressing Healthcare Gaps: Enhanced AI tools could be particularly beneficial for communities with limited access to mental healthcare resources [2].
Policy Implications: The study aims to inform policymakers about the importance of accurate AI chatbots in healthcare [2].
Potential Risks: The researchers emphasize the serious implications of AI providing incorrect information in mental health contexts [1][2].
While AI chatbots show promise in mimicking the tone and approachability of mental health professionals, they currently fall short in providing accurate, actionable advice for psychiatric medication side effects. The study underscores the need for continued research and development: better chatbots could become a valuable resource for underserved communities, but human expertise remains essential in this critical field.