Curated by THEOUTPOST
On Wed, 12 Mar, 12:07 AM UTC
2 Sources
[1]
Machine learning models fall short in predicting in-hospital mortality
Virginia Tech | Mar 11, 2025

It would be greatly beneficial to physicians trying to save lives in intensive care units if they could be alerted when a patient's condition rapidly deteriorates or shows vitals in highly abnormal ranges. While current machine learning models attempt to achieve that goal, a Virginia Tech study recently published in Communications Medicine shows that they are falling short: models for in-hospital mortality prediction, which estimate the likelihood of a patient dying in the hospital, failed to recognize 66 percent of injuries.

"Predictions are only valuable if they can accurately recognize critical patient conditions. They need to be able to identify patients with worsening health conditions and alert doctors promptly," said Danfeng "Daphne" Yao, professor in the Department of Computer Science and affiliate faculty member at the Sanghani Center for Artificial Intelligence and Data Analytics. "Our study found serious deficiencies in the responsiveness of current machine learning models. Most of the models we evaluated cannot recognize critical health events, and that poses a major problem."

To conduct their research, Yao and computer science Ph.D. student Tanmoy Sarkar Pias collaborated with:

- Sharmin Afrose, Oak Ridge National Laboratory, Tennessee
- Moon Das Tuli, Greenlife Medical College Hospital, Dhaka, Bangladesh
- Ipsita Hamid Trisha, Banner University Medical Center, Tucson, and University of Arizona College of Medicine
- Xinwei Deng, Department of Statistics at Virginia Tech
- Charles B. Nemeroff, Department of Psychiatry and Behavioral Sciences, University of Texas at Austin Dell Medical School

Their paper, "Low Responsiveness of Machine Learning Models to Critical or Deteriorating Health Conditions," shows that patient data alone is not enough to teach models how to determine future health risks. Calibrating health care models with "test patients" helps reveal the models' true ability and limitations.
The team developed multiple medical testing approaches, including a gradient ascent method and a neural activation map. Color changes in the neural activation map indicate how well machine learning models react to worsening patient conditions. The gradient ascent method can automatically generate special test cases, making it easier to evaluate the quality of a model.

"We systematically assessed machine learning models' ability to respond to serious medical conditions using new test cases, some of which are time series, meaning they use a sequence of observations collected at regular intervals to forecast future values," Pias said. "Guided by medical doctors, our evaluation involved multiple machine learning models, optimization techniques, and four data sets for two clinical prediction tasks."

In addition to models failing to recognize 66 percent of injuries for in-hospital mortality prediction, the models in some instances failed to generate adequate mortality risk scores for all test cases. The study identified similar deficiencies in the responsiveness of five-year breast and lung cancer prognosis models.

These findings inform future health care research using machine learning and artificial intelligence (AI), Yao said, because they show that statistical machine learning models trained solely on patient data are grossly insufficient and have many dangerous blind spots. To diversify training data, one may leverage strategically developed synthetic samples, an approach Yao's team explored in 2022 to enhance prediction fairness for minority patients. "A more fundamental design is to incorporate medical knowledge deeply into clinical machine learning models," she said. "This is highly interdisciplinary work, requiring a large team with both computing and medical expertise."
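The gradient ascent method described above can be illustrated with a minimal sketch: starting from a patient with normal vitals, repeatedly nudge the feature vector in the direction that most increases the model's predicted mortality risk, producing a sequence of synthetic "test patients" with increasingly critical values. A responsive model should assign them steadily higher risk scores. The toy logistic model, feature names, and step sizes below are illustrative assumptions, not the study's actual models or data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def risk(x, w, b):
    """Predicted in-hospital mortality risk for feature vector x (toy logistic model)."""
    return sigmoid(w @ x + b)

def gradient_ascent_test_cases(x0, w, b, steps=5, lr=0.5):
    """Generate synthetic test patients by ascending the risk gradient."""
    x = x0.copy()
    cases = [x.copy()]
    for _ in range(steps):
        p = risk(x, w, b)
        # Gradient of the logistic risk with respect to x: p * (1 - p) * w
        x = x + lr * p * (1.0 - p) * w
        cases.append(x.copy())
    return cases

# Hypothetical model: risk rises with heart rate and lactate, falls with SpO2.
w = np.array([0.8, 1.2, -0.9])   # weights for [heart_rate, lactate, spo2], standardized
b = -2.0
x_normal = np.zeros(3)           # a patient with average vitals

cases = gradient_ascent_test_cases(x_normal, w, b)
scores = [risk(x, w, b) for x in cases]
# For a responsive model, risk scores increase monotonically along the sequence.
assert all(s2 > s1 for s1, s2 in zip(scores, scores[1:]))
```

In the study's setting, the same ascent would be run against trained clinical models; a model whose scores fail to climb as the synthetic patient deteriorates is exhibiting exactly the low responsiveness the paper reports.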
In the meantime, Yao's group is actively testing other medical models, including large language models, for their safety and efficacy in time-sensitive clinical tasks, such as sepsis detection. "AI safety testing is a race against time, as companies are pouring products into the medical space," she said. "Transparent and objective testing is a must. AI testing helps protect people's lives and that's what my group is committed to."

Source: Virginia Tech

Journal reference: Pias, T. S., et al. (2025). Low responsiveness of machine learning models to critical or deteriorating health conditions. Communications Medicine. doi.org/10.1038/s43856-025-00775-0.
[2]
Machine learning models fail to detect key health deteriorations, research shows
A Virginia Tech study reveals significant shortcomings in current machine learning models for predicting in-hospital mortality, with models failing to recognize 66% of critical health events.
A recent study conducted by Virginia Tech researchers has uncovered significant limitations in current machine learning models used for predicting in-hospital mortality. The research, published in Communications Medicine, reveals that these models fail to recognize 66% of critical health events, raising concerns about their effectiveness in real-world medical settings [1][2].
The study, led by Professor Danfeng "Daphne" Yao from the Department of Computer Science at Virginia Tech, evaluated multiple machine learning models using various data sets and clinical prediction tasks. The researchers found that:

- Models failed to recognize 66% of injuries in the in-hospital mortality prediction task.
- In some instances, models failed to generate adequate mortality risk scores for all test cases.
- Five-year breast and lung cancer prognosis models showed similar deficiencies in responsiveness.
These findings highlight the potential dangers of relying solely on statistical machine learning models trained on patient data for critical healthcare decisions.
To assess the models' responsiveness, the research team developed innovative testing methods:

- A gradient ascent method that automatically generates special test cases, making it easier to evaluate a model's quality.
- A neural activation map whose color changes indicate how well a model reacts to worsening patient conditions.
These approaches provide a more comprehensive evaluation of model performance and reveal limitations that may not be apparent through traditional testing methods.
The study's results have significant implications for the future of AI and machine learning in healthcare:

- Statistical machine learning models trained solely on patient data are grossly insufficient and have many dangerous blind spots.
- Training data can be diversified with strategically developed synthetic samples, an approach Yao's team explored in 2022 to improve prediction fairness for minority patients.
- More fundamentally, medical knowledge should be incorporated deeply into clinical machine learning models, which requires interdisciplinary teams with both computing and medical expertise.
Professor Yao's team is actively working on addressing these challenges by testing other medical models, including large language models, for their safety and efficacy in time-sensitive clinical tasks such as sepsis detection.
As companies rapidly introduce AI products into the medical field, the researchers stress the critical need for transparent and objective testing:
"AI safety testing is a race against time, as companies are pouring products into the medical space," said Professor Yao. "Transparent and objective testing is a must. AI testing helps protect people's lives and that's what my group is committed to." [1][2]
This study serves as a crucial reminder of the importance of rigorous testing and evaluation of AI systems in healthcare, where the stakes are often life and death. As machine learning continues to advance, ensuring its reliability and safety in medical applications remains a top priority for researchers and healthcare professionals alike.
References

[1] Virginia Tech: Machine learning models fall short in predicting in-hospital mortality
[2] Medical Xpress - Medical and Health News: Machine learning models fail to detect key health deteriorations, research shows
© 2025 TheOutpost.AI All rights reserved