AI-Assisted Genomic Studies Face Persistent Problems, Warn UW-Madison Researchers

University of Wisconsin-Madison researchers caution about flawed conclusions in AI-assisted genome-wide association studies, highlighting risks of false positives and proposing new methods to improve accuracy.

AI Tools in Genomic Studies Lead to Flawed Conclusions

Researchers from the University of Wisconsin-Madison have raised concerns about the use of artificial intelligence (AI) tools in genetics and medicine, warning that they can lead to erroneous conclusions about the relationship between genes and physical characteristics, including disease risk factors [1][2][3]. The study, published in Nature Genetics, focuses on the problems arising from AI-assisted genome-wide association studies.

The Complexity of Genetic-Trait Relationships

Genome-wide association studies scan through hundreds of thousands of genetic variations across large populations to identify links between genes and physical traits, particularly focusing on connections to certain diseases. While some genetic changes directly correlate with increased risk for diseases like cystic fibrosis, the relationship between genetics and physical traits is often more intricate [1][2].

Data Gaps and AI Solutions

Large databases like the National Institutes of Health's All of Us project and the UK Biobank often lack comprehensive data on specific health conditions. Qiongshi Lu, an associate professor at UW-Madison, explains:

"Some characteristics are either very expensive or labor-intensive to measure, so you simply don't have enough samples to make meaningful statistical conclusions about their association with genetics" [1][2][3].

To address this issue, researchers have turned to AI tools that predict the missing measurements from the data that are available, then use those predictions in place of the real ones.

The Perils of AI-Assisted Studies

Lu and his colleagues demonstrated that relying on these AI models without proper safeguards can introduce significant biases. Their research revealed that a common machine learning algorithm used in genome-wide association studies mistakenly linked several genetic variations to an individual's risk of developing Type 2 diabetes [1][2][3].

"The problem is if you trust the machine learning-predicted diabetes risk as the actual risk, you would think all those genetic variations are correlated with actual diabetes even though they aren't," Lu cautions [1][2][3].

These false positives are not limited to diabetes risk but represent a pervasive bias in AI-assisted studies across various health conditions.
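
The mechanism Lu describes can be illustrated with a small simulation. The sketch below is not the study's analysis: every number, variable name, and the one-feature linear "predictor" are assumptions chosen for illustration. It simulates a variant that has no effect on the true trait but does influence a measurement the predictor relies on, then tests that variant against both the true trait and the predicted one.

```python
import math
import random


def slope_z(x, y):
    """Return the OLS slope of y on x and its z-statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    resid_ss = sum((b - my - beta * (a - mx)) ** 2 for a, b in zip(x, y))
    se = math.sqrt(resid_ss / (n - 2) / sxx)
    return beta, beta / se


def simulate(n=20000, seed=0):
    """Test a non-causal variant against the true and the predicted trait."""
    rng = random.Random(seed)

    def genotype():
        # Count of minor alleles (0, 1, or 2); allele frequency 0.3 is assumed.
        return sum(rng.random() < 0.3 for _ in range(2))

    g_causal = [genotype() for _ in range(n)]
    g_null = [genotype() for _ in range(n)]  # no effect on the true trait

    # True trait depends only on the causal variant.
    trait = [0.3 * g + rng.gauss(0, 1) for g in g_causal]

    # Feature used by the predictor: tracks the trait, but is also shifted
    # by the non-causal variant (hypothetical setup).
    feature = [t + 0.3 * g + rng.gauss(0, 1) for t, g in zip(trait, g_null)]

    # Stand-in for the ML model: a linear fit of the trait on the feature.
    b, _ = slope_z(feature, trait)
    predicted = [b * f for f in feature]

    _, z_true = slope_z(g_null, trait)
    _, z_pred = slope_z(g_null, predicted)
    return z_true, z_pred


if __name__ == "__main__":
    z_true, z_pred = simulate()
    print(f"non-causal variant vs. true trait:      z = {z_true:.2f}")
    print(f"non-causal variant vs. predicted trait: z = {z_pred:.2f}")
```

In this toy setup the variant looks strongly associated with the predicted trait while showing no real association with the trait itself, which is the kind of false positive described above.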

Proposed Solutions and New Statistical Methods

To combat these issues, Lu and his team have proposed a new statistical method to enhance the reliability of AI-assisted genome-wide association studies. This method aims to remove biases introduced by machine learning algorithms when making inferences based on incomplete information [1][2][3].

The researchers successfully applied this "statistically optimal" strategy to better identify genetic associations with individuals' bone mineral density [1][2][3].
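
The article does not spell out how the team's method works, so the following is not their procedure. It is a generic sketch of one well-known way to correct this kind of bias, assuming a small "labeled" group in which the trait was actually measured and a large "unlabeled" group with predictions only. The estimate from the predictions is adjusted by how far predicted-trait and true-trait estimates disagree in the labeled group. All names, sample sizes, and effect sizes here are hypothetical.

```python
import random


def ols_slope(x, y):
    """Return the OLS slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def simulate_corrected(n_labeled=5000, n_unlabeled=50000, seed=1):
    """Compare a naive and a bias-corrected estimate for a non-causal variant."""
    rng = random.Random(seed)

    def draw(n):
        # Genotype counts with an assumed allele frequency of 0.3.
        g = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]
        # True trait is independent of the variant.
        trait = [rng.gauss(0, 1) for _ in range(n)]
        # Hypothetical predictor output, contaminated by the variant.
        predicted = [
            0.5 * (t + 0.3 * gi + rng.gauss(0, 1)) for t, gi in zip(trait, g)
        ]
        return g, trait, predicted

    g_lab, trait_lab, pred_lab = draw(n_labeled)
    # In the unlabeled group the true trait is treated as unobserved.
    g_unl, _, pred_unl = draw(n_unlabeled)

    # Naive analysis: treat predictions as if they were the real trait.
    naive = ols_slope(g_unl, pred_unl)

    # Correction: subtract the prediction-induced bias, which is estimated
    # in the labeled group as (slope on predictions) - (slope on true trait).
    bias = ols_slope(g_lab, pred_lab) - ols_slope(g_lab, trait_lab)
    corrected = naive - bias
    return naive, corrected


if __name__ == "__main__":
    naive, corrected = simulate_corrected()
    print(f"naive effect estimate:     {naive:.3f}")
    print(f"corrected effect estimate: {corrected:.3f}")
```

The corrected estimate removes the spurious effect but is noisier than the naive one, because it leans on the smaller labeled group. This simple version makes no attempt to weight the two groups optimally.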

Beyond AI: Problems with Proxy Information

In a separate study also published in Nature Genetics, the UW-Madison team identified problems with studies that use proxy information to fill data gaps. For instance, some researchers use family health history surveys to gather data on Alzheimer's disease, as large health databases often lack information on late-onset conditions [1][2][3].

The team found that such proxy-information studies can produce "highly misleading genetic correlation" between Alzheimer's risk and higher cognitive abilities [1][2][3].

The Importance of Statistical Rigor

Lu emphasizes the critical need for statistical rigor in large-scale genomic research:

"These days, genomic scientists routinely work with biobank datasets that have hundreds of thousands of individuals; however, as statistical power goes up, biases and the probability of errors are also amplified in these massive datasets" [1][2][3].
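
A back-of-the-envelope calculation shows why sample size matters here. The sketch below assumes a variant with no real effect whose estimate carries a small fixed bias (0.01 trait standard deviations per allele, a made-up figure) and computes the z-statistic that bias alone would be expected to produce at different sample sizes. The genotype variance of 0.42 corresponds to an assumed allele frequency of 0.3.

```python
import math

# |z| of about 5.45 corresponds to the conventional genome-wide
# significance threshold of p < 5e-8 (two-sided).
GENOME_WIDE_Z = 5.45


def expected_z(bias, n, genotype_var=0.42, trait_sd=1.0):
    """Expected z-statistic produced by a fixed bias in the effect estimate."""
    standard_error = trait_sd / math.sqrt(n * genotype_var)
    return bias / standard_error


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000, 1_000_000):
        z = expected_z(bias=0.01, n=n)
        flag = "exceeds" if z > GENOME_WIDE_Z else "below"
        print(f"n = {n:>9,}: expected z = {z:5.2f} ({flag} threshold)")
```

Because the standard error shrinks with the square root of the sample size while the bias stays put, a distortion that is invisible in a small study can look like a confident discovery in a biobank-scale one.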

The researchers' work serves as a cautionary tale, highlighting the importance of maintaining statistical integrity in the era of big data and AI-assisted genomic studies.
