Curated by THEOUTPOST
On Fri, 28 Mar, 12:05 AM UTC
3 Sources
[1]
New AI method could predict emerging SARS-CoV-2 variants
Florida Atlantic University, Mar 27 2025
It's been five years since COVID-19 was declared a global pandemic. As SARS-CoV-2 shifts to endemic status, questions about its future evolution remain. New variants of the virus will likely emerge, driven by positive selection for traits such as increased transmissibility, longer infection duration and the ability to evade immune defenses. These changes could allow the virus to spread among previously immunized populations, potentially triggering new waves of infection.
Predicting new mutations in viruses is crucial for advancing life science research, particularly when trying to understand how viruses evolve, spread and affect public health. Traditionally, researchers rely on wet-lab experiments to study mutations. However, these experiments can be costly and time-consuming.
Researchers from the College of Engineering and Computer Science at Florida Atlantic University have developed a new method to predict mutations in protein sequences called Deep Novel Mutation Search (DNMS), a type of artificial intelligence model that uses deep neural networks. For the study, they focused on the SARS-CoV-2 spike protein - the part of the virus responsible for helping it enter human cells - and used a protein language model to predict potential new mutations in this protein never seen before.
To do this, researchers used a language model, ProtBERT, which was specifically fine-tuned to understand the "dialect" of SARS-CoV-2 spike proteins. The model works by looking at potential mutations and ranking them based on several factors. These include grammaticality, which refers to how likely or "correct" a mutation is according to the grammatical rules learned by the model, as well as how similar the mutated sequence is to the original protein, which is measured by semantic change and attention change.
Results of the study, published in the journal Communications Biology, show that the DNMS language model can separate sequences into groups based on their similarities. The model can predict which mutations are likely to occur by looking for mutations that cause only small changes in the protein's structure and function. This is important because, in most cases, viruses like SARS-CoV-2 evolve through small changes that allow them to adapt without drastically altering their overall function. The DNMS method uses all available information about the sequence and the mutations to create a more accurate prediction of which mutations are likely to occur.
Unlike prior research, which typically looks at changes to a reference protein sequence, DNMS introduces a parent-child mutation prediction model. The parent sequence (an existing protein sequence) is used to predict mutations, and these mutations are analyzed based on how they might evolve over time.
"Our model ranks all possible mutations to find the ones that are most likely to occur in the future. Our study shows that mutations following the protein's grammars, with minimal changes compared to the original sequence and low attention differences, are considered the most likely future mutations," said Xingquan "Hill" Zhu, Ph.D., senior author and professor in FAU's Department of Electrical Engineering and Computer Science.
The method first takes a given SARS-CoV-2 spike protein sequence and simulates all possible single-point mutations. For each mutated version of the protein, DNMS uses the ProtBERT model to calculate how likely each mutation is to follow the "grammar" of the protein (grammaticality) and how similar the mutated sequence is to the original sequence (semantic change). Additionally, the model looks at attention, a measure that has been used to study protein structure and function, but never before applied to mutation prediction.
"The key to our method lies in using the context provided by the parent sequence. This context is crucial for evaluating whether a potential mutation aligns with the 'grammar' of the protein," said Zhu. "DNMS works by selecting a parent sequence from a phylogenetic tree - basically a family tree of viral strains - and simulating all possible mutations."
The study also looked at the relationship between the predicted mutations and the virus's fitness, or how well it can replicate and survive. Findings show that mutations with high grammaticality, small semantic change, and low attention change were associated with higher viral fitness. This suggests that mutations which fit well within the biological "rules" of the protein and cause minimal disruption to the protein's structure or function are more likely to be beneficial for the virus. "We believe that using sequence data alone can help make these predictions, as proteins follow certain biological rules," said Zhu.
The researchers tested the effectiveness of DNMS through statistical analysis. Their results show that DNMS outperforms other methods in predicting novel mutations because it combines all the relevant factors into a single, more accurate prediction model.
"The fine-tuned, pre-trained language model developed by our researchers can predict which SARS-CoV-2 mutations are more likely to occur in the future," said Stella Batalama, Ph.D., dean of the College of Engineering and Computer Science. "This method can be useful for guiding experimental research, as it provides predictions about mutations before they are observed in the population, helping public health officials track and prepare for new mutations before they spread widely."
The study's co-author is Magdalyn E. Elkin, a doctoral student in FAU's Department of Electrical Engineering and Computer Science. The research was sponsored by the United States National Science Foundation.
Florida Atlantic University
Journal reference: Elkin, M. E., & Zhu, X. (2025). Paying attention to the SARS-CoV-2 dialect: a deep neural network approach to predicting novel protein mutations. Communications Biology. doi.org/10.1038/s42003-024-07262-7.
[2]
AI learns to 'speak' genetic 'dialect' for future SARS-CoV-2 mutation prediction
[3]
AI Learns to 'Speak' Genetic 'Dialect' for Future SARS-CoV-2 Mutation Prediction | Newswise
Researchers at Florida Atlantic University have developed an AI model called Deep Novel Mutation Search (DNMS) that can predict future mutations in the SARS-CoV-2 virus by understanding the 'dialect' of its spike protein.
Researchers from Florida Atlantic University's College of Engineering and Computer Science have developed an artificial intelligence model that predicts likely future mutations in the SARS-CoV-2 virus. The work comes five years after COVID-19 was declared a global pandemic, as the virus transitions to endemic status [1].
The new method, called Deep Novel Mutation Search (DNMS), uses deep neural networks to predict mutations in protein sequences. Focusing on the SARS-CoV-2 spike protein, which helps the virus enter human cells, the researchers employed a protein language model to forecast potential new mutations [2].
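At the core of DNMS is ProtBERT, a language model fine-tuned to comprehend the "dialect" of SARS-CoV-2 spike proteins. This model evaluates and ranks potential mutations based on several factors: grammaticality, which is how likely or "correct" a mutation is according to the grammatical rules the model has learned, and similarity of the mutated sequence to the original protein, which is measured by semantic change and attention change.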
Unlike previous research that typically examines changes to a reference protein sequence, DNMS introduces a parent-child mutation prediction model. This approach uses an existing protein sequence (parent) to predict mutations and analyzes how these mutations might evolve over time [1].
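The source articles describe the first step of DNMS as simulating all possible single-point mutations of a chosen parent sequence. A minimal sketch of that enumeration follows, assuming substitutions only over the 20 standard amino acids (no insertions or deletions). The function name `single_point_mutants` is hypothetical, and the short example string is a toy placeholder; a real spike sequence is far longer.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def single_point_mutants(parent):
    """Yield (label, sequence) for every single-residue substitution of `parent`.

    Labels use the common <old><1-based position><new> notation, e.g. "M1A".
    """
    for pos, old in enumerate(parent):
        for new in AMINO_ACIDS:
            if new == old:
                continue  # skip the unchanged residue
            yield f"{old}{pos + 1}{new}", parent[:pos] + new + parent[pos + 1:]


parent = "MFVFL"  # toy placeholder, not a full spike protein
mutants = list(single_point_mutants(parent))
print(len(mutants))  # 19 substitutions per position
print(mutants[0])
```

Each candidate differs from the parent at exactly one position, so a sequence of length n produces 19 × n candidates to score.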
The study also explored the relationship between predicted mutations and viral fitness. Findings indicate that mutations with high grammaticality, small semantic change, and low attention change were associated with higher viral fitness. This suggests that mutations adhering to the protein's biological "rules" and causing minimal disruption are more likely to benefit the virus [2].
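One common way to check an association like this is a rank correlation between a mutation score and a fitness measure. The sketch below implements Spearman's rank correlation in plain Python. Choosing Spearman is an assumption for illustration, since the source articles do not say which statistic the study used, and the function names and input numbers are hypothetical placeholders rather than study data.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Placeholder numbers only: a perfectly monotonic pair, so rho is 1
# up to floating-point rounding.
scores = [0.1, 0.4, 0.5, 0.9]
fitness = [1.0, 2.0, 4.0, 8.0]
print(spearman(scores, fitness))
```

A rank-based statistic is a natural fit when the scores themselves are only used for ranking, because it ignores their scale and responds only to ordering.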
Dr. Xingquan "Hill" Zhu, senior author and professor at FAU's Department of Electrical Engineering and Computer Science, emphasized the model's ability to rank all possible mutations and identify those most likely to occur in the future. This capability could prove invaluable for guiding experimental research and helping public health officials prepare for new mutations before they spread widely [3].
DNMS offers an alternative to traditional wet-lab experiments, which can be costly and time-consuming. By using sequence data alone and leveraging the biological rules that proteins follow, DNMS provides a more efficient and cost-effective method for predicting viral mutations [1].
The research, sponsored by the United States National Science Foundation, represents a significant step forward in our ability to anticipate and prepare for future viral threats. As SARS-CoV-2 continues to evolve, tools like DNMS may prove crucial in staying ahead of the virus and protecting public health.
Reference
[2] Medical Xpress - Medical and Health News | AI learns to 'speak' genetic 'dialect' for future SARS-CoV-2 mutation prediction
© 2025 TheOutpost.AI All rights reserved