

2 Sources
[1]

Cracking the genome's switchboard: How AI helps decode gene regulation
Understanding human biology requires more than mapping our genes -- we must also understand how gene expression is regulated to guide healthy development, growth, and maintenance of our body systems over a lifetime. Scientists at Lawrence Berkeley National Laboratory (Berkeley Lab) and Stanford University have revealed new insights into how regulatory sequences called enhancers drive gene expression during embryonic development. Enhancers are sections of DNA that orchestrate the expression of a gene despite being located far away from the actual coding sequence. Their work, published today in Nature, shows how multiple short, modular sequences within an enhancer are needed to properly guide expression, and that even a single nucleotide mutation in one of these regions can change how and where a gene is activated. The team also used their experimental results to develop and assess a machine learning model. "These findings show that even the smallest changes can have huge impacts," said first author Michael Kosicki, a postdoctoral researcher at Berkeley Lab. "Our approach will give scientists a powerful tool to investigate normal gene regulation and unravel the increasingly strong link between disease and variations in the non-coding genome." The Berkeley Lab team members used a mouse model to study seven human enhancers known to govern the development of the brain, heart, limbs, and face. They created a huge variety of different mutations in those enhancers, then looked for changes to developing tissues across the whole body. In one striking example, alterations to an enhancer associated with building structures in the face and limbs caused it to activate in the heart and nervous system tissues instead. 
"Seeing that just a single base pair mutation can change where in the body an enhancer activates, and potentially change how the body develops, has profound implications for studying human disorders and designing gene therapies," said co-lead author Len Pennacchio, a senior scientist in Berkeley Lab's Environmental Genomics and Systems Biology (EGSB) division. "It also means scientists need to be cautious when designing tissue-specific gene therapies, to avoid unintended effects." Investigating enhancers has always been challenging, because each of these sequences contains multiple binding sites for transcription factors, the molecules that switch DNA transcription on or off. The effects of mutations depend on the specific combination and location of sites that are altered, and can only be revealed through systematic experiments. This complexity, and the lack of sufficient data to train machine learning algorithms, makes it difficult to build accurate predictive models. Using the large experimental dataset the Berkeley Lab team created, the Stanford collaborators developed a new model and tested whether it could identify the same important sequences revealed in the experiments. "We wanted to explore how far AI can take us in understanding enhancer biology right now," said Axel Visel, senior scientist at Berkeley Lab and a co-corresponding author. They found that although the model could identify many functionally important regions of enhancers by searching for sequence patterns known to indicate binding sites, it missed other sequences that are clearly critical based on the team's experimental evidence. "Currently available models tell the truth, but not the whole truth," said Kosicki. "In other words, the predictions we have are typically correct, but they sometimes miss functional regions we identified experimentally. Identifying these modeling blind spots will help us improve them in the future." 
For now, the findings serve as both a resource and a reminder: Even the best predictive models need grounding in experimental biology. As researchers continue to refine AI tools, studies like this one will be essential for revealing what those models get right and where they still fall short.
[2]

Cracking the Genome's Switchboard: How AI Helps Decode Gene Regulation
Newswise -- Understanding human biology requires more than mapping our genes - we must also understand how gene expression is regulated to guide healthy development, growth, and maintenance of our body systems over a lifetime. Scientists at Lawrence Berkeley National Laboratory (Berkeley Lab) and Stanford University have revealed new insights into how regulatory sequences called enhancers drive gene expression during embryonic development. Enhancers are sections of DNA that orchestrate the expression of a gene despite being located far away from the actual coding sequence. Their work, now published in Nature, shows how multiple short, modular sequences within an enhancer are needed to properly guide expression, and that even a single nucleotide mutation in one of these regions can change how and where a gene is activated. The team also used their experimental results to develop and assess a machine learning model. "These findings show that even the smallest changes can have huge impacts," said first author Michael Kosicki, a postdoctoral researcher at Berkeley Lab. "Our approach will give scientists a powerful tool to investigate normal gene regulation and unravel the increasingly strong link between disease and variations in the non-coding genome." The Berkeley Lab team members used a mouse model to study seven human enhancers known to govern development of the brain, heart, limbs, and face. They created a huge variety of different mutations in those enhancers, then looked for changes to developing tissues across the whole body. In one striking example, alterations to an enhancer associated with building structures in the face and limbs caused it to activate in the heart and nervous system tissue instead. 
"Seeing that just a single base pair mutation can change where in the body an enhancer activates, and potentially change how the body develops, has profound implications for studying human disorders and designing gene therapies," said co-lead author Len Pennacchio, a senior scientist in Berkeley Lab's Environmental Genomics and Systems Biology (EGSB) division. "It also means scientists need to be cautious when designing tissue-specific gene therapies, to avoid unintended effects." Investigating enhancers has always been challenging, because each of these sequences contains multiple binding sites for transcription factors, the molecules that switch DNA transcription on or off. The effects of mutations depend on the specific combination and location of sites that are altered, and can only be revealed through systematic experiments. This complexity, and the lack of sufficient data to train machine learning algorithms, makes it difficult to build accurate predictive models. Using the large experimental dataset the Berkeley Lab team created, the Stanford collaborators developed a new model and tested whether it could identify the same important sequences revealed in the experiments. "We wanted to explore how far AI can take us in understanding enhancer biology right now," said Axel Visel, senior scientist at Berkeley Lab and a co-corresponding author. They found that although the model could identify many functionally important regions of enhancers by searching for sequence patterns known to indicate binding sites, it missed other sequences that are clearly critical based on the team's experimental evidence. "Currently available models tell the truth, but not the whole truth," said Kosicki. "In other words, the predictions we have are typically correct, but they sometimes miss functional regions we identified experimentally. Identifying these modeling blind spots will help us improve them in the future." 
For now, the findings serve as both a resource and a reminder: Even the best predictive models need grounding in experimental biology. As researchers continue to refine AI tools, studies like this one will be essential for revealing what those models get right and where they still fall short. This research was supported by the National Institutes of Health.
Scientists at Berkeley Lab and Stanford University use AI to decode gene regulation, revealing how small mutations in enhancer sequences can dramatically alter gene expression and embryonic development.
Scientists at Lawrence Berkeley National Laboratory (Berkeley Lab) and Stanford University have made significant strides in understanding gene regulation, focusing on enhancers, the DNA sequences that orchestrate gene expression. Their findings, published in Nature, reveal how even minor changes in these regulatory sequences can have profound impacts on embryonic development and gene activation patterns [1][2].
Enhancers are sections of DNA located far from the genes they regulate, yet they play a vital role in controlling gene expression. The research team used a mouse model to study seven human enhancers known to govern the development of the brain, heart, limbs, and face. They created numerous mutations in these enhancers and observed the resulting changes in developing tissues across the entire body [1].
One of the most striking discoveries was that a single base pair mutation in an enhancer could dramatically alter its activation pattern. For instance, modifications to an enhancer associated with facial and limb structure development caused it to activate in heart and nervous system tissues instead. This finding has significant implications for studying human disorders and designing gene therapies [2].
The research team developed and assessed a machine learning model using their extensive experimental dataset. While the AI model successfully identified many functionally important regions of enhancers by recognizing sequence patterns indicative of binding sites, it also missed some critical sequences that were experimentally proven to be important [1][2].
Michael Kosicki, the first author of the study, noted that "Currently available models tell the truth, but not the whole truth." This highlights a crucial aspect of AI in genomic research: while the predictions are typically correct, they may overlook functional regions identified through experimental methods. Identifying these "modeling blind spots" will be key to improving future AI models in this field [1].
The study's findings have significant implications for both studying human disorders and designing gene therapies. Len Pennacchio, a senior scientist at Berkeley Lab, emphasized the need for caution when designing tissue-specific gene therapies to avoid unintended effects, given that even small mutations can lead to substantial changes in gene expression patterns [2].
Investigating enhancers has always been challenging due to their complex nature. Each enhancer contains multiple binding sites for transcription factors, and the effects of mutations depend on the specific combination and location of altered sites. This complexity, coupled with the lack of sufficient data to train machine learning algorithms, has made it difficult to build accurate predictive models [1][2].
While the current AI models show promise, the study serves as a reminder that even the best predictive models need grounding in experimental biology. As researchers continue to refine AI tools, studies like this will be essential for revealing what these models get right and where they still fall short. The findings from this research will serve as both a resource and a guide for future developments in the field of genomics and AI-assisted biological research [1][2].
Summarized by Navi