6 Sources
[1]
DeepMind's new AlphaGenome AI tackles the 'dark matter' in our DNA
Nearly 25 years after scientists completed a draft human genome sequence, many of its 3.1 billion letters remain a puzzle. The 98% of the genome that is not made of protein-coding genes -- but which can influence their activity -- is especially vexing. An artificial intelligence (AI) model developed by Google DeepMind in London could help scientists to make sense of this 'dark matter', and see how it might contribute to diseases such as cancer and influence the inner workings of cells. The model, called AlphaGenome, is described in a 25 June preprint. "This is one of the most fundamental problems not just in biology -- in all of science," Pushmeet Kohli, the company's head of AI for science said at a press briefing. The 'sequence to function' model takes long stretches of DNA and predicts various properties, such as the expression levels of the genes they contain and how those levels could be affected by mutations. "I think it is an exciting leap forward," says Anshul Kundaje, a computational genomicist at Stanford University in Palo Alto, California, who has had early access to AlphaGenome. "It is a genuine improvement in pretty much all current state-of-the-art sequence-to-function models." When DeepMind unveiled AlphaFold 2 in 2020, it went a long way to solving a problem that had challenged researchers for decades: determining how a protein's sequence contributes to its three-dimensional shape. Working out what DNA sequences do is different, because there is no one answer, as in a 3D structure that AlphaFold delivers. A single DNA stretch will have numerous, interconnected roles -- from attracting one set of cellular machinery to latch onto a particular section of a chromosome and turn a nearby gene into an RNA molecule, to attracting protein-transcription factors that influence where, when and to what extent gene expression occurs. 
Many DNA sequences, for example, influence gene activity by altering a chromosome's 3D shape, either restricting or easing access for the machinery that does the transcription. Biologists have been chipping away at this question for decades with various kinds of computational tools. In the last decade or so, scientists have developed dozens of AI models to make sense of the genome. Many of these have focused on an individual task, such as predicting levels of gene expression or determining how modular segments of individual genes, called exons, are cut-and-pasted into distinct proteins. But scientists are increasingly interested in 'all in one' tools for interpreting DNA sequences. AlphaGenome is one such model. It can take inputs of up to one million DNA letters -- a stretch that could include a gene and myriad regulatory elements -- and make thousands of predictions about numerous biological properties. In many cases, AlphaGenome's predictions are sensitive to single-DNA-letter changes, which means that scientists can predict the consequences of mutations. In one example, DeepMind researchers applied the AlphaGenome model to diverse mutations identified in previous studies in people with a type of leukaemia. The model accurately predicted that the non-coding mutations indirectly activated a nearby gene that is a common driver of this cancer. AlphaGenome was trained on genomic and other experimental data from humans and mice only. It might work as well on related organisms, but the researchers didn't test this, said Žiga Avsec, a DeepMind scientist, at the briefing. Neither was the model designed to reliably interpret an individual's genome, or to provide a full picture of how variants influence complex diseases. There is room for improvement in the accuracy of the AlphaGenome's predictions. For instance, the model struggles to identify sequences that alter the expression of a gene located more than 100,000 base pairs away. 
"This model has not yet 'solved' gene regulation to the same extent as AlphaFold has, for example, protein 3D structure prediction," adds Kundaje. One thing that AlphaGenome -- and similar models -- don't yet capture is how a cell's changing nature can affect how DNA sequences function, says Peter Koo, a computational biologist at Cold Spring Harbor Laboratory in New York. These models are trained to make predictions in one fixed setting, but cells are dynamic: protein levels, chemical tags on DNA, and other conditions can shift over time or between cell types -- and that can change how the same sequence behaves. Koo predicts that researchers will build on AlphaGenome by using the model to design 'regulatory' DNA sequences that allow control over when and where a gene is active, for instance, or to run virtual experiments that simulate how cells respond to genetic changes. For now, researchers doing non-commercial work can access the model through DeepMind's servers using a programming interface. A fuller release -- that would enable more-sophisticated applications -- is planned for the future.
[2]
DeepMind's latest AI tool makes sense of changes in the human genome
The human genome offers a complete instruction manual for building a person, but it's a tough read. How does a given letter among billions of DNA bases affect how the body functions? Now, DeepMind, the Google spinoff behind the artificial intelligence (AI) model AlphaFold that mastered how proteins fold into their 3D shapes, has tackled that broader challenge. Its new AI tool, AlphaGenome, reveals how simple genetic changes affect the expression of genes, altering the kinds and amounts of RNA and proteins they produce. AlphaGenome, described today in a preprint, is expected to make it easier for researchers to pinpoint the causes of diseases by more accurately tracking the consequences of genetic mutations, and help synthetic biologists design new genes from scratch. DeepMind officials say they are still working out how they will make the tool commercially available. But academic researchers can use it for free. "This is going to be an extremely useful tool," says Caleb Lareau, a systems biologist at Memorial Sloan Kettering Cancer Center who was given early access to the AI. "This is the most comprehensive attempt to annotate and explain every possible change of the 3-billion-letter sequence in the human genome. It's the strongest in silico tool we've had to date." DeepMind's latest AI builds on AlphaFold's previous success at predicting protein folding, which earned a pair of company researchers a share of last year's Nobel Prize in Chemistry. However, understanding how changes in DNA affect an organism broadly "is more of a fuzzy field," says Natasha Latysheva, a DeepMind genome researcher. A single genetic mutation can have complex, cascading effects on gene expression and how much of a protein is produced. AI developers have been tackling this complexity piece by piece. 
They've designed individual algorithms to search swaths of DNA for likely protein coding regions, recognize genetic patterns associated with disease, interpret genetic variations, predict disease risk, and help tailor clinical treatments. AlphaGenome unites many of these analyses and others into a single package. It relies on massive molecular biology databases produced over decades by publicly funded consortia. These include the results of experiments tracking how certain mutations in human and mouse cells affect properties such as the production of RNA, which translates the genome's blueprints into proteins, and levels of transcription factors, proteins that can turn genes on and off. By training on those data sets, AlphaGenome has learned to decipher DNA, identifying both genes and the nongene sequences that orchestrate gene activity, along with the genetic variants most likely to produce consequential changes. To use the new AI, researchers feed it a DNA sequence up to 1 million base pairs long. The model then predicts the locations where genes start and end, which can vary among different cell types. It also captures the intricacies of RNA processing -- which forms of RNA are created from those genes, and how much RNA they produce. AlphaGenome can then predict how altering a single letter of DNA affects the expression of genes and alters their RNA and protein products. Whereas other AIs can do some of this analysis for the estimated 2% of the genome in protein-coding genes, AlphaGenome is the first to manage the same feat for the full genome. "For the first time, an AI model can predict exactly where and how an RNA [variant] is expressed directly from a sequence of DNA," says Hani Goodarzi, a genomics AI model builder at the University of California San Francisco. "This allows us to see not just if a gene is expressed, but how the resulting RNA will be processed." 
The DeepMind team reports that AlphaGenome outperformed 22 of 24 other computer models at identifying specific features in single DNA sequences, such as coding and noncoding regions as well as transcription factor binding sites, and bested 24 of 26 models on predicting the effect of a variant on gene regulation. Researchers expect the new AI to help them pinpoint consequential genetic changes. Marc Mansour, a cancer molecular biologist at University College London, says when his lab compares the genomes of cells from patients' cancerous tissues with their unaffected cells, thousands of individual letter changes emerge. "It's very hard to work out whether any particular change will have a functional consequence," he says. AlphaGenome, he notes, ranks the variants most likely to be consequential, allowing researchers to focus their follow-up studies. That ranking capability "is hugely important for my research," adds Lareau, whose lab analyzes the effect of genetic changes on immune function. "Instead of testing hundreds of things, I can focus on a couple, having been guided to the right spot." The power to predict how genetic changes affect gene expression should be equally valuable to synthetic biologists, Latysheva adds. The AI could suggest whether newly devised genetic sequences would have beneficial effects, before further testing those effects in lab experiments. AlphaGenome's developers plan to release its underlying source code and model weights, which determine how a model generates an output, when a peer-reviewed version of the paper is published, enabling researchers to customize it for their own projects. Asked whether that could make it easier for bad actors to design bioweapons, Pushmeet Kohli, DeepMind's vice president of research, said the company shared the model with outside biosecurity experts. "We got feedback that it is quite safe, and that in releasing it, the benefits far outweigh the risks." 
Kohli added that DeepMind hopes to continue to expand AlphaGenome's capabilities, such as providing better insight into how genetic variations lead to complex traits or diseases. Kohli says: "What we have today is like AlphaFold 1 -- a big first step."
[3]
Google's new AI will help researchers understand how our genes work
Now Google's DeepMind division says it's made a leap in trying to understand the code with AlphaGenome, an AI model that predicts what effects small changes in DNA will have on an array of molecular processes, such as whether a gene's activity will go up or down. It's just the sort of question biologists regularly assess in lab experiments. "We have, for the first time, created a single model that unifies many different challenges that come with understanding the genome," says Pushmeet Kohli, a vice president for research at DeepMind. Five years ago, the Google AI division released AlphaFold, a technology for predicting the 3D shape of proteins. That work was honored with a Nobel Prize last year and spawned a drug-discovery spinout, Isomorphic Labs, and a boom of companies that hope AI will be able to propose new drugs. AlphaGenome is an attempt to further smooth biologists' work by answering basic questions about how changing DNA letters alters gene activity and, eventually, how genetic mutations affect our health. "We have these 3 billion letters of DNA that make up a human genome, but every person is slightly different, and we don't fully understand what those differences do," says Caleb Lareau, a computational biologist at Memorial Sloan Kettering Cancer Center who has had early access to AlphaGenome. "This is the most powerful tool to date to model that." Google says AlphaGenome will be free for noncommercial users and plans to release full details of the model in the future. According to Kohli, the company is exploring ways to "enable use of this model by commercial entities" such as biotech companies.
[4]
Google unveils new AI model to decode one million DNA letters at once
AlphaGenome brings base-resolution insight to long-range genomic analysis, decoding the impact of mutations with speed, scale, and unprecedented depth. The model processes up to 1 million base pairs in a single pass and predicts thousands of molecular properties, including gene expression, splicing patterns, protein-binding sites, and chromatin accessibility across diverse cell types. It's the first time such a wide range of regulatory features can be modeled jointly using one AI system. AlphaGenome's architecture first uses convolutional layers to spot short patterns in the DNA sequence, then applies transformers to share information across the entire stretch of genetic code. A final set of layers converts these learned patterns into predictions across various genomic features. During training, all computations for a single sequence are distributed across multiple interconnected Tensor Processing Units (TPUs), enabling efficient large-scale processing. A single model was trained in just four hours, using half the compute budget required for its predecessor, Enformer.
[5]
AlphaGenome: AI for better understanding the genome
Introducing a new, unifying DNA sequence model that advances regulatory variant-effect prediction and promises to shed new light on genome function -- now available via API. The genome is our cellular instruction manual. It's the complete set of DNA which guides nearly every part of a living organism, from appearance and function to growth and reproduction. Small variations in a genome's DNA sequence can alter an organism's response to its environment or its susceptibility to disease. But deciphering how the genome's instructions are read at the molecular level -- and what happens when a small DNA variation occurs -- is still one of biology's greatest mysteries. Today, we introduce AlphaGenome, a new artificial intelligence (AI) tool that more comprehensively and accurately predicts how single variants or mutations in human DNA sequences impact a wide range of biological processes regulating genes. This was enabled, among other factors, by technical advances allowing the model to process long DNA sequences and output high-resolution predictions. To advance scientific research, we're making AlphaGenome available in preview via our AlphaGenome API for non-commercial research, and planning to release the model in the future. We believe AlphaGenome can be a valuable resource for the scientific community, helping scientists better understand genome function, disease biology, and ultimately, drive new biological discoveries and the development of new treatments. Our AlphaGenome model takes a long DNA sequence as input -- up to 1 million letters, also known as base-pairs -- and predicts thousands of molecular properties characterising its regulatory activity. It can also score the effects of genetic variants or mutations by comparing predictions of mutated sequences with unmutated ones. 
Predicted properties include where genes start and where they end in different cell types and tissues, where they get spliced, the amount of RNA being produced, and also which DNA bases are accessible, close to one another, or bound by certain proteins. Training data was sourced from large public consortia including ENCODE, GTEx, 4D Nucleome and FANTOM5, which experimentally measured these properties covering important modalities of gene regulation across hundreds of human and mouse cell types and tissues.
[6]
DeepMind launches AlphaGenome to predict how DNA mutations affect genes - SiliconANGLE
Alphabet Inc.'s Google DeepMind today introduced AlphaGenome, a new artificial intelligence tool that can comprehensively predict how mutations or variants in human DNA sequences impact gene regulation. The genome is the complete set of deoxyribonucleic acid, or DNA, within a living cell, which includes all the genetic information necessary for development, growth, and functioning. In humans, the genome consists of 23 pairs of chromosomes located in the nucleus of the cell, and it regulates everything, including the response to the environment and susceptibility to disease. The new AlphaGenome model can take an extremely long DNA sequence as input -- up to 1 million letters, also known as base pairs -- and predict thousands of molecular properties. These letters are the familiar A, T, C and G. The properties that it can predict include where genes start and where they end in different cell types and tissues, where they get spliced and the amount of RNA they produce. RNA is translated into proteins, the building blocks of tissues and enzymes, which are required to take action in the body. The model can also predict which DNA bases are accessible, close to one another or bound by certain proteins. DeepMind trained the model on a considerable amount of scientific data from large public consortia that include information about gene regulation, including ENCODE, GTEx, 4D Nucleome and others. Not only can the AI model "see" a large number of DNA letters at once to make predictions about how the genes will behave, but it can also make those predictions at the resolution of individual letters. The long sequence length is crucial for covering regulatory regions that lie far from the genes they control. "Previous models had to trade off sequence length and resolution, which limited the range of modalities they could jointly model and accurately predict," the DeepMind team said.
This capability makes the AI model useful for predicting "splice" errors. These errors can cause rare genetic diseases like spinal muscular atrophy and some forms of cystic fibrosis. Think of DNA as the script for a training video, and ribonucleic acid, or RNA, as the raw footage. Before the final cut, the cell "edits" the RNA, removing unnecessary parts and stitching together the important scenes. But sometimes, the editing goes wrong -- key scenes are left out, or extra ones are included -- resulting in a flawed final product. These mistakes, called splice junction errors, can disrupt how the body works. According to DeepMind, AlphaGenome achieves state-of-the-art performance across a wide range of genomic prediction benchmarks, including predicting which parts of the DNA molecule will be in close proximity, whether genetic variants will increase or decrease expression of a gene or if it will change a gene-splicing pattern. "It's a milestone for the field. For the first time, we have a single model that unifies long-range context, base-level precision and state-of-the-art performance across a whole spectrum of genomic tasks," said Dr. Caleb Lareau, a researcher at Memorial Sloan Kettering Cancer Center. DeepMind said that it expects that AlphaGenome will be a powerful research tool for disease understanding by helping to accurately predict genetic disruptions. It could also be used to help guide the design of synthetic DNA with specific regulatory functions and accelerate the understanding of the genome by assisting in the understanding of its crucial functional elements.
Google DeepMind unveils AlphaGenome, an AI model that predicts how DNA sequences affect gene expression and regulation, potentially revolutionizing genomics research and disease understanding.
Google DeepMind has unveiled AlphaGenome, a groundbreaking artificial intelligence (AI) model designed to decode the complexities of the human genome. This new tool aims to shed light on the 98% of the genome that doesn't code for proteins - often referred to as the genomic "dark matter" - and predict how changes in DNA sequences affect gene expression and regulation 1.
AlphaGenome can process up to one million DNA base pairs in a single pass and predict thousands of molecular properties, including gene expression, splicing patterns, protein-binding sites and chromatin accessibility 4. DeepMind reports that it outperformed 22 of 24 other models at identifying specific features in single DNA sequences, such as coding and non-coding regions and transcription factor binding sites, and 24 of 26 models at predicting the effect of a variant on gene regulation 2.
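Source 4 describes the architecture as convolutional layers that spot short patterns, transformers that share information across the whole sequence, and final layers that turn the result into predictions. The three stages can be illustrated with a toy sketch in plain Python. This is not DeepMind's code: the function names, the single hand-written "TATA" filter and the scalar attention are assumptions chosen only to show how data flows through the stages.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of four numbers (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d(x, kernel):
    """Stage 1: slide a (width x 4) filter along the sequence to detect a short pattern."""
    width = len(kernel)
    out = []
    for i in range(len(x) - width + 1):
        total = 0.0
        for k in range(width):
            for c in range(4):
                total += x[i + k][c] * kernel[k][c]
        out.append(total)
    return out

def self_attention(values):
    """Stage 2: toy single-head attention over scalar features.
    Each position receives a softmax-weighted average of all positions."""
    mixed = []
    for q in values:
        scores = [q * k for k in values]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        mixed.append(sum(w * v for w, v in zip(weights, values)) / norm)
    return mixed

def output_head(features, weight=1.0, bias=0.0):
    """Stage 3: map shared features to one prediction per position."""
    return [weight * f + bias for f in features]

def toy_model(seq, kernel):
    """Chain the three stages: local patterns, global mixing, prediction."""
    return output_head(self_attention(conv1d(one_hot(seq), kernel)))

# Hypothetical filter that responds most strongly to the motif "TATA".
TATA_KERNEL = [one_hot(b)[0] for b in "TATA"]
track = toy_model("GGTATAGG", TATA_KERNEL)
print(track)
```

The output is one value per filter position, with the largest value where the motif sits. A real model learns many filters and uses far richer attention layers, so this sketch conveys only the order of operations.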
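Key capabilities of AlphaGenome include predicting where genes start and end in different cell types and tissues, where they are spliced, how much RNA is produced, and which DNA bases are accessible, close to one another, or bound by certain proteins. It can also score the effect of a variant by comparing predictions for the mutated and unmutated sequences 5.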
The model was trained on extensive genomic and experimental data from humans and mice, sourced from public consortia such as ENCODE, GTEx, 4D Nucleome, and FANTOM5 5.
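According to source 5, the model scores a variant by comparing its predictions for the mutated sequence with those for the unmutated one. A minimal sketch of that comparison follows. The predictor here is a stand-in that simply counts a hypothetical motif; it is not AlphaGenome and does not use the AlphaGenome API, and every function name is an assumption made for illustration.

```python
def toy_predict(seq):
    """Stand-in for a sequence-to-function model (NOT AlphaGenome):
    counts occurrences of a hypothetical activating motif."""
    motif = "TATA"
    return float(sum(1 for i in range(len(seq) - len(motif) + 1)
                     if seq[i:i + len(motif)] == motif))

def apply_snv(seq, position, ref, alt):
    """Return seq with a single-letter change at a 0-based position."""
    if seq[position] != ref:
        raise ValueError(f"expected {ref!r} at {position}, found {seq[position]!r}")
    return seq[:position] + alt + seq[position + 1:]

def score_variant(seq, position, ref, alt, predict=toy_predict):
    """Effect score = prediction for mutated sequence minus prediction for reference."""
    return predict(apply_snv(seq, position, ref, alt)) - predict(seq)

reference = "GGTACAGG"
gain = score_variant(reference, 4, "C", "T")    # C>T creates a TATA motif
loss = score_variant("GGTATAGG", 4, "T", "C")   # T>C destroys it
print(gain, loss)
```

A positive score means the change raises the predicted signal and a negative score means it lowers it. The reference check in `apply_snv` guards against scoring a variant at the wrong coordinate.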
AlphaGenome's ability to predict the consequences of genetic mutations could have far-reaching implications for medical research and synthetic biology. Researchers expect the tool to help pinpoint consequential genetic changes in diseases such as cancer, allowing them to focus their studies more effectively 2.
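Source 2 reports that researchers use the model's output to rank variants so that follow-up experiments can focus on the few most likely to matter. One simple way to do that, assuming each variant already has a signed effect score, is to sort by the size of the effect. The variant names and scores below are hypothetical placeholders, not model output.

```python
from typing import NamedTuple

class ScoredVariant(NamedTuple):
    name: str
    effect: float  # signed predicted change; sign gives the direction

def rank_variants(scored, top_n=None):
    """Order variants by absolute predicted effect, largest first."""
    ranked = sorted(scored, key=lambda v: abs(v.effect), reverse=True)
    return ranked if top_n is None else ranked[:top_n]

# Hypothetical scores for illustration only.
candidates = [
    ScoredVariant("variant_a", 0.02),
    ScoredVariant("variant_b", -1.40),
    ScoredVariant("variant_c", 0.35),
]
shortlist = rank_variants(candidates, top_n=2)
print([v.name for v in shortlist])
```

Sorting on the absolute value keeps strong decreases in expression alongside strong increases, since either direction can be functionally important.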
Caleb Lareau, a systems biologist at Memorial Sloan Kettering Cancer Center, describes AlphaGenome as "the most comprehensive attempt to annotate and explain every possible change of the 3-billion-letter sequence in the human genome" 2.
AlphaGenome has not done for gene regulation what AlphaFold did for protein structure prediction. Anshul Kundaje, a computational genomicist at Stanford University, says the model "has not yet 'solved' gene regulation to the same extent as AlphaFold has, for example, protein 3D structure prediction". Its accuracy also leaves room for improvement: the model struggles to identify sequences that alter the expression of a gene located more than 100,000 base pairs away 1.
Pushmeet Kohli, DeepMind's vice president of research, describes the current model as "like AlphaFold 1", which he calls "a big first step". He says DeepMind hopes to expand its capabilities, for example by providing better insight into how genetic variations lead to complex traits or diseases 2.
DeepMind is making AlphaGenome available for non-commercial research through an API, with plans to release the full model in the future 3 5. Asked about misuse, the company said it had shared the model with outside biosecurity experts, whose feedback was that the benefits of release far outweigh the risks 2.
As AlphaGenome continues to evolve, it promises to be a valuable tool for researchers, potentially accelerating our understanding of the human genome and paving the way for new biological discoveries and treatment developments.