3 Sources
[1]
Order from disordered proteins: Physics-based algorithm designs biomolecules with custom properties
In synthetic and structural biology, advances in artificial intelligence have led to an explosion of designing new proteins with specific functions, from antibodies to blood clotting agents, by using computers to accurately predict the 3D structure of any given amino acid sequence. But the structures of close to 30% of all proteins expressed by the human genome are challenging to predict for even the most powerful AI tools, including the Nobel-winning AlphaFold. Never settling into a fixed shape but constantly shifting around, these so-called intrinsically disordered proteins are key to countless biological functions like cross-linking molecules, sensing, or signaling, but their inherent instability makes them difficult to design from scratch.
A team at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) and Northwestern University has demonstrated a new machine learning method that can design intrinsically disordered proteins with tailored properties. The work opens doors to new understanding of these mysterious biomolecules and possible new insights into the origins of and treatments for diseases.
The work is published in Nature Computational Science and was co-led by SEAS graduate student Ryan Krueger and former NSF-Simons QuantBio Fellow Krishna Shrinivas, now an assistant professor at Northwestern, in collaboration with Michael Brenner, the Catalyst Professor of Applied Mathematics and Applied Physics at SEAS.
Shrinivas said he became interested in studying intrinsically disordered proteins because they are out of reach of current AI-based methods, such as Google DeepMind's AlphaFold, for predicting and designing proteins with distinct conformations. Yet such disordered proteins are important to many fundamental aspects of biology, and mutations to these proteins are known to be linked to diseases like cancer and neurodegeneration. One example of a disordered protein is alpha-synuclein, long implicated in Parkinson's and other diseases.
To design IDPs for synthetic or therapeutic uses, Shrinivas said, "we needed to either come up with better AI models, or, we needed to come up with a way to actually take those physics models where you not only get good predictions, but you also get the physics for free."
Automatic differentiation algorithms
The paper describes a computational method powered by algorithms that can perform "automatic differentiation," or automatic computation of derivatives (instantaneous rates of change), in order to rationally select for protein sequences with desired behaviors or properties. The technique is a widely used tool for deep learning and training neural networks, but Brenner and his lab were among the first to recognize other potential use cases, such as optimizing physics-based molecular dynamics simulations.
With automatic differentiation, the researchers were able to train a computer to recognize how small changes in protein sequences, even single amino acid changes, affect the final desired properties of proteins. They likened their method to a very powerful search engine for amino acid sequences that fit the criteria needed to perform a function, such as creating loops or connectors or sensing different things in the environment.
"We didn't want to have to take a bunch of data and train a machine learning model to design proteins," Krueger said. "We wanted to leverage existing, sufficiently accurate simulations to be able to design proteins at the level of those simulations."
The method leverages a traditional framework for training neural networks called gradient-based optimization to identify new protein sequences with efficiency and precision. The result is that the proteins the researchers designed are "differentiable": they are not best guesses predicted by AI but are instead based on molecular dynamics simulations that use real physics and take into account how proteins actually, dynamically behave in nature.
[2]
Researchers design intrinsically disordered proteins with tailored properties
Harvard John A. Paulson School of Engineering and Applied Sciences, Oct 7 2025
The research received federal support from the National Science Foundation AI Institute of Dynamic Systems, the Office of Naval Research, the Harvard Materials Research Science and Engineering Center, and the NSF-Simons Center for Mathematical and Statistical Analysis of Biology at Harvard.
Journal reference: Krueger, R. K., et al. (2025) Generalized design of sequence-ensemble-function relationships for intrinsically disordered proteins. Nature Computational Science. doi.org/10.1038/s43588-025-00881-y.
[3]
AI Learns to Design the Human Body's Most Elusive Proteins - Neuroscience News
Summary: A new machine learning method has achieved what even AlphaFold cannot: the design of intrinsically disordered proteins (IDPs), the shape-shifting biomolecules that make up nearly 30% of all human proteins. These unstable proteins play key roles in cellular communication, sensing, and disease, yet their ever-changing structures have defied traditional AI prediction models. Using automatic differentiation and physics-based simulations, scientists created an algorithm that can fine-tune amino acid sequences for specific functions. This breakthrough could transform synthetic biology, drug discovery, and our understanding of disorders like Parkinson's and cancer.
Author: Anne Manning
Source: Harvard
Contact: Anne Manning, Harvard
Original Research: Closed access. "Generalized design of sequence-ensemble-function relationships for intrinsically disordered proteins" by Ryan Krueger et al. Nature Computational Science
Abstract
The design of folded proteins has advanced substantially in recent years. However, many proteins and protein regions are intrinsically disordered and lack a stable fold, that is, the sequence of an intrinsically disordered protein (IDP) encodes a vast ensemble of spatial conformations that specify its biological function. This conformational plasticity and heterogeneity makes IDP design challenging. Here we introduce a computational framework for de novo design of IDPs through rational and efficient inversion of molecular simulations that approximate the underlying sequence-ensemble relationship. We highlight the versatility of this approach by designing IDPs with diverse properties and arbitrary sequence constraints. These include IDPs with target ensemble dimensions, loops and linkers, highly sensitive sensors of physicochemical stimuli, and binders to target disordered substrates with distinct conformational biases. Overall, our method provides a general framework for designing sequence-ensemble-function relationships of biological macromolecules.
Researchers develop a novel machine learning method to design intrinsically disordered proteins, overcoming limitations of current AI tools like AlphaFold. This breakthrough could revolutionize synthetic biology and disease research.
Researchers from Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) and Northwestern University have made a significant breakthrough in the field of protein design. They have developed a new machine learning method capable of designing intrinsically disordered proteins (IDPs) with tailored properties, a feat that has eluded even the most advanced AI tools like AlphaFold [1][2].
IDPs, which make up nearly 30% of all proteins expressed by the human genome, have been a persistent challenge in the field of structural biology. Unlike traditional proteins with fixed 3D structures, IDPs are constantly shifting and never settle into a fixed shape. This inherent instability makes them difficult to design from scratch, despite their crucial roles in biological functions such as cross-linking molecules, sensing, and signaling [1].
The research team, led by SEAS graduate student Ryan Krueger and former NSF-Simons QuantBio Fellow Krishna Shrinivas, developed a computational method powered by algorithms that perform "automatic differentiation." This technique allows for the automatic computation of derivatives, enabling the rational selection of protein sequences with desired behaviors or properties [2].
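To make the idea concrete, here is a minimal JAX sketch of gradient-based sequence selection. It is not the authors' code: the four-letter alphabet, the `RESIDUE_SCORE` values, and the `chain_property` function are hypothetical stand-ins for a real amino acid alphabet and a simulated observable. The sequence is relaxed into per-position probabilities so that `jax.grad` can report how a small change at any position shifts the property.

```python
import jax
import jax.numpy as jnp

# Hypothetical per-residue scores for a 4-letter toy alphabet (not real amino acid data).
RESIDUE_SCORE = jnp.array([-1.0, -0.3, 0.4, 1.2])

def chain_property(logits):
    """Toy stand-in for a simulated observable: mean expected score along the chain."""
    probs = jax.nn.softmax(logits, axis=-1)  # (length, alphabet) relaxed sequence
    return jnp.mean(probs @ RESIDUE_SCORE)

def loss(logits, target):
    """Squared distance between the current property and the desired one."""
    return (chain_property(logits) - target) ** 2

# Automatic differentiation: derivative of the loss with respect to every logit.
grad_loss = jax.jit(jax.grad(loss))

def design(target, length=12, steps=500, lr=2.0):
    """Plain gradient descent on the relaxed sequence."""
    logits = jnp.zeros((length, RESIDUE_SCORE.shape[0]))  # start from a uniform guess
    for _ in range(steps):
        logits = logits - lr * grad_loss(logits, target)
    return logits

logits = design(target=0.8)
print(float(chain_property(logits)))  # property of the optimized relaxed sequence
```

A real pipeline also needs a step that maps the relaxed probabilities back to discrete residues, which this sketch leaves out.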
Unlike AI-based methods that rely on best-guess predictions, the new approach leverages existing, sufficiently accurate simulations to design proteins. The researchers used molecular dynamics simulations based on real physics, taking into account how proteins actually behave dynamically in nature. The resulting proteins are described as "differentiable" because they are based on these simulations rather than on best-guess predictions [3].
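As a rough illustration of what differentiating through a physics simulation means, the JAX sketch below relaxes a toy one-dimensional bead-spring chain and then adjusts its parameters by gradient descent until the chain reaches a target end-to-end distance. Everything here is an assumption made for illustration: the `energy`, `simulate`, and `end_to_end` functions, the per-bond rest lengths standing in for sequence-dependent parameters, and the noise-free dynamics. It is not the force field or the simulation used in the paper.

```python
import jax
import jax.numpy as jnp

def energy(x, rest):
    """Harmonic bond energy of a 1D bead chain (toy model, not a real force field)."""
    bonds = x[1:] - x[:-1]
    return 0.5 * jnp.sum((bonds - rest) ** 2)

def simulate(rest, steps=200, dt=0.2):
    """Relax the chain with overdamped, noise-free dynamics: x <- x - dt * dU/dx."""
    x0 = jnp.arange(rest.shape[0] + 1, dtype=jnp.float32)  # beads start one unit apart

    def step(x, _):
        return x - dt * jax.grad(energy)(x, rest), None

    x_final, _ = jax.lax.scan(step, x0, None, length=steps)
    return x_final

def end_to_end(rest):
    """Observable: distance between the first and last bead after relaxation."""
    x = simulate(rest)
    return x[-1] - x[0]

def loss(rest, target):
    return (end_to_end(rest) - target) ** 2

# jax.grad propagates derivatives back through every simulation step.
grad_loss = jax.jit(jax.grad(loss))

rest = jnp.ones(5)  # hypothetical per-bond parameters, one per bond
for _ in range(50):
    rest = rest - 0.05 * grad_loss(rest, 3.5)  # move parameters toward the target

print(float(end_to_end(rest)))  # end-to-end distance of the designed chain
```

The design choice worth noting is that the loss is defined on the simulation output, so the optimizer never needs training data, only a simulation that can be differentiated.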
The ability to design IDPs with specific properties opens up new possibilities in various fields [1][2][3].
This breakthrough could transform our understanding of these mysterious biomolecules and potentially lead to new treatments for various diseases. The research, published in Nature Computational Science, represents a significant step forward in the field of protein design and structural biology [2].
Summarized by Navi