2 Sources
[1]
AI-generated sensors open new paths for early cancer detection
Caption: Nanoparticles coated with AI-generated peptides can act as sensors that signal if cancer-linked proteases are present in the body.
Detecting cancer in the earliest stages could dramatically reduce cancer deaths because cancers are usually easier to treat when caught early. To help achieve that goal, MIT and Microsoft researchers are using artificial intelligence to design molecular sensors for early detection. The researchers developed an AI model to design peptides (short proteins) that are targeted by enzymes called proteases, which are overactive in cancer cells. Nanoparticles coated with these peptides can act as sensors that give off a signal if cancer-linked proteases are present anywhere in the body. Depending on which proteases are detected, doctors would be able to diagnose the particular type of cancer that is present. These signals could be detected using a simple urine test that could even be done at home. "We're focused on ultra-sensitive detection in diseases like the early stages of cancer, when the tumor burden is small, or early on in recurrence after surgery," says Sangeeta Bhatia, the John and Dorothy Wilson Professor of Health Sciences and Technology and of Electrical Engineering and Computer Science at MIT, and a member of MIT's Koch Institute for Integrative Cancer Research and the Institute for Medical Engineering and Science (IMES). Bhatia and Ava Amini '16, a principal researcher at Microsoft Research and a former graduate student in Bhatia's lab, are the senior authors of the study, which appears today in Nature Communications. Carmen Martin-Alonso PhD '23, a founding scientist at Amplifyer Bio, and Sarah Alamdari, a senior applied scientist at Microsoft Research, are the paper's lead authors.
Amplifying cancer signals
More than a decade ago, Bhatia's lab came up with the idea of using protease activity as a marker of early cancer.
The human genome encodes about 600 proteases, which are enzymes that can cut through other proteins, including structural proteins such as collagen. They are often overactive in cancer cells, as they help the cells escape their original locations by cutting through proteins of the extracellular matrix, which normally holds cells in place. The researchers' idea was to coat nanoparticles with peptides that can be cleaved by a specific protease. These particles could then be ingested or inhaled. As they traveled through the body, if they encountered any cancer-linked proteases, the peptides on the particles would be cleaved. Those peptides would be secreted in the urine, where they could be detected using a paper strip similar to a pregnancy test strip. Measuring those signals would reveal the overactivity of proteases deep within the body. "We have been advancing the idea that if you can make a sensor out of these proteases and multiplex them, then you could find signatures of where these proteases were active in diseases. And since the peptide cleavage is an enzymatic process, it can really amplify a signal," Bhatia says. The researchers have used this approach to demonstrate diagnostic sensors for lung, ovarian, and colon cancers. However, in those studies, the researchers used a trial-and-error process to identify peptides that would be cleaved by certain proteases. In most cases, the peptides they identified could be cleaved by more than one protease, which meant that the signals that were read could not be attributed to a specific enzyme. Nonetheless, using "multiplexed" arrays of many different peptides yielded distinctive sensor signatures that were diagnostic in animal models of many different types of cancer, even if the precise identity of the proteases responsible for the cleavage remained unknown. 
In their new study, the researchers moved beyond the traditional trial-and-error process by developing a novel AI system, named CleaveNet, to design peptide sequences that could be cleaved efficiently and specifically by target proteases of interest. Users can prompt CleaveNet with design criteria, and CleaveNet will generate candidate peptides likely to fit those criteria. In this way, CleaveNet enables users to tune the efficiency and specificity of peptides generated by the model, opening a path to improving the sensors' diagnostic power. "If we know that a particular protease is really key to a certain cancer, and we can optimize the sensor to be highly sensitive and specific to that protease, then that gives us a great diagnostic signal," Amini says. "We can leverage the power of computation to try to specifically optimize for these efficiency and selectivity metrics." For a peptide that contains 10 amino acids, there are about 10 trillion possible combinations. Using AI to search that immense space allows for prediction, testing, and identification of useful sequences much faster than humans would be able to find them, while also considerably reducing experimental costs.
Predicting enzyme activity
To create CleaveNet, the researchers developed a protein language model to predict the amino acid sequences of peptides, analogous to how large language models can predict sequences of text. For the training data, they used publicly available data on about 20,000 peptides and their interactions with different proteases from a family known as matrix metalloproteinases (MMPs). Using these data, the researchers trained one model to generate peptide sequences that are predicted to be cleaved by proteases. These sequences could then be fed into another model that predicted how efficiently each peptide would be cleaved by any protease of interest.
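As a rough illustration of the two ideas above, the size of the search space and the generate-then-score workflow, here is a minimal Python sketch. It is not CleaveNet and does not use its code or data. It assumes the "about 10 trillion" figure counts sequences over the 20 standard amino acids, which the article does not state explicitly. The generator and the scoring function are hypothetical stand-ins (uniform random peptides and a meaningless pseudo-score), the off-target names MMP2 and MMP9 are illustrative rather than taken from the article, and "selectivity" is defined here, as an assumption, as the target score minus the highest off-target score.

```python
import random

# The 20 standard amino acids, one-letter codes. Assumption: the article's
# "about 10 trillion" figure counts sequences over this alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def search_space_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct peptides of the given length over the alphabet."""
    return len(alphabet) ** length


def generate_candidates(n: int, length: int, rng: random.Random) -> list[str]:
    """Hypothetical stand-in for a generator model: uniform random peptides."""
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
            for _ in range(n)]


def toy_cleavage_score(peptide: str, protease: str) -> float:
    """Hypothetical stand-in for a predictor model.

    Returns a repeatable pseudo-score in [0, 1) derived from the inputs.
    It carries no biological meaning whatsoever.
    """
    return random.Random(f"{protease}:{peptide}").random()


def rank_by_selectivity(peptides, target, off_targets):
    """Sort peptides by (target score - best off-target score), descending.

    This selectivity definition is an assumption made for the sketch.
    """
    ranked = []
    for peptide in peptides:
        on_target = toy_cleavage_score(peptide, target)
        off_target = max(toy_cleavage_score(peptide, p) for p in off_targets)
        ranked.append((on_target - off_target, peptide))
    ranked.sort(reverse=True)
    return ranked


if __name__ == "__main__":
    # 20 choices at each of 10 positions: 20**10 = 10,240,000,000,000
    print(f"{search_space_size(10):,}")

    rng = random.Random(0)
    candidates = generate_candidates(1000, 10, rng)
    # MMP13 is the target named in the article; MMP2 and MMP9 are
    # illustrative off-targets chosen for this sketch only.
    top = rank_by_selectivity(candidates, "MMP13", ["MMP2", "MMP9"])[:3]
    for gap, peptide in top:
        print(peptide, round(gap, 3))
```

Even at a million peptides scored per second, exhaustively covering 20^10 sequences would take roughly four months, which is one way to see why generating a small set of promising candidates and then scoring only those is attractive.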
To demonstrate this approach, the researchers focused on a protease called MMP13, which cancer cells use to cut through collagen and help them metastasize from their original locations. Prompting CleaveNet with MMP13 as a target allowed the models to design peptides that could be cut by MMP13 with considerable selectivity and efficiency. This cleavage profile is particularly useful for diagnostic and therapeutic applications. "When we set the model up to generate sequences that would be efficient and selective for MMP13, it actually came up with peptides that had never been observed in training, and yet these novel sequences did turn out to be both efficient and selective," Martin-Alonso says. "That was very exciting to see." This kind of selectivity could help to reduce the number of different peptides needed to diagnose a given type of cancer, to identify novel biomarkers, and to provide insight into specific biological pathways for study and therapeutic testing, the researchers say. Bhatia's lab is currently part of an ARPA-H funded project to create reporters for an at-home diagnostic kit that could potentially detect and distinguish between 30 different types of cancer, in early stages of disease, based on measurements of protease activity. These sensors could include detection of not only MMP-mediated cleavage, but other enzymes such as serine proteases and cysteine proteases. Peptides designed using CleaveNet could also be incorporated into cancer therapeutics such as antibody treatments. Using a specific peptide to attach a therapeutic such as a cytokine or small molecule drug to a targeting antibody could enable the medicine to be released only when the peptides are exposed to proteases in the tumor environment, improving efficacy and reducing side effects. 
Beyond direct applications in diagnostics and therapeutics, combining efforts from the ARPA-H work with this modeling framework could enable the creation of a comprehensive "protease activity atlas" that spans multiple protease classes and cancers. Such a resource could further accelerate research in early cancer detection, protease biology, and AI models for peptide design. The research was funded by La Caixa Foundation, the Ludwig Center at MIT, and the Marble Center for Cancer Nanomedicine.
[2]
AI-designed molecular sensors could enable early detection of cancer
Massachusetts Institute of Technology, Jan 6 2026
Massachusetts Institute of Technology
Journal reference: https://www.nature.com/articles/s41467-025-67226-1
Researchers from MIT and Microsoft created CleaveNet, an AI model that designs peptide-based molecular sensors to detect cancer in its earliest stages. The system generates peptides targeted by cancer-linked proteases, enabling diagnosis through a simple urine test that could be performed at home.

Researchers from MIT and Microsoft have developed CleaveNet, an artificial intelligence system that designs molecular sensors capable of detecting cancer in its earliest stages. The breakthrough, published in Nature Communications, addresses a critical challenge in oncology: identifying tumors when they are small and most treatable. Sangeeta Bhatia, the John and Dorothy Wilson Professor of Health Sciences and Technology at MIT and member of the Koch Institute for Integrative Cancer Research, leads the research alongside Ava Amini, a principal researcher at Microsoft Research. The team's approach centers on using AI-generated sensors that target cancer-linked proteases, enzymes that become overactive in malignant cells [1].

The innovation builds on more than a decade of work exploring protease activity as a biomarker for early cancer detection. The human genome encodes approximately 600 proteases, which function as enzymes that cut through structural proteins like collagen. In cancer cells, these proteases help tumors escape their original locations by breaking down the extracellular matrix that normally anchors cells in place. By coating nanoparticles with specially designed peptides that can be cleaved by specific proteases, the researchers created sensors that travel through the body and release detectable signals when they encounter cancerous activity [2].

Previous research demonstrated diagnostic sensors for lung, ovarian, and colon cancers, but relied on a trial-and-error process to identify suitable peptides. This traditional approach had limitations: most identified peptides could be cleaved by multiple proteases, making it difficult to attribute signals to specific enzymes. While multiplexed arrays of different peptides produced distinctive signatures in animal models, the precise identity of responsible proteases remained unclear. CleaveNet changes this dynamic by enabling researchers to design peptide sequences that are cleaved efficiently and specifically by target proteases of interest [1].

The AI system allows users to input design criteria, and CleaveNet generates candidate peptides matching those specifications. This capability enables fine-tuning of both efficiency and specificity, directly improving diagnostic power. "If we know that a particular protease is really key to a certain cancer, and we can optimize the sensor to be highly sensitive and specific to that protease, then that gives us a great diagnostic signal," explains Amini. The computational approach addresses an enormous challenge: for a peptide containing just 10 amino acids, there are approximately 10 trillion possible combinations. Using AI to navigate this vast space accelerates discovery while substantially reducing experimental costs [2].

The practical application of this technology involves ingesting or inhaling nanoparticles coated with the AI-designed peptides. As these particles circulate through the body, they encounter proteases. When cancer-linked proteases are present, they cleave the peptides from the nanoparticles. These cleaved peptides are then secreted in urine, where they can be detected using a paper strip similar to a pregnancy test. "We're focused on ultra-sensitive detection in diseases like the early stages of cancer, when the tumor burden is small, or early on in recurrence after surgery," says Bhatia. The peptide cleavage process is enzymatic, which means it amplifies signals from deep within the body, making detection possible even when tumor burden is minimal [1].

Depending on which proteases are detected, physicians could diagnose the particular type of cancer present. This specificity represents a significant advance over previous methods. The potential for at-home testing could democratize access to early cancer screening, particularly for populations with limited healthcare access. The research team includes Carmen Martin-Alonso, a founding scientist at Amplifyer Bio, and Sarah Alamdari, a senior applied scientist at Microsoft Research, as lead authors. Their work demonstrates how protein language models can predict amino acid sequences optimized for specific diagnostic applications [2].

The development of CleaveNet represents a shift from reactive to proactive cancer detection strategies. By enabling ultra-sensitive detection when tumors are small, the technology could catch cancers at stages when treatment success rates are highest. The system's ability to identify specific cancer types through protease signatures also means more targeted treatment approaches from the outset. Short-term implications include potential clinical trials to validate the urine test in human populations, while longer-term prospects involve integration into routine screening protocols. The collaboration between MIT and Microsoft highlights how computational power and biological expertise converge to address complex medical challenges. Watch for developments in regulatory approval pathways and expansion to additional cancer types beyond lung, ovarian, and colon cancers already demonstrated in animal models.
Summarized by Navi