14 Sources
[1]
Which diseases will you have in 20 years? This AI makes accurate predictions
A new artificial intelligence (AI) tool can forecast a person's risk of developing more than 1,000 diseases, in some cases providing a prediction decades in advance. The model, called Delphi-2M, uses health records and lifestyle factors to estimate the likelihood a person will develop diseases such as cancer, skin diseases and immune conditions up to 20 years ahead of time. Although Delphi-2M was trained only on one data set from the United Kingdom, its multi-disease modelling could one day help clinicians to identify high-risk people, allowing for the early roll-out of preventive measures. The model is described in a study published today in Nature. The tool's ability to model multiple diseases in one go is "astonishing", says Stefan Feuerriegel, a computer scientist at the Ludwig Maximilian University of Munich in Germany, who has developed AI models for medical applications. "It can generate entire future health trajectories," he says. Researchers have already developed AI-based tools to predict a person's risk of developing certain conditions, including some cancers and cardiovascular disease. But most of these tools estimate the risk of only one disease, says study co-author Moritz Gerstung, a data scientist at the German Cancer Research Center in Heidelberg. "A health-care professional would have to run dozens of them to deliver a comprehensive answer," he says. To address this, Gerstung and his colleagues modified a type of large language model (LLM) called a generative pre-trained transformer (GPT), that forms the underpinning of AI chatbots such as ChatGPT. When asked a question, GPTs provide outputs that, according to their training on vast volumes of data, are statistically probable. The authors designed their modified LLM to forecast a person's likelihood of developing 1,258 diseases on the basis of their past medical history. 
The model also incorporates the person's age, sex, body mass index and health-related habits, such as tobacco use and alcohol consumption. The researchers trained Delphi-2M on data from 400,000 participants of the UK Biobank, a long-term biomedical monitoring study. For most diseases, Delphi-2M's predictions matched or exceeded the accuracy of those of current models that estimate the risk of developing a single illness. The tool also performed better than a machine-learning algorithm that uses biomarkers -- levels of specific molecules or compounds in the body -- to predict the risk of several diseases. "It worked astonishingly well," says Gerstung. Delphi-2M worked best when forecasting the trajectories of conditions that follow predictable patterns of progression, such as some types of cancer. The model calculated the probability of a person developing each illness for a time period of up to two decades, depending on the information included in their medical records. Gerstung and his colleagues tested Delphi-2M on health data from 1.9 million people in the Danish National Patient Registry, a national database that has tracked hospital admissions for almost half a century. The authors found that the model's predictions for people in the registry were only slightly less accurate than they were for participants in the UK Biobank. This demonstrates that the model could still make somewhat reliable predictions when it's applied to data sets from national health systems other than the one it trained on, says Gerstung. Delphi-2M is an "intriguing" contribution to the burgeoning field of modelling multiple diseases at once, but it has its limitations, says Degui Zhi, a bioinformatics researcher who develops AI models at the University of Texas Health Science Center at Houston. For instance, the UK Biobank data only captured participants' first brush with a disease. 
The number of times someone has had an illness is "important for the modelling of personal health trajectories", says Zhi. Gerstung and his colleagues will evaluate Delphi-2M's accuracy on data sets from several countries to expand its scope. "Thinking about how this information can be combined for developing even more precise algorithms will be important," he says.
[2]
New AI Tool Predicts Which of 1,000 Diseases Someone May Develop in 20 Years
A new artificial intelligence (AI) tool can forecast a person's risk of developing more than 1,000 diseases, in some cases providing a prediction decades in advance. The model, called Delphi-2M, uses health records and lifestyle factors to estimate the likelihood that a person will develop diseases such as cancer, skin diseases and immune conditions up to 20 years ahead of time. Although Delphi-2M was trained only on one data set from the United Kingdom, its multi-disease modelling could one day help clinicians to identify high-risk people, allowing for the early roll-out of preventive measures. The model is described in a study published today in Nature. The tool's ability to model multiple diseases in one go is "astonishing," says Stefan Feuerriegel, a computer scientist at the Ludwig Maximilian University of Munich in Germany, who has developed AI models for medical applications. "It can generate entire future health trajectories," he says. Researchers have already developed AI-based tools to predict a person's risk of developing certain conditions, including some cancers and cardiovascular disease. But most of these tools estimate the risk of only one disease, says study co-author Moritz Gerstung, a data scientist at the German Cancer Research Center in Heidelberg. "A health-care professional would have to run dozens of them to deliver a comprehensive answer," he says. To address this, Gerstung and his colleagues modified a type of large language model (LLM) called a generative pre-trained transformer (GPT), that forms the underpinning of AI chatbots such as ChatGPT. 
When asked a question, GPTs provide outputs that, according to their training on vast volumes of data, are statistically probable. The authors designed their modified LLM to forecast a person's likelihood of developing 1,258 diseases on the basis of their past medical history. The model also incorporates the person's age, sex, body mass index and health-related habits, such as tobacco use and alcohol consumption. The researchers trained Delphi-2M on data from 400,000 participants of the UK Biobank, a long-term biomedical monitoring study. For most diseases, Delphi-2M's predictions matched or exceeded the accuracy of those of current models that estimate the risk of developing a single illness. The tool also performed better than a machine-learning algorithm that uses biomarkers -- levels of specific molecules or compounds in the body -- to predict the risk of several diseases. "It worked astonishingly well," says Gerstung. Delphi-2M worked best when forecasting the trajectories of conditions that follow predictable patterns of progression, such as some types of cancer. The model calculated the probability of a person developing each illness for a time period of up to two decades, depending on the information included in their medical records. Gerstung and his colleagues tested Delphi-2M on health data from 1.9 million people in the Danish National Patient Registry, a national database that has tracked hospital admissions for almost half a century. The authors found that the model's predictions for people in the registry were only slightly less accurate than they were for participants in the UK Biobank. This demonstrates that the model could still make somewhat reliable predictions when it's applied to data sets from national health systems other than the one it trained on, says Gerstung. 
Delphi-2M is an "intriguing" contribution to the burgeoning field of modelling multiple diseases at once, but it has its limitations, says Degui Zhi, a bioinformatics researcher who develops AI models at the University of Texas Health Science Center at Houston. For instance, the UK Biobank data only captured participants' first brush with a disease. The number of times someone has had an illness is "important for the modelling of personal health trajectories," says Zhi. Gerstung and his colleagues will evaluate Delphi-2M's accuracy on data sets from several countries to expand its scope. "Thinking about how this information can be combined for developing even more precise algorithms will be important," he says.
[3]
New AI model predicts susceptibility to over 1,000 diseases
European scientists have developed a new artificial intelligence model, trained on large-scale health records, which can predict susceptibility to more than 1,000 diseases decades into the future. The generative AI system called Delphi-2M was built at the European Molecular Biology Laboratory in Cambridge, using "similar architecture to large language models but with key innovations to work with healthcare data", said Tom Fitzgerald of EMBL. Delphi was trained on anonymised medical records from 400,000 participants in UK Biobank. The researchers then tested the model successfully on data from 1.9 million patients in the Danish National Patient Registry. The predictions across more than 1,000 diseases generally matched the accuracy of existing tools that have a far narrower focus, such as the QRisk score for heart conditions. Results were published in Nature on Wednesday. "Our model is a proof of concept, showing that it's possible for AI to learn many of our long-term health patterns and use this information to generate meaningful predictions," said Ewan Birney, EMBL's interim executive director. "We were surprised at how well the model transferred from the UK to Denmark though it had never seen a single bit of Danish data." Developing Delphi into a forecasting tool suitable for routine clinical use with individual patients could take five to 10 years, added Birney, but it will be available much sooner to guide healthcare strategies. "Although it makes predictions for each individual, it can be very useful at the population level to forecast collective healthcare needs, how many people will suffer from particular diseases such as heart attacks, cancers or diabetes and what sort of treatment they need," said Moritz Gerstung, head of AI at the German Cancer Research Center in Heidelberg, another member of the Delphi team. The model gives the best predictions for conditions with consistent patterns of progression, including cardiovascular disease, diabetes and blood poisoning. 
It works less well for diseases with unpredictable external causes and for very rare congenital conditions. The researchers are now working to extend Delphi by also incorporating biological data about individuals' genes and proteins. But Birney said they were "very pleasantly surprised" by how well it performed with healthcare information alone, giving predictions as good as or better than other models that use genomics and proteomics. "I want to stress the power of the straightforward medical record," he added. The authors have patented some of the key ideas behind Delphi's prediction of the risk and timing of disease. "We are exploring whether there are commercialisation possibilities and how to do that with our respective institutions," said Birney. "This research looks to be a significant step towards scalable, interpretable, and -- most importantly -- ethically responsible form of predictive modelling in medicine," said Gustavo Sudre, professor of genomic neuroimaging and AI at King's College London, who was not involved in the project. "While the current version relies solely on anonymised clinical records, it is encouraging to see that the model architecture has been designed to accommodate richer data types, such as biomarkers, imaging and even genomics."
[4]
AI model maps lifetime disease risks to transform future healthcare planning
By Priyanjana Pramanik, MSc. Reviewed by Susha Cheriyedath, M.Sc. Sep 18 2025
A powerful new AI predicts how over 1,000 diseases may unfold across a person's life, opening doors for precision prevention, policy planning, and bias-aware healthcare innovation.
Study: Learning the natural history of human disease with generative transformers.
In a recent study published in the journal Nature, researchers developed a machine learning model that utilized large-scale health data to predict the progression of 1,256 distinct ICD-10 level 3 diseases based on patients' past medical histories. The model demonstrated predictive accuracy comparable to existing tools that analyze individual diseases. It showed potential for simulating future health trajectories over a period of up to two decades and provided insights into personalized health risks and comorbidities.
A need for complex disease models
Human disease progression involves periods of health, acute illness, and chronic conditions, often appearing as clusters of comorbidities influenced by genetics, lifestyle, and socioeconomic factors. Understanding these patterns is crucial for delivering personalized healthcare, providing lifestyle guidance, and implementing effective early screening programs. However, traditional algorithms are primarily designed for single diseases and cannot capture the complexity of over 1,000 recognized health conditions. This limitation becomes especially important in the context of aging populations, where the burden of illnesses such as cancer, diabetes, cardiovascular disease, and dementia is projected to rise significantly over the coming decades. Accurately modeling disease trajectories is therefore vital for both healthcare planning and economic policy. Artificial intelligence, particularly large language models (LLMs), provides a promising solution. 
These models excel at learning dependencies across sequences of data, much like predicting disease based on prior health events. Inspired by this analogy, researchers have developed transformer-based models for predicting specific conditions, with encouraging early results. Yet, despite these advances, a truly comprehensive and generative model capable of simulating the full spectrum of multimorbidity across time has not been systematically evaluated.
Developing a large-scale data model
The researchers created Delphi-2M, a transformer-based model, to predict lifetime disease trajectories. Unlike language models that process words, Delphi-2M worked with diagnostic codes from the tenth revision of the International Classification of Diseases (ICD-10), as well as death, sex, BMI, and lifestyle factors such as smoking and alcohol use. To address gaps in medical records, the team inserted artificial "no-event" tokens. The resulting vocabulary spanned disease codes, lifestyle levels, sex, no-event, and padding tokens (around 1,270 tokens in total). Training was based on large-scale health records from the UK Biobank, comprising 402,799 participants for training, 100,639 for validation, and 471,057 for longitudinal testing. To test generalizability, the model was also validated on data from 1.93 million Danish individuals. Several modifications tailored the base model to health data: replacing positional encoding with continuous age encoding, adding an output head to predict time-to-next event, and altering attention masks to prevent tokens at the same time point from influencing one another. Delphi-2M could estimate risks for more than 1,000 diseases, forecast the timing of diagnoses, and simulate complete health trajectories. 
Performance was optimized through hyperparameter tuning, resulting in a 2.2M parameter model that combined predictive accuracy with generative capacity, providing a novel approach to modeling multimorbidity and long-term health progression.
[Figure: overview of Delphi-2M. Panels: a, schematic of health trajectories based on ICD-10 diagnoses, lifestyle, and healthy padding tokens, each recorded at a distinct age; b, training, validation, and testing data from the UK Biobank and Danish disease registries; c, the Delphi model architecture, with changes relative to the underlying GPT-2 model marked in red; d, example model input (prompt) and output (samples) as (age: token) pairs; e, scaling laws showing optimal validation loss against model parameters for different training data sizes; f, ablation results as cross-entropy differences relative to an age- and sex-based baseline, by age; g, accuracy of predicted time to event, observed against expected.]
Evaluating the model's performance
Delphi-2M's performance was evaluated using health data up to age 60 from 63,622 participants in the UK Biobank. The model generated simulated health trajectories, which were then compared with observed outcomes. Predictions of disease rates at ages 70 and 75 closely matched observed patterns, confirming its ability to capture population-level incidence trends. While predictive accuracy declined over longer time horizons, from an average AUC of approximately 0.76 to about 0.70 at 10 years, Delphi-2M still outperformed models based only on age and sex. The model effectively distinguished risks across subgroups defined by lifestyle or previous illnesses, supporting its value for personalized risk profiling. 
Importantly, Delphi-2M could also generate synthetic health trajectories that mirrored real-world disease patterns without duplicating individual records. A model trained solely on this synthetic data retained much of the original's performance, showing only a three-point drop in AUC. This highlights potential applications for privacy-preserving research. To interpret predictions, researchers examined the embedding space, which revealed disease clusters consistent with ICD-10 chapters and showed how specific diagnoses shaped outcomes, such as the strong impact of pancreatic cancer on mortality. External validation on Danish data confirmed generalizability, with an average AUC of about 0.67, though with a modest performance drop. Finally, the study acknowledged its limitations, including biases in the UK Biobank recruitment process and patterns of missing data.
Conclusions
The study introduced Delphi-2M, a GPT-based model capable of predicting and simulating the progression of multiple diseases over time. Compared with single-disease or biomarker-based models, Delphi-2M showed strong accuracy in forecasting health risks across more than 1,000 conditions. For diabetes risk, however, it performed worse than the single-marker HbA1c approach. Its performance declined only modestly when tested on Danish data. Its ability to sample synthetic future trajectories allows estimation of long-term disease burdens and the creation of privacy-preserving datasets. The model also highlighted patterns of comorbidities and temporal influences of illnesses, such as persistent mortality risks from cancer, and achieved an AUC of about 0.97 for predicting death. However, several limitations were noted. Predictions reflected biases in UK Biobank data, including healthy volunteer effects, recruitment bias, and missingness patterns. Differences were also seen across ancestry and socioeconomic groups. 
Importantly, the model captures statistical associations but not causal relationships, which limits its direct clinical use. Overall, Delphi-2M demonstrates the promise of transformer-based models for personalized risk prediction, healthcare planning, and biomedical research. Future improvements may integrate multimodal data, support clinical decision-making, and aid policy development in ageing populations. Journal reference: Learning the natural history of human disease with generative transformers. Shmatko, A., Jung, A.W., Gaurav, K., Brunak, S., Mortensen, L.H., Birney, E., Fitzgerald, T., Gerstung, M. Nature (2025). DOI: 10.1038/s41586-025-09529-3, https://www.nature.com/articles/s41586-025-09529-3
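Two of the modifications this article describes, continuous age encoding and an attention mask that keeps tokens recorded at the same time point from influencing one another, can be illustrated with a small sketch. This is not the authors' implementation: the sinusoidal form, the function names, the example trajectory and the choice to let each token attend to itself are all assumptions made for illustration.

```python
import math

# Hypothetical trajectory of (age in years, token) pairs; token names are illustrative only.
TRAJECTORY = [
    (0.0, "male"),
    (45.2, "I10"),
    (45.2, "E11"),
    (52.0, "no_event"),
    (58.7, "I21"),
]


def age_encoding(age_years, dim=8, max_period=10000.0):
    """Sinusoidal encoding of a continuous age (assumed form), used instead of
    an encoding of the token's integer position in the sequence."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (max_period ** (i / dim))
        enc.append(math.sin(age_years * freq))
        enc.append(math.cos(age_years * freq))
    return enc


def attention_mask(ages):
    """mask[i][j] is True when token i may attend to token j.

    Assumption: a token sees itself and tokens recorded at a strictly earlier
    age; other tokens recorded at the same age are masked out.
    """
    n = len(ages)
    return [[(j == i) or (ages[j] < ages[i]) for j in range(n)] for i in range(n)]


ages = [age for age, _ in TRAJECTORY]
mask = attention_mask(ages)
```

In this example the two diagnoses recorded at age 45.2 cannot see each other, while the final token can see the whole history before it.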
[5]
AI can forecast your future health - just like the weather
The model is best at predicting diseases like type 2 diabetes, heart attacks and sepsis that have a clear disease progression, rather than more random events like infections. People are already offered a cholesterol-lowering statin based on a calculation of their risk of a heart attack or stroke. The AI tool is not ready for clinical use, but the plan is to use it in a similar way, to spot high-risk patients while there is an opportunity to intervene early and prevent disease. This could include medicines or specific lifestyle advice - such as people likely to develop some liver disorders benefitting from cutting back their alcohol intake more than the general population. The artificial intelligence could also help inform disease-screening programmes and analyse all the healthcare records in an area to anticipate demand - such as how many heart attacks a year there will be in Norwich in 2030, to help plan resources. "This is the beginning of a new way to understand human health and disease progression," said Prof Moritz Gerstung, head of the division of AI in oncology at DKFZ, the German Cancer Research Centre. He added: "Generative models such as ours could one day help personalise care and anticipate healthcare needs at scale."
[6]
AI model forecasts disease risk decades in advance
Researchers have developed a generative AI model that uses large-scale health records to estimate how human health may change over time. It can forecast the risk and timing of over 1,000 diseases and predict health outcomes over a decade in advance. This new generative AI model was custom-built using algorithmic concepts similar to those used in large language models (LLMs). It was trained on anonymized patient data from 400,000 participants from the UK Biobank. Researchers also successfully tested the model using data from 1.9 million patients in the Danish National Patient Registry. This approach is one of the most comprehensive demonstrations to date of how generative AI can model human disease progression at scale and was tested on data from two entirely separate health care systems. "Our AI model is a proof of concept, showing that it's possible for AI to learn many of our long-term health patterns and use this information to generate meaningful predictions," said Ewan Birney, Interim Executive Director at the European Molecular Biology Laboratory (EMBL). "By modeling how illnesses develop over time, we can start to explore when certain risks emerge and how best to plan early interventions. It's a big step towards more personalized and preventive approaches to health care." This work, published in the journal Nature, was a collaboration between EMBL, the German Cancer Research Center (DKFZ), and the University of Copenhagen.
AI for health forecasting
Just as large language models can learn the structure of sentences, this AI model learns the "grammar" of health data to model medical histories as sequences of events unfolding over time. These events include medical diagnoses or lifestyle factors such as smoking. The model learns to forecast disease risk from the order in which such events happen and how much time passes between these events. 
"Medical events often follow predictable patterns," said Tom Fitzgerald, Staff Scientist at EMBL's European Bioinformatics Institute (EMBL-EBI). "Our AI model learns those patterns and can forecast future health outcomes. It gives us a way to explore what might happen based on a person's medical history and other key factors. Crucially, this is not a certainty, but an estimate of the potential risks." The model performs especially well for conditions with clear and consistent progression patterns, such as certain types of cancer, heart attacks, and septicemia, which is a type of blood poisoning. However, the model is less reliable for more variable conditions, such as mental health disorders or pregnancy-related complications that depend on unpredictable life events. Future use and limitations Like weather forecasts, this new AI model provides probabilities, not certainties. It doesn't predict exactly what will happen to an individual, but it offers well-calibrated estimates of how likely certain conditions are to occur over a given period. For example, it could predict the chance of developing heart disease within the next year. These risks are expressed as rates over time, similar to forecasting a 70% chance of rain tomorrow. Generally, forecasts over a shorter period of time have higher accuracy than long-range ones. For example, the model predicts varying levels of risk for heart attacks. Taking the UK BioBank cohort at the age of 60-65, the risk of heart attack varies from a chance of four in 10,000 per year for some men to approximately one in 100 in other men, depending on their prior diagnoses and lifestyle. Women have a lower risk on average, but a similar spread of risk. Moreover, the risks increase, on average, as people age. A systematic assessment on data from the UK Biobank not used for training showed that these calculated risks correspond well to the observed number of cases across age and sex groups. 
The model is calibrated to produce accurate population-level risk estimates, forecasting how often certain conditions occur within groups of people. However, like any AI model, it has limitations. For example, because the model's training data from the UK Biobank comes primarily from individuals aged 40-60, childhood and adolescent health events are underrepresented. The model also contains demographic biases due to gaps in the training data, including the underrepresentation of certain ethnic groups. While the model isn't ready for clinical use, it could already help researchers:
* understand how diseases develop and progress over time,
* explore how lifestyle and past illnesses affect long-term disease risk,
* simulate health outcomes using artificial patient data, in situations where real-world data are difficult to obtain or access.
In the future, similar AI tools trained on more representative datasets could assist clinicians in identifying high-risk patients early. With aging populations and rising rates of chronic illness, being able to forecast future health needs could help health care systems plan better and allocate resources more efficiently. But much more testing, consultation, and robust regulatory frameworks are needed before AI models can be deployed in a clinical setting. "This is the beginning of a new way to understand human health and disease progression," said Moritz Gerstung, Head of the Division of AI in Oncology at DKFZ and former Group Leader at EMBL-EBI. "Generative models such as ours could one day help personalize care and anticipate health care needs at scale. By learning from large populations, these models offer a powerful lens into how diseases unfold, and could eventually support earlier, more tailored interventions." This AI model was trained using anonymized health data under strict ethical rules. 
UK Biobank participants gave informed consent, and Danish data were accessed in accordance with national regulations that require the data to remain within Denmark. Researchers used secure, virtual systems to analyze the data without moving them across borders. These safeguards help ensure that AI models are developed and used in ways that respect privacy and uphold ethical standards.
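The per-year heart attack rates quoted in this article (four in 10,000 and roughly one in 100) can be turned into a risk over a longer window with a short calculation. The sketch below assumes the rate stays constant over the window, which is a simplification: the article itself notes that risks rise with age, and the source does not say the model computes cumulative risk this way. The function name is hypothetical.

```python
import math


def cumulative_risk(annual_rate, years):
    """Probability of at least one event within `years`, assuming a constant
    event rate per year (a simplifying assumption, not the model's method)."""
    return 1.0 - math.exp(-annual_rate * years)


# Rates quoted for men aged 60-65 in the UK Biobank cohort.
low = cumulative_risk(4 / 10_000, 10)   # lowest-risk group, over 10 years
high = cumulative_risk(1 / 100, 10)     # highest-risk group, over 10 years
```

Under this assumption the ten-year risk comes out at roughly 0.4% for the lowest-risk men and about 9.5% for the highest-risk men.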
[7]
This new AI tool can predict your risk of 1,000+ diseases -- meet Delphi-2M
A group of researchers has unveiled an AI model that can predict a person's likelihood of developing more than 1,000 diseases, and the AI even estimates when those illnesses might strike. Described this week in Nature, the AI is code named Delphi-2M and the tool was trained on anonymized health data from nearly 2.3 million people in the UK and Denmark. This marks one of the largest efforts yet to use generative AI to map out the future of human health. Different than traditional health calculators that only cover specific conditions (e.g. heart disease, diabetes), Delphi-2M takes a holistic approach. This means that the AI tool actually simulates possible trajectories of your health over decades, forecasting sequences of complications, including illnesses, sleep patterns and other aspects affecting health. Built on the same technology behind chatbots like ChatGPT, what makes Delphi-2M different than an LLM is that it is designed to handle medical histories not text. Each diagnosis, demographic detail or lifestyle factor is encoded like a "token," letting the AI analyze the progression of disease in the same way a language model predicts and writes the next word. Key inputs include: Utilizing this vital, yet somewhat basic information, the model then predicts both the next disease a patient might face and the length of time until that disease will appear. In tests, it reached an average accuracy score (AUC) of 0.76 across hundreds of diseases in the UK dataset; a strong result given the complexity of human health. As mentioned in the study, when researchers asked Delphi-2M to generate synthetic health futures for people at age 60, the projections closely matched population-level outcomes a decade later. That suggests it could become a powerful tool for public health planning such as in identifying which diseases are likely to surge among the future generations. As with any AI, the technology isn't perfect and there are caveats. 
Accuracy dropped when applied to Danish data, showing the model isn't equally reliable across populations. Additionally, like all predictive AI, the model reflects the biases of the datasets it was trained on. For example, the UK Biobank data skews toward wealthier, healthier participants, which could distort risk estimates for underrepresented groups. It's important to remember that human oversight is absolutely necessary and AI is not a replacement for a human doctor. That's why researchers warn that Delphi-2M is not a diagnostic tool, at least not for the time being. Instead, researchers are thinking of it more as a useful forecasting engine for detecting general risks and planning preventive care. Predicting that you're at high risk for cancer at 72 doesn't mean it will happen, only that you resemble people who developed it in the training data. That said, the possibilities are striking. It's possible that more AI models like Delphi-2M could sit alongside existing health calculators, offering patients and doctors more personalized roadmaps of future risk, even surfacing actionable steps to delay or prevent illness. Although it's still research, the promise of AI-guided medicine comes with many questions. Can the same generative tech that allows ChatGPT or Claude to write code also be the same AI that predicts disease? Delphi-2M hints at a future where your doctor might use AI to scan decades of your potential health journey, helping you take preventive action long before symptoms appear.
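The AUC figure this article cites (0.76 on average) measures how often a model ranks a person who developed a disease above one who did not. The sketch below computes it by comparing every such pair, with ties counted as half. The scores and labels are made-up illustrative values, not data from the study, and the function is a generic pairwise definition rather than the authors' evaluation code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison: the fraction of
    (case, non-case) pairs in which the case has the higher score."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative values only: predicted risks and whether the disease occurred.
example_auc = auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking; here three of the four case/non-case pairs are ordered correctly.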
[8]
New AI tool can predict a person's risk of more than 1,000 diseases, say experts
Delphi-2M uses diagnoses, 'medical events' and lifestyle factors to create forecasts for next decade and beyond
Scientists have developed a new artificial intelligence tool that can predict your personal risk of more than 1,000 diseases, and forecast changes in health a decade in advance. The generative AI tool was custom-built by experts from the European Molecular Biology Laboratory (EMBL), the German Cancer Research Centre and the University of Copenhagen, using algorithmic concepts similar to those used in large language models (LLMs). It is one of the most comprehensive demonstrations to date of how generative AI can model human disease progression at scale, and was trained on data from two entirely separate healthcare systems. Details of the breakthrough were published in the journal Nature. "Medical events often follow predictable patterns," said Tomas Fitzgerald, a staff scientist at EMBL's European Bioinformatics Institute (EMBL-EBI). "Our AI model learns those patterns and can forecast future health outcomes." The tool works by assessing the probability of whether - and when - someone may develop diseases such as cancer, diabetes, heart disease, respiratory disease and many other disorders. Named Delphi-2M, it looks for "medical events" in a patient's history, such as when illnesses were diagnosed, together with lifestyle factors such as whether they are or were obese, smoked or drank alcohol, plus their age and sex. The tool also looks at anonymised patient record data to predict what might happen over the next decade and beyond. The tool was trained and tested on anonymised patient data from 400,000 people in the UK Biobank study and 1.9 million patients in the Danish national patient registry. Health risks are expressed as rates over time, similar to forecasting a 70% chance of rain at the weekend. Ewan Birney, the EMBL interim executive director, said patients might be able to benefit from the tool within the next few years. 
"You walk into the doctor's surgery and the clinician is very used to using these tools, and they are able to say: 'Here's four major risks that are in your future and here's two things you could do to really change that.' "I suspect everyone will be told to lose weight, and if you smoke you will be told to stop smoking - and that will be in your data so that advice isn't going to change remarkably - but for some diseases I think there will be some very specific things. That's the future we want to create." He said the advantage of the new AI tool over existing ones - such as the Qrisk method of calculating risk of having a heart attack or stroke over the next decade - was "we can do all diseases at once and over a long time period. That is the thing that single disease models can't do." The team said: "Delphi-2M predicts the rates of more than 1,000 diseases, conditional on each individual's past disease history, with accuracy comparable to that of existing single-disease models. "Delphi-2M's generative nature also enables sampling of synthetic future health trajectories, providing meaningful estimates of potential disease burden for up to 20 years." Prof Moritz Gerstung, the head of the division of AI in oncology at the German Cancer Research Centre, said: "This is the beginning of a new way to understand human health and disease progression. "Generative models such as ours could one day help personalise care and anticipate healthcare needs at scale."
[9]
Scientists train AI model to predict future illnesses
Scientists said Wednesday that they had created an AI model able to predict medical diagnoses years in advance, building on the same technology behind consumer chatbots like ChatGPT. Based on a patient's case history, the Delphi-2M AI "predicts the rates of more than 1,000 diseases" years into the future, the team from British, Danish, German and Swiss institutions wrote in a paper published in the journal Nature. Researchers trained the model on data from Britain's UK Biobank -- a large-scale biomedical research database with details on about half a million participants. Neural networks based on so-called "transformer" architecture -- the "T" in "ChatGPT" -- most famously tackle language-based tasks, as in the chatbot and its many imitators and competitors. But understanding a sequence of medical diagnoses is "a bit like learning the grammar in a text," German Cancer Research Center AI expert Moritz Gerstung told journalists. Delphi-2M "learns the patterns in health care data, preceding diagnoses, in which combinations they occur and in which succession", he said, enabling "very meaningful and health-relevant predictions". Gerstung presented charts suggesting the AI could single out people at far higher or lower risk of suffering a heart attack than their age and other factors would predict. The team verified Delphi-2M's performance by testing it against data from almost two million people in Denmark's public health database. But Gerstung and fellow team members stressed that the Delphi-2M tool needed further testing and was not yet ready for clinical use. "This is still a long way from improved health care as the authors acknowledge that both (British and Danish) datasets are biased in terms of age, ethnicity and current health care outcomes," commented health technology researcher Peter Bannister, a fellow at Britain's Institution of Engineering and Technology. 
But in future systems like Delphi-2M could help "guide the monitoring and possibly earlier clinical interventions for effectively a preventative type of medicine", Gerstung said. On a larger scale, such tools could help with "optimization of resources across a stretched health care system", European Molecular Biology Laboratory co-author Tom Fitzgerald said. Doctors in many countries already use computer tools to predict risk of disease, such as the QRISK3 program that British family doctors use to assess the danger of heart attack or stroke. Delphi-2M, by contrast, "can do all diseases at once and over a long time period", said co-author Ewan Birney. Gustavo Sudre, a King's College London professor specializing in medical AI, commented that the research "looks to be a significant step towards scalable, interpretable and -- most importantly -- ethically responsible predictive modeling". "Interpretable" or "explainable" AI is one of the top research goals in the field, as the full inner workings of many large AI models currently remain mysterious even to their creators.
[10]
New AI System Predicts Risk of 1,000 Diseases Years in Advance - Decrypt
Researchers have built an AI system that predicts your risk of developing more than 1,000 diseases up to 20 years before symptoms appear, according to a study published in Nature this week. The model, called Delphi-2M, achieved 76% accuracy for near-term health predictions and maintained 70% accuracy even when forecasting a decade into the future. It outperformed existing single-disease risk calculators while simultaneously assessing risks across the entire spectrum of human illness. "The progression of human disease across age is characterized by periods of health, episodes of acute illness and also chronic debilitation, often manifesting as clusters of co-morbidity," the researchers wrote. "Few algorithms are capable of predicting the full spectrum of human disease, which recognizes more than 1,000 diagnoses at the top level of the International Classification of Diseases, Tenth Revision (ICD-10) coding system." The system learned these patterns from 402,799 UK Biobank participants, then proved its mettle on 1.9 million Danish health records without any additional training. Before you start rubbing your hands with the idea of your own medical predictor, can you try Delphi-2M yourself? Not exactly. The trained model and its weights are locked behind UK Biobank's controlled access procedures -- meaning researchers only. The codebase for training your own version is on GitHub under an MIT license, so you could technically build your own model, but you'd need access to massive medical datasets to make it work. For now, this remains a research tool, not a consumer app. The technology works by treating medical histories as sequences -- much like ChatGPT processes text. Each diagnosis, recorded with the age it first occurred, becomes a token. The model reads this medical "language" and predicts what comes next. 
With the proper information and training, you can predict the next token (in this case, the next illness) and the estimated time before that "token" is generated (how long until you get sick if the most likely set of events occurs). For a 60-year-old with diabetes and high blood pressure, Delphi-2M might forecast a 19-fold increased risk of pancreatic cancer. Add a pancreatic cancer diagnosis to that history, and the model calculates mortality risk jumping nearly ten thousandfold. The transformer architecture behind Delphi-2M represents each person's health journey as a timeline of diagnostic codes, lifestyle factors like smoking and BMI, and demographic data. "No event" padding tokens fill the gaps between medical visits, teaching the model that the simple passage of time changes baseline risk. This is also similar to how normal LLMs can understand text even if they miss some words or even sentences. When tested against established clinical tools, Delphi-2M matched or exceeded their performance. For cardiovascular disease prediction, it achieved an AUC of 0.70 compared to 0.69 for AutoPrognosis and 0.71 for QRisk3. For dementia, it hit 0.81 versus 0.81 for UKBDRS. The key difference: those tools predict single conditions. Delphi-2M evaluates everything at once. Beyond individual predictions, the system generates entire synthetic health trajectories. Starting from age 60 data, it can simulate thousands of possible health futures, producing population-level disease burden estimates accurate to within statistical margins. One synthetic dataset trained a secondary Delphi model that achieved 74% accuracy -- just three percentage points below the original. The model revealed how diseases influence each other over time. Cancers increased mortality risk with a "half-life" of several years, while septicemia's effect dropped sharply, returning to near-baseline within months. 
Mental health conditions showed persistent clustering effects, with one diagnosis strongly predicting others in that category years later. The system does have boundaries. Its 20-year predictions drop to around 60-70% accuracy in general, though the figure depends on the type of disease being forecast. "For 97% of diagnoses, the AUC was greater than 0.5, indicating that the vast majority followed patterns with at least partial predictability," the study says, adding later on that "Delphi-2M's average AUC values decrease from an average of 0.76 to 0.70 after 10 years," and that "In the first year of sampling, there are on average 17% disease tokens that are correctly predicted, and this drops to less than 14% 20 years later." In other words, this model is quite good at predicting things under relevant scenarios, but a lot can change in 20 years, so it's not Nostradamus. Rare diseases and highly environmental conditions prove harder to forecast. The UK Biobank's demographic skew -- mostly white, educated, relatively healthy volunteers -- introduces bias that the researchers acknowledge needs addressing. Danish validation revealed another limitation: Delphi-2M learned some UK-specific data collection quirks. Diseases recorded primarily in hospital settings appeared artificially inflated, contradicting the Danish registry data. The model predicted septicemia at eight times the normal rate for anyone with prior hospital data, partly because 93% of UK Biobank septicemia diagnoses came from hospital records. The researchers trained Delphi-2M using a modified GPT-2 architecture with 2.2 million parameters -- tiny compared to modern language models but sufficient for medical prediction. Key modifications included continuous age encoding instead of discrete position markers and an exponential waiting time model to predict when events would occur, not just what would happen. 
Each health trajectory in the training data contained an average of 18 disease tokens spanning birth to age 80. Sex, BMI categories, smoking status, and alcohol consumption added context. The model learned to weigh these factors automatically, discovering that obesity increased diabetes risk while smoking elevated cancer probabilities -- relationships that medicine has long established but that emerged without explicit programming. It's truly an LLM for health conditions. For clinical deployment, several hurdles remain. The model needs validation across more diverse populations -- for example, the lifestyles and habits of people from Nigeria, China, and America can be very different, making the model less accurate. Also, privacy concerns around using detailed health histories require careful handling. Integration with existing healthcare systems poses technical and regulatory challenges. But the potential applications span from identifying screening candidates who don't meet age-based criteria to modeling population health interventions. Insurance companies, pharmaceutical firms, and public health agencies may have obvious interests. Delphi-2M joins a growing family of transformer-based medical models. Some examples include Harvard's PDGrapher tool for predicting gene-drug combinations that could reverse diseases such as Parkinson's or Alzheimer's, an LLM specifically trained on protein connections, Google's AlphaGenome model trained on DNA pairs, and others. What makes Delphi-2M so interesting and different is its broad scope of action, the sheer breadth of diseases covered, its long prediction horizon, and its ability to generate realistic synthetic data that preserves statistical relationships while protecting individual privacy. In other words: "How long do I have?" may soon be less a rhetorical question and more a predictable data point.
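The exponential waiting time idea mentioned in the article can be illustrated with a small Python sketch. This is not the authors' code: the event names and yearly rates are invented for illustration, and treating each possible event as an independent exponential "clock" is an assumption about how "what happens next" and "when" could be drawn together from per-event rates.

```python
import math
import random


def prob_within(rate_per_year, years):
    """Chance that an event with a constant yearly rate occurs within `years`."""
    return 1.0 - math.exp(-rate_per_year * years)


def sample_next_event(rates, rng):
    """Draw the next event and the waiting time until it.

    `rates` maps a (hypothetical) event name to its rate per year.
    With independent exponential clocks, the time to the first event is
    exponential with the summed rate, and each event comes first with
    probability proportional to its own rate.
    """
    names = list(rates)
    total_rate = sum(rates[name] for name in names)
    wait_years = rng.expovariate(total_rate)
    event = rng.choices(names, weights=[rates[n] for n in names], k=1)[0]
    return event, wait_years


# Illustrative rates only; these are not values from the study.
example_rates = {"type_2_diabetes": 0.02, "heart_attack": 0.01, "no_event": 0.5}

rng = random.Random(0)
event, wait_years = sample_next_event(example_rates, rng)
print(f"next event: {event}, after {wait_years:.2f} years")
print(f"10-year heart attack risk: {prob_within(0.01, 10):.3f}")
```

Under these assumptions, a constant rate of 0.01 per year corresponds to roughly a 9.5% chance of the event within ten years, which is how a yearly rate can be read as a longer-horizon forecast.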
[11]
New AI tool can predict risk of more than 1,000 medical conditions
The model can accurately predict conditions such as diabetes and cancer, and is less reliable for issues such as infectious diseases and mental health disorders. Scientists say they have created a new artificial intelligence (AI) tool that can predict people's risk of developing more than 1,000 medical conditions. The model can forecast whether a patient will develop certain cancers, heart attacks, and other conditions more than a decade before they are formally diagnosed, according to the study, which was published in the journal Nature on Wednesday. "This is the beginning of a new way to understand human health and disease progression," Moritz Gerstung, head of the AI in oncology team at the German Cancer Research Centre (DKFZ), said in a statement. As AI advances into the medical field, the researchers said the new tool is one of the largest examples to date of how AI could be used to help patients. The model was trained using anonymised data from 400,000 people in the United Kingdom and then tested using data from 1.9 million people in Denmark. It learned to identify patterns that, over time, tend to culminate in serious health conditions - for example, using previous diagnoses and smoking histories. Considering both the order of these events and the amount of time between them, the model could then predict a patient's risk of developing various diseases. The researchers stressed that the model's predictions do not mean a patient will definitely get sick - rather that they are at higher risk. They compared it to a weather forecast. They also said that the model is more accurate with conditions with "consistent progression patterns," such as certain forms of cancer, diabetes, heart attacks, and a type of blood poisoning known as septicaemia. It is also generally more accurate over the near-term than the long-term. 
However, the model is less reliable for issues that are harder to anticipate, such as mental health problems, infectious diseases, and pregnancy-related complications. The model is proof that AI can "learn many of our long-term health patterns and use this information to generate meaningful predictions," said Ewan Birney, interim head of the European Molecular Biology Laboratory (EMBL). In a statement, Birney called the model "a big step towards more personalised and preventive approaches to healthcare". EMBL and DKFZ worked with the University of Copenhagen to build the model. The study authors said the AI tool is not yet ready to be used in doctors' offices, but that it can already help researchers understand how diseases develop over time and how people's lifestyles and medical histories affect their health risks. However, independent experts noted that the model was trained and tested on UK and Danish datasets that are biased in terms of age, ethnicity, and health outcomes. More work would be needed to ensure the model can accurately predict disease risk for more diverse groups of people. The researchers said that in the future, the tool could help physicians identify high-risk patients before they fall ill. Gerstung said the tool could "eventually support earlier, more tailored interventions".
[12]
Researchers detail Delphi-2M, an AI model capable of predicting over 1,000 diseases - SiliconANGLE
Researchers in Europe unveiled a new artificial intelligence model Wednesday that can forecast a person's risk of developing over 1,000 diseases more than a decade in advance. The generative AI tool, called Delphi-2M after the ancient Greek oracle, was custom-built by scientists from the European Molecular Biology Laboratory, the German Cancer Research Centre and the University of Copenhagen. Outlined in a Nature paper, the new model was trained solely on a dataset from the United Kingdom, which included nearly half a million individuals, and was validated using data from approximately 2 million participants in Denmark. It functions by analyzing a person's history of "medical events" to evaluate the likelihood of developing specific illnesses. "Medical events often follow predictable patterns," said Tomas Fitzgerald, a staff scientist at EMBL's European Bioinformatics Institute. "Our AI model learns those patterns and can forecast future health outcomes." A wide range of prediction algorithms and AI models already exist for specific diseases, for example, cardiovascular disease and cancer. They use broad population data that includes genetics, lifestyle, socioeconomic status, region and more to predict the probability that a particular illness will arise in a person or population. Researchers have noted that very few models can predict a broad spectrum of human illnesses. To address this challenge, Delphi-2M was developed. Predicting disease progression is fundamental for preventative healthcare and serves as an anchor for serving vulnerable populations and patients, especially those who are aging. For example, the probability of the aforementioned diseases of the heart and cancer increases as people age. 
The "Health inequalities 2040" report from The Health Foundation, in the UK, projected that the number of working-age people with major illnesses such as depression, asthma, heart disease and dementia will increase from 3 million to 3.7 million by 2040, concentrated in more deprived areas. The researchers modified a type of large language model called a generative pre-trained transformer, the same kind that underpins the models running well-known chatbots such as OpenAI's ChatGPT. Transformers ingest vast amounts of data to generate predictive outputs based on conditions presented to them. In the case of a chatbot, that would be a question or a statement; for Delphi-2M, that's the past medical history of an individual. According to the researchers, the model also incorporates age, sex, body mass index and health-related habits, such as tobacco use and alcohol consumption. The key distinguishing feature between Delphi and basic GPT models is that the AI algorithm can calculate absolute rates, providing consistent estimates. From testing, Delphi-2M's predictions matched or exceeded the accuracy of current models for most diseases. It also proved to be superior to the Qrisk method, a prediction algorithm used to calculate the risk of having a heart attack or stroke over the next decade. The model works best on diseases that follow a predictable pattern of progression, such as certain types of cancer. It works by calculating the probability of a person developing a particular illness for up to 20 years.
[13]
New AI model could predict risk of disease a decade before symptoms
A new AI model could predict a patient's risk of more than 1,000 diseases a decade before they actually develop symptoms, according to new results. The scientists who created the tool hope it could be ready for GPs and other doctors to use within 5 to 10 years, giving patients an early warning that they need to change their lifestyles or perhaps start medical intervention. Ewan Birney, who led the team at the European Molecular Biology Laboratory in Cambridge, said it was "one of the most exciting bits of science I have been involved in - it's really cool." The AI tool, called Delphi, was trained on the anonymised medical records of 400,000 people who have signed up to the UK Biobank research database. It learned how their medical history changed over time, picking out patterns that were associated with later diseases. The tool was then unleashed on the records of 1.9 million patients in the Danish National Patient Registry and was able to make "meaningful" predictions of the risk and timing of over 1,000 diseases, according to results published in the journal Nature. "By modelling how illnesses develop over time, we can start to explore when certain risks emerge and how best to plan early interventions," said Mr Birney. "It's a big step towards more personalised and preventive approaches to healthcare." The tool in effect gives a 'health forecast', much like a weather app. Long-term predictions of a disease - or the chance of rain - are uncertain, though still useful. But short-term assessments of risk are more accurate. The research highlighted how the likelihood of disease can vary widely across the population. Delphi showed that the risk of a heart attack in men aged between 60 and 65 ranged from 1 in 100 per year in some to 1 in 2,500 per year in others. 
Women were less likely to have a heart attack on average, but there was a similar spread of risk. The tool would need to be carefully assessed before it was rolled out for clinical use. Moritz Gerstung, from the German Cancer Research Centre, which collaborated in the study, said some patients could become "fatalistic" after discovering they have a higher than average chance of a disease. "There is an element of psychology that needs to be brought into an assessment of how such tools are used in the future," he said. There is huge interest in the ability of AI to spot patterns in health data. Pharmaceutical company AstraZeneca last year published results from an AI model that looked at data from routine GP visits, such as blood pressure checks and urine tests, as well as 3,000 proteins found in the blood. It predicted the risk of 121 diseases with "exceptional" accuracy up to 20 years in advance, the researchers said.
[14]
Delphi-2M AI predicts 1000+ diseases using over 400k medical records
The model uses a large-scale medical dataset to forecast the likelihood of more than 1,000 conditions simultaneously. Researchers at the German Cancer Research Center have developed an artificial intelligence model, Delphi-2M, that can predict an individual's risk for more than 1,000 diseases up to two decades into the future using medical records. This development aligns with a broader shift in healthcare from reactive treatment to proactive prevention. While algorithms have been created to predict the risk of single conditions, diseases are often interconnected. A comprehensive model that can account for this complexity could inform early treatment, improve targeted screening, and identify high-risk individuals who might otherwise be overlooked. The Delphi-2M model is a large language model (LLM), similar to the technology behind text-generating chatbots. Instead of being trained on internet text, it was developed by processing over 400,000 comprehensive medical records from the UK Biobank. This clinical data was supplemented with lifestyle information, such as body mass index and smoking status. The model treats a patient's medical history as a sequence of "disease tokens," where each diagnostic code represents a step in a potential disease progression. By analyzing these sequences, the AI learns the statistical patterns of how different conditions connect and follow one another over time. A key feature is its ability to dynamically re-evaluate predictions. When new information, like a recent blood test result, is added, the model can update its risk calculations for that individual, allowing for continuous health monitoring. In performance evaluations, Delphi-2M matched or exceeded the accuracy of established clinical risk scores for the majority of the 1,258 diseases it was trained on. It also outperformed other specialized medical AI predictors designed to forecast single diseases. 
The model proved particularly effective in predicting the long-range risk of cardiovascular disease and dementia, showing greater accuracy than some biomarker-based models even when forecasting two decades into the future. However, the model struggled to accurately predict conditions with more variable trajectories heavily influenced by lifestyle changes, such as Type 2 diabetes. This indicates a limitation in its ability to account for factors not consistently captured in electronic health records. To test its robustness, the researchers applied the model to the Danish National Patient Registry, which contains records for nearly two million citizens. Despite differences in the populations and healthcare systems, the model's prediction accuracy remained high, suggesting it learned fundamental principles of human disease progression. Delphi-2M was designed with practical and ethical considerations in mind. It can learn from synthetic medical records to protect patient privacy and is an "explainable" AI, meaning it can provide a rationale for its predictions by clustering related conditions and symptoms. The researchers emphasize that the model identifies statistical associations, not causation. The model is built with a modular design to incorporate additional data types in the future, such as genomics, diagnostic imaging, and data from wearable devices. Currently, the tool is being tested in other countries with diverse populations. In its present form, it could be used in clinical settings to identify individuals who would benefit from early screening, even if they do not meet traditional criteria. The model has been positively received by experts not involved in the study. Justin Stebbing, a professor at Anglia Ruskin University, called the tool "an achievement" that sets "a new standard for both predictive accuracy and interpretability." Gustavo Sudre, a researcher at King's College London, described the research as:
Researchers have developed an AI tool called Delphi-2M that can forecast a person's risk of developing more than 1,000 diseases up to 20 years in advance. The model uses health records and lifestyle factors to estimate disease likelihood, potentially revolutionizing preventive healthcare.
Researchers have developed a groundbreaking artificial intelligence (AI) tool called Delphi-2M that can forecast a person's risk of developing more than 1,000 diseases up to 20 years in advance [1][2]. This innovative model, described in a study published in Nature, uses health records and lifestyle factors to estimate the likelihood of a person developing various conditions, including cancer, skin diseases, and immune disorders [1].
Delphi-2M is based on a modified version of a large language model (LLM) called a generative pre-trained transformer (GPT), similar to those used in AI chatbots like ChatGPT [1]. The model was trained on data from 400,000 participants in the UK Biobank, a long-term biomedical monitoring study [2].
The AI tool incorporates various factors in its predictions, including a person's past medical history, age, sex, body mass index, and health-related habits such as tobacco use and alcohol consumption.
For most diseases, Delphi-2M's predictions matched or exceeded the accuracy of current models that estimate the risk of developing a single illness [1]. The tool performed particularly well when forecasting conditions with predictable progression patterns, such as certain types of cancer [2].
To test its generalizability, researchers applied Delphi-2M to health data from 1.9 million people in the Danish National Patient Registry [1]. The model's predictions for the Danish population were only slightly less accurate than for the UK Biobank participants, demonstrating its potential for use across different national health systems [2].
The multi-disease modeling capabilities of Delphi-2M could revolutionize preventive healthcare and resource planning [3]. Some potential applications include:
While promising, Delphi-2M has some limitations. For instance, the UK Biobank data captured only participants' first encounter with a disease, which may affect the modeling of personal health trajectories [2]. Additionally, the model is less effective at predicting diseases with unpredictable external causes and very rare congenital conditions [3].
Researchers are now working to extend Delphi-2M by incorporating biological data about individuals' genes and proteins [3]. They also plan to evaluate the model's accuracy on datasets from several countries to expand its scope [2].
As Delphi-2M moves closer to potential clinical use, ethical considerations and responsible implementation will be crucial [5]. The model's developers emphasize the importance of using this technology to enhance personalized care and anticipate healthcare needs at scale, while also ensuring patient privacy and data protection [3].
With further development and validation, Delphi-2M could transform healthcare planning and delivery, ushering in a new era of precision medicine and proactive health management [5].
Summarized by Navi
[2]
[3]