6 Sources
[1]
Microsoft Says Its New AI System Diagnosed Patients 4 Times More Accurately Than Human Doctors
The tech giant poached several top Google researchers to help build a powerful AI tool that can diagnose patients and potentially cut health care costs. Microsoft has taken "a genuine step towards medical superintelligence," says Mustafa Suleyman, CEO of the company's artificial intelligence arm. The tech giant says its powerful new AI tool can diagnose disease four times more accurately and at significantly less cost than a panel of human physicians. The experiment tested whether the tool could correctly diagnose a patient with an ailment, mimicking work typically done by a human doctor. The Microsoft team used 304 case studies sourced from the New England Journal of Medicine to devise a test called the Sequential Diagnosis Benchmark (SDBench). A language model broke down each case into a step-by-step process that a doctor would perform in order to reach a diagnosis. Microsoft's researchers then built a system called the MAI Diagnostic Orchestrator (MAI-DxO) that queries several leading AI models -- including OpenAI's GPT, Google's Gemini, Anthropic's Claude, Meta's Llama, and xAI's Grok -- in a way that loosely mimics several human experts working together. In their experiment, MAI-DxO outperformed human doctors, achieving an accuracy of 80 percent compared to the doctors' 20 percent. It also reduced costs by 20 percent by selecting less expensive tests and procedures. "This orchestration mechanism -- multiple agents that work together in this chain-of-debate style -- that's what's going to drive us closer to medical superintelligence," Suleyman says. The company poached several Google AI researchers to help with the effort -- yet another sign of an intensifying war for top AI expertise in the tech industry. Suleyman was previously an executive at Google working on AI. AI is already widely used in some parts of the US health care industry, including helping radiologists interpret scans. 
The latest multimodal AI models have the potential to act as more general diagnostic tools, though the use of AI in health care raises its own issues, particularly related to bias from training data that's skewed toward particular demographics. Microsoft has not yet decided if it will try to commercialize the technology, but the same executive, who spoke on the condition of anonymity, said the company could integrate it into Bing to help users diagnose ailments. The company could also develop tools to help medical experts improve or even automate patient care. "What you'll see over the next couple of years is us doing more and more work proving these systems out in the real world," Suleyman says. The project is the latest in a growing body of research showing how AI models can diagnose disease. In the last few years, both Microsoft and Google have published papers showing that large language models can accurately diagnose an ailment when given access to medical records. The new Microsoft research differs from previous work in that it more accurately replicates the way human physicians diagnose disease -- by analyzing symptoms, ordering tests, and performing further analysis until a diagnosis is reached. Microsoft describes the way that it combined several frontier AI models as "a path to medical superintelligence," in a blog post about the project today. The project also suggests that AI could help lower health care costs, a critical issue, particularly in the US. "Our model performs incredibly well, both getting to the diagnosis and getting to that diagnosis very cost effectively," says Dominic King, a vice president at Microsoft who is involved with the project.
[2]
Microsoft unveils AI diagnosis tool in effort to transform medicine
Microsoft has built an artificial intelligence-powered medical tool it claims is four times more successful than human doctors at diagnosing complex ailments, as the tech giant unveils research it believes could speed up treatment and save money by reducing unnecessary tests. The "Microsoft AI Diagnostic Orchestrator" is the first initiative to come out of an AI health unit formed last year by Mustafa Suleyman with staff poached from DeepMind, the research lab he co-founded and which is now owned by rival Google. In an interview with the Financial Times, the chief executive of Microsoft AI said the trial was a step on the path to "medical superintelligence" that could help solve staffing crises and long waiting times for overstretched health systems. A version of the technology could soon also be deployed in Microsoft's Copilot AI chatbot and Bing search engine, which handle 50mn health queries a day. We are nearing "AI models that are not just a little bit better, but dramatically better, than human performance: faster, cheaper and four times more accurate," said Suleyman. "That is going to be truly transformative." Suleyman's new effort comes after DeepMind has led the way on AI-related healthcare breakthroughs. The Google lab's chief Sir Demis Hassabis jointly won a chemistry Nobel Prize last year for using AI to unlock the biological secrets of proteins that underpin life. Microsoft's new system is underpinned by a so-called "orchestrator" that creates virtual panels of five AI agents acting as "doctors" -- each with a distinct role, such as coming up with hypotheses or choosing diagnostic tests -- which interact and "debate" together to choose a course of action. To test its capabilities, "MAI-DxO" was fed 304 studies from the New England Journal of Medicine (NEJM) that describe how some of the most complicated cases were solved.
This allowed researchers to test if the programme could figure out the correct diagnosis and monitor the decisions it made, using a new technique called "chain of debate," that makes AI reasoning models show how they solve problems step-by-step. Microsoft used leading large language models from OpenAI, Meta, Anthropic, Google, xAI and DeepSeek. The orchestrator made all LLMs perform better, but worked best with OpenAI's o3 reasoning model to correctly solve 85.5 per cent of the NEJM cases. That compared with about 20 per cent by experienced human doctors, but those physicians were not allowed access to textbooks or to ask colleagues in the trial, which could have increased their success rate. Microsoft has invested almost $14bn into OpenAI and has exclusive rights to use and sell its technology. However, the tech giant is embroiled in high-stakes brinkmanship with the start-up, which is attempting to convert into a for-profit entity, with both sides clashing over the future terms of their partnership. Suleyman said that while OpenAI's model performed the best, Microsoft was "agnostic" over which of the four "world-class models" MAI-DxO used. "We have long believed that they'll become commodities . . . it's the aggregate orchestrator which I think is the differentiator," he said. Dominic King, the former head of DeepMind's health unit who joined Microsoft late last year, said that the programme had "performed better than anything we've ever seen before" and that "there is an opportunity here today to act almost as a new front door to healthcare". The AI models were also prompted to be cost-conscious, which significantly cut the number of tests required to get to a correct diagnosis in the trial, saving hundreds of thousands of dollars in some cases, he said. However, King stressed that the technology was still in its early stages, had not been peer reviewed and was not yet ready for a clinical environment. 
"This is a landmark study," said Eric Topol, a cardiologist and founder and director of the Scripps Research Translational Institute. "While this work was not done in the setting of real world medical practice, it is the first to provide evidence for the efficiency potential of generative AI in medicine -- accuracy and cost savings."
[3]
AI vs. MDs: Microsoft AI tool outperforms doctors in diagnosing complex medical cases
Microsoft today announced an artificial intelligence tool that outperformed a panel of medical doctors in diagnosing complicated cases. The Microsoft AI Diagnostic Orchestrator (MAI-DxO) faced off against 21 experienced physicians from the U.S. and United Kingdom presented with complex cases documented in the New England Journal of Medicine. MAI-DxO gave a correct diagnosis for 85.5% of the test cases while the doctors hit the mark 20% of the time. Microsoft used the tool with well-known AI models including GPT, Llama, Claude, Gemini, Grok and DeepSeek. The best setup was MAI-DxO paired with OpenAI's o3. Like a human physician, MAI-DxO diagnoses by analyzing symptoms, posing questions, and recommending medical tests. A key feature is its ability to optimize costs, preventing the ordering of superfluous diagnostics that contribute to overspending in healthcare. While MAI-DxO outperformed the medical providers, the tech company acknowledged that under normal conditions the doctors would not be operating in isolation and would be able to consult colleagues and online and print resources. The new diagnostic performance benchmark was created from 304 recent cases documented by the New England Journal of Medicine. The tool builds on earlier tests of AI performance in medicine that quizzed the bots on the U.S. Medical Licensing Examination (USMLE) standardized test. Microsoft said AI tools have evolved to get nearly perfect scores on the test, but their multiple-choice structure favors "memorization over deep understanding." The new benchmark using complicated cases requires higher-order skills to perform "sequential diagnosis, a cornerstone of real-world medical decision making," according to Microsoft's blog post. Next steps for developing the tool for public use include testing its abilities against more commonplace ailments. 
Before it could be deployed in healthcare practices, it would require testing in a clinical setting for safety and performance and approval from regulators. As it has with other AI applications, Microsoft emphasized that the bots are not meant to replace people but to optimize their output, in this case automating routine tasks, assisting in diagnosis and creating personalized care strategies.
[4]
Microsoft says AI system better than doctors at diagnosing complex health conditions
Firm says results of research create 'path to medical superintelligence' but plays down job implications Microsoft has revealed details of an artificial intelligence system that performs better than human doctors at complex health diagnoses, creating a "path to medical superintelligence". The company's AI unit, which is led by the British tech pioneer Mustafa Suleyman, has developed a system that imitates a panel of expert physicians tackling "diagnostically complex and intellectually demanding" cases. Microsoft said that when paired with OpenAI's advanced o3 AI model, its approach "solved" more than eight of 10 case studies specially chosen for its research. When those case studies were tried on practising physicians - who had no access to colleagues, textbooks or chatbots - the accuracy rate was two out of 10. Microsoft said it was also a cheaper option than using human doctors because it was more efficient at ordering tests. Despite highlighting the potential cost savings from its research, Microsoft played down the job implications, saying it believed AI would complement doctors' roles rather than replace them. "Their clinical roles are much broader than simply making a diagnosis. They need to navigate ambiguity and build trust with patients and their families in a way that AI isn't set up to do," the company wrote in a blogpost announcing the research, which is being submitted for peer review. However, using the slogan "path to medical superintelligence" raises the prospect of radical change in the healthcare market. While artificial general intelligence (AGI) refers to systems that match human cognitive abilities at any given task, superintelligence is an equally theoretical term referring to a system that exceeds human intellectual performance across the board. 
Explaining the rationale behind the research, Microsoft raised doubt over AI's ability to score exceptionally well in the United States Medical Licensing Examination, a key test for obtaining a medical licence in the US. It said the multiple-choice tests favoured memorising answers over deep understanding of a subject, which could help "overstate" the competence of an AI model. Microsoft said it was developing a system that, like a real-world clinician, takes step-by-step measures - such as asking specific questions and requesting diagnostic tests - to arrive at a final diagnosis. For instance, a patient with symptoms of a cough and fever may require blood tests and a chest X-ray before the doctor arrives at a diagnosis of pneumonia. The new Microsoft approach uses complex case studies from the New England Journal of Medicine (NEJM). Suleyman's team transformed more than 300 of these studies into "interactive case challenges" that it used to test its approach. Microsoft's approach used existing AI models, including those produced by ChatGPT's developer, OpenAI, Mark Zuckerberg's Meta, Anthropic, Elon Musk's Grok and Google's Gemini. Microsoft then used a bespoke, agent-like AI system called a "diagnostic orchestrator" to work with a given model on what tests to order and what the diagnosis might be. The orchestrator in effect imitates a panel of physicians, which then comes up with the diagnosis. Microsoft said that when paired with OpenAI's advanced o3 model, it "solved" more than eight of 10 NEJM case studies - compared with a two out of 10 success rate for human doctors. Microsoft said its approach was able to wield a "breadth and depth of expertise" that went beyond individual physicians because it could span multiple medical disciplines. It added: "Scaling this level of reasoning - and beyond - has the potential to reshape healthcare. 
AI could empower patients to self-manage routine aspects of care and equip clinicians with advanced decision support for complex cases." Microsoft acknowledged its work is not ready for clinical use. Further testing is needed on its "orchestrator" to assess its performance on more common symptoms, for instance.
[5]
AI doctor four times better at identifying illnesses than humans
Microsoft has developed an artificial intelligence (AI) system that it claims is four times better than doctors at diagnosing complex illnesses. The tech company's AI diagnosis system was able to correctly identify ailments up to 86pc of the time, compared to just 20pc on average for British and American physicians. Announcing the findings, Microsoft claimed it had laid the groundwork for "medical superintelligence". It comes as Wes Streeting, the Health Secretary, is seeking to bring AI into widespread use in the NHS to improve efficiency. In April the NHS waiting list rose for the first time in seven months, reaching 7.42m in a blow to one of the Government's key pledges to cut waiting times. Microsoft claimed its system could solve problems more cheaply than doctors - beating physicians even when sticking to a budget for diagnostic tests. The system, known as Microsoft AI Diagnostic Orchestrator, or MAI-DxO, was tested on 304 cases from the New England Journal of Medicine, a medical journal known for publishing complex medical cases from Massachusetts General Hospital. The system comprised a virtual panel of five different AI bots, each serving different roles such as "Dr Hypothesiser", "Dr Test-Chooser" and "Dr Challenger" that would internally deliberate before asking further questions or ordering tests and providing a diagnosis. In one case, the system diagnosed embryonal rhabdomyosarcoma, a rare form of cancer that normally occurs in children, in a 29-year-old woman.
[6]
Microsoft Says AI Tool Outperforms Physicians on Complex Medical Cases | PYMNTS.com
MAI-DxO also achieved correct diagnoses more cost effectively than physicians, the company said in a Monday blog post. "For AI to make a difference, clinicians and patients alike must be able to trust its performance," the post said. "That's where our new benchmarks and AI orchestrator come in." Earlier benchmarks used to evaluate AI systems in medicine were based on the United States Medical Licensing Examination (USMLE), which is based on multiple-choice questions, favors memorization and therefore overstates the apparent competence of AI systems, according to the post. To overcome the limitations of that test, Microsoft AI developed a new one that requires sequential diagnosis and uses 304 recent cases published by NEJM, the post said. This test requires AI models and human physicians to ask questions, order tests and work toward a final diagnosis. Microsoft AI's test also includes a virtual cost that reflects real-world healthcare expenditures, per the post. The MAI-DxO, being an orchestrator, accesses multiple language models and integrates diverse data sources, according to the report. It is also configurable so that it can be told to operate within defined cost constraints. "Together with our partners, we strongly believe that the future of healthcare will be shaped by augmenting human expertise and empathy with the power of machine intelligence," the post said. "We are excited to take the next steps in making that vision a reality."
The PYMNTS Intelligence and AI-ID collaboration "Generative AI Tracker®: Generative AI Can Elevate Health and Revolutionize Healthcare" found that while Americans are enthusiastic about the potential benefits of AI in healthcare, they still feel uncomfortable with the idea of healthcare providers relying on AI or replacing their medical professionals with this technology. Sixty percent of Americans said they are uncomfortable with a provider relying on AI in their healthcare, while 57% believe using AI to diagnose diseases and suggest treatments would harm the patient-provider relationship, according to the report.
Microsoft unveils an AI-powered diagnostic system that demonstrates superior accuracy and cost-effectiveness compared to human physicians in diagnosing complex medical conditions.
Microsoft has announced an artificial intelligence (AI) system that it says outperforms human doctors in diagnosing complex medical conditions. The Microsoft AI Diagnostic Orchestrator (MAI-DxO) reached an 85.5% accuracy rate in solving complicated medical cases, compared to a 20% success rate for experienced physicians [1][2][3].
The MAI-DxO system employs a unique "orchestrator" approach, creating virtual panels of five AI agents acting as "doctors" [2]. Each agent has a distinct role, such as generating hypotheses or selecting diagnostic tests. These AI doctors interact and "debate" to determine the best course of action, mimicking the collaborative process of human medical experts [1][2].
Microsoft's research team, led by Mustafa Suleyman, CEO of the company's AI arm, utilized several leading AI models in their experiment. These included OpenAI's GPT, Google's Gemini, Anthropic's Claude, Meta's Llama, and xAI's Grok [1][4]; two reports also list DeepSeek [2][3]. The system performed best when paired with OpenAI's o3 reasoning model [2][3].
To evaluate the AI system's capabilities, researchers developed the Sequential Diagnosis Benchmark (SDBench) using 304 case studies from the New England Journal of Medicine [1][3]. These cases were transformed into interactive challenges that tested the AI's ability to perform sequential diagnosis, a crucial aspect of real-world medical decision-making [4].
The MAI-DxO significantly outperformed human doctors in the trial, achieving an accuracy rate of 80% in one report [1] and 85.5% in its best configuration, paired with o3 [2][3], compared to the physicians' 20% [1][2][3]. However, it's important to note that the human doctors in the experiment were not allowed access to textbooks or consultations with colleagues, which could have improved their performance [2][4].
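The "four times" headline follows directly from these reported rates. A quick check, using only the percentages quoted in the sources (the variable names are illustrative):

```python
# Accuracy figures in percent, as reported in the cited sources.
physician_accuracy = 20.0
mai_dxo_accuracy_reported = 80.0   # figure given in [1]
mai_dxo_accuracy_best = 85.5       # best configuration (with o3), per [2][3]

# Ratio of AI accuracy to physician accuracy.
ratio_reported = mai_dxo_accuracy_reported / physician_accuracy
ratio_best = mai_dxo_accuracy_best / physician_accuracy

print(f"{ratio_reported:.2f}x to {ratio_best:.2f}x")
```

The ratio is a relative figure only; it says nothing about how the physicians would have scored with their usual access to colleagues and references.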
One of the most promising aspects of the MAI-DxO system is its potential for cost reduction in healthcare. The AI models were prompted to be cost-conscious, which significantly cut the number of tests required for a correct diagnosis [2]. In some cases, this approach led to savings of hundreds of thousands of dollars [2].
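One way to picture cost-conscious test ordering is a simple budget check. The sketch below is a minimal illustration under stated assumptions: the prices in `TEST_COSTS` are invented, the function `plan_tests` is hypothetical, and the cheapest-first greedy rule is a stand-in chosen for brevity, not the method Microsoft describes.

```python
from typing import Dict, List, Tuple

# Hypothetical virtual prices in US dollars; not taken from the benchmark.
TEST_COSTS: Dict[str, int] = {
    "blood test": 50,
    "chest X-ray": 120,
    "CT scan": 900,
}


def plan_tests(candidates: List[str], budget: int) -> Tuple[List[str], int]:
    """Order candidate tests cheapest-first until the next one would exceed the budget."""
    ordered: List[str] = []
    spent = 0
    for test in sorted(candidates, key=lambda name: TEST_COSTS[name]):
        cost = TEST_COSTS[test]
        if spent + cost > budget:
            break  # every remaining test costs at least as much
        ordered.append(test)
        spent += cost
    return ordered, spent


tests, spent = plan_tests(["CT scan", "chest X-ray", "blood test"], budget=200)
print(tests, spent)
```

A real system would also weigh how informative each test is, not just its price; the sketch only shows how a fixed budget constrains what gets ordered.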
While the results are impressive, Microsoft acknowledges that the technology is still in its early stages and not yet ready for clinical use [2][4]. Further testing is needed, particularly on more common ailments and in real-world clinical settings [4][5].
The company emphasizes that AI tools like MAI-DxO are not intended to replace human doctors but to optimize their output by automating routine tasks, assisting in diagnosis, and creating personalized care strategies [3][4]. However, the use of terms like "path to medical superintelligence" suggests the potential for significant changes in healthcare delivery [4].
This development comes amid an intensifying competition for AI talent in the tech industry. Microsoft's AI health unit, formed last year, includes staff poached from DeepMind, the Google-owned research lab co-founded by Suleyman [2].
The project is part of a growing body of research demonstrating AI's potential in medical diagnosis. Both Microsoft and Google have previously published papers showing large language models' ability to accurately diagnose ailments when given access to medical records [1].
As AI continues to advance in the medical field, it raises important questions about the future role of human physicians, the potential for bias in AI training data, and the need for careful regulation and ethical considerations in healthcare AI applications [1][4][5].