Large-scale clinical trial finds generative AI improves care quality but not patient outcomes


A randomized controlled trial involving over 9,600 patients across 16 primary care clinics in Kenya tested whether generative AI can improve patient-level outcomes in real-world settings. The AI Consult tool improved clinical documentation and decision-making quality while reducing antibiotic costs, but showed no significant impact on short-term treatment failure rates. The findings raise important questions about measuring AI's value in primary care.

Generative AI Clinical Trial Tests Real-World Impact in Primary Care

A groundbreaking randomized controlled trial published in Nature Medicine has evaluated whether generative AI can deliver measurable benefits to patients in primary care settings, moving beyond simulated cases to test real-world effectiveness [1][2]. The study involved more than 9,600 patients attending 16 primary care clinics in Kenya, making it one of the first large-scale trials to rigorously examine whether AI in healthcare actually improves patient outcomes rather than just clinician performance [2].

Source: News-Medical

Clinicians were randomly assigned to use an electronic medical record system either with or without AI Consult, an AI-powered clinical decision support system embedded directly into their workflow [2]. The generative AI support tool analyzed information entered during consultations, generated context-specific diagnostic and treatment suggestions aligned with Kenyan national clinical guidelines, and flagged potential concerns using a color-coded alert system [2]. Critically, clinicians retained complete autonomy to accept, modify, or disregard the system's recommendations, with the AI interface remaining invisible to patients [1][2].

AI-Supported Care Shows No Significant Change in Patient Outcomes

The clinical trial found similar rates of 14-day treatment failure between groups, with 2.2% in the AI-supported care group versus 2.0% in standard care [2]. This corresponded to between 13 fewer and 1 additional treatment failures per 1,000 patients, suggesting any true effect is likely modest [1]. The study found no evidence of harm, with similar rates of hospitalization and death in both groups [2].

Professor Bilal Mateen, Senior Author and Honorary Professor of Machine Learning for Health at the University of Birmingham, noted: "This is one of the first studies to rigorously ask the hardest question about AI in healthcare: whether it actually improves outcomes for patients. What we found is reassuring but also sobering. The technology appears safe and clearly improves aspects of clinical decision-making, but translating those gains into measurable patient benefit is much more challenging" [2].

Improved Clinical Documentation and Diagnostic Reasoning

While patient outcomes remained unchanged, the intervention significantly improved the quality of clinical documentation and treatment planning, as assessed by an independent panel of experienced clinicians who were blinded to whether AI had been used [2]. The trial demonstrated enhanced diagnostic reasoning and appropriate treatment planning, alongside reduced antibiotic-related costs due to more cost-conscious prescribing choices [1][2]. Notably, the intervention did not change overall antibiotic prescribing rates among febrile patients, possibly reflecting how deeply ingrained certain prescribing practices are in clinical workflows [1].

Patient satisfaction remained identical in both groups, indicating that the generative AI support tool did not alter patients' experience of care or undermine the patient-clinician relationship [2].

Workflow Integration and Clinician Autonomy Preserved

The AI-powered clinical decision support system was implemented as a workflow-integrated tool that generated recommendations automatically during routine documentation, without requiring clinicians to actively initiate its use [1]. This design reflects real-world implementation conditions rather than enforced use scenarios, supporting the external validity of the findings [1]. Professor Alastair Denniston, co-author and Professor of Regulatory Science and Innovation at the University of Birmingham, emphasized: "What this study shows is that AI can be integrated safely into real clinical workflows, without undermining patient trust or clinician autonomy - which is a critical foundation for any future impact" [2].

Challenges in Measuring AI's Value in Primary Care

The trial highlights a fundamental challenge in evaluating general-purpose AI technologies in primary care: serious outcomes such as hospitalization or death are rare, meaning extremely large studies involving potentially more than 100,000 patients would be needed to detect modest effects [1][2]. The observed event rate was lower than anticipated, resulting in limited precision for detecting modest effects, though large clinically meaningful effects are unlikely based on the bounded inference [1].
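To make the sample-size point concrete, the following is a rough sketch using the standard normal-approximation formula for comparing two proportions. It is not the trial's own power calculation. The 2% baseline is borrowed from the reported standard-care treatment-failure rate purely for illustration, and the relative reductions, significance level, and power are hypothetical assumptions. The sketch also ignores the clustering introduced by randomizing clinicians rather than patients, which would push the required numbers higher still.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect p_control vs p_treated.

    Normal-approximation formula for two independent proportions,
    two-sided test. Assumes individual randomization (no clustering).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


baseline = 0.02  # illustrative baseline event rate (assumption)
for relative_reduction in (0.30, 0.20, 0.10):  # hypothetical effect sizes
    treated = baseline * (1 - relative_reduction)
    total = 2 * patients_per_arm(baseline, treated)
    print(f"{relative_reduction:.0%} relative reduction: about {total:,} patients in total")
```

Under these assumptions, halving the effect size roughly quadruples the required enrolment, which is why a modest relative reduction in a 2% event rate pushes the total past 100,000 patients.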

The study was funded by the Gates Foundation and conducted with collaborators from the London School, with findings that researchers emphasize have global relevance beyond Kenya [2]. Professor Richard Riley, Professor of Biostatistics at the University of Birmingham, stated: "Robust trials like this are so important to establish the real impact of using AI in practice. They help set realistic expectations of what AI can actually contribute within existing care pathways" [2].

© 2026 TheOutpost.AI All rights reserved