Clinicians who used a hybrid documentation model combining generative artificial intelligence (AI) with a virtual human scribe logged fewer after-work hours, completed patient notes faster, and showed better productivity within 50 days of the program's start than colleagues who used no documentation assistance, new data suggest.
The retrospective study included 181 primary care physicians and advanced practice providers, collectively caring for 108,224 patients in 14 adult primary care practices in the Massachusetts General Hospital health system in Boston.
Sixty-six of those clinicians (36%) voluntarily enrolled in a hybrid program in which ambient documentation technology recorded the clinician's notes and produced text, and a virtual human scribe entered the final note into the electronic health record (EHR) for the clinician to review. Participants received standardized training and technical support. The remaining 115 used no documentation assistance.
Total After-Hours Minutes Were Among Metrics Measured
Researchers collected data on the total minutes of "work after work (WOW)" a provider spent each day on the EHR outside of standard work hours, documentation delays (the proportion of visit notes that were not finished 2 days after the visit), and measures of clinical financial productivity over 80 days.
The research team evaluated the program's effects at 50 days post-implementation and found a sustained 41% reduction in WOW and a 66% reduction in delayed note completion. Clinician financial productivity, as measured by work relative value units, increased by 12%, which suggests measurable value for health systems, wrote the authors, led by Lidia Moura, MD, PhD, MPH, with the Department of Neurology at Massachusetts General Hospital. Findings were published in the Journal of General Internal Medicine.
Clinicians were eligible if they had active continuity clinical panels and at least 6 months of EHR activity before the intervention start date. "Because some clinicians practiced in multiple clinics, analyses accounted for clustering at the provider-group level," the authors wrote.
Analysis Includes Answering Patients' Messages
The authors wrote that this study, using real-world longitudinal data, "is one of the largest and most rigorous evaluations of an ACD [ambient clinical documentation] tool's impact on clinician documentation burden and productivity. Notably, WOW captures all after-hours EHR activity -- including in-basket work, lab review, and patient messaging -- so it reflects broader workload beyond documentation alone," they wrote.
Previous studies typically assessed fully AI-based options that "may not integrate cleanly," the authors wrote, whereas "we assessed ScribbleSwift," a hybrid model, in which "clinicians reported less charting and cognitive load. Both AI-only and hybrid approaches entail trade-offs in accuracy, integration, burden, and cost," they noted.
Limitations included that participation was voluntary, which raises the possibility of selection bias, and that it was done in a single health system.
Jeffrey A. Gold, MD, a pulmonologist and professor of medicine at Oregon Health & Science University (OHSU) in Portland, Oregon, told Medscape Medical News this hybrid model stands out for a few reasons.
Among them is that other models currently being tested seem to be moving toward a fully AI-based scribe, with ambient speech recognition generating the note and the provider reviewing the output, he said.
Gold, who is associate chief health information officer for advanced clinic training, learning health systems, and data stewardship at OHSU, said a likely contributing factor is how prevalently the providers use the model. The authors included in the methodology section that "over 80% of [hybrid] participants [were] active by day 30."
"With typical AI scribes, the data suggest even high adopters of the technology still don't use it for every encounter," he said. "If it was used for almost all encounters and the virtual scribe is trusted to edit the note, this may explain the results."
Trust Drives Proofreading Time
He noted that with all forms of documentation assistance, trust is the issue that drives how much time the attending physician spends proofreading the note.
"One reason for the highly positive results may be there is more trust in the virtual human scribe acting as a proofreader than just using the AI itself," Gold said.
The authors pointed out that immediately after adoption of the model, after-hours EHR time actually jumped (an increase of 32.1%) rather than decreasing. That early spike may demonstrate the need for "temporary schedule adjustments, targeted training, and peer-champion support to accelerate adoption," the authors wrote.
Gold said it makes sense that initially the WOW hours jumped because clinicians had to learn to trust both the AI output and the virtual human scribe and were probably spending more time proofreading.
"But over time, with trust, this goes away," Gold said.
This study was institutionally sponsored by the Massachusetts General Hospital Center for the Transformation of Internal Medicine. Authors and Gold reported no relevant financial relationships.
Marcia Frellick is a Chicago-based, independent healthcare journalist.