Google releases MedGemma 1.5 with advanced imaging and MedASR for medical speech-to-text

Reviewed byNidhi Govil

3 Sources

Share

Google has launched MedGemma 1.5, an updated healthcare AI model with support for CT scans, MRI, and histopathology, alongside MedASR, a medical speech-to-text system. The open-source AI models are available via Hugging Face and Vertex AI, marking Google's community-focused approach to healthcare AI as competition intensifies with OpenAI and Anthropic.

Google Expands Healthcare AI with Two Open-Source Models

Google has released MedGemma 1.5 and MedASR, two open-source AI models designed to transform how medical professionals process clinical data. The updated MedGemma 1.5 expands support for CT scans and MRI, while MedASR tackles clinical dictation with specialized medical speech-to-text capabilities. Both models are available through Hugging Face and Vertex AI, part of Google's Health AI Developer Foundations programme for research and commercial use

1

.

Source: Digit

Source: Digit

The release marks a notable strategic difference in the healthcare AI landscape. While OpenAI offers ChatGPT Health as a commercial enterprise product and Anthropic has rolled out Claude for Healthcare, Google has taken an open-source approach. This community-focused strategy allows developers to fine-tune and adapt the models for specific clinical workflows, lowering barriers to entry for building next-generation medical applications

2

.

MedGemma 1.5 Advances Medical Imaging Analysis

MedGemma 1.5 represents a significant upgrade from its predecessor, introducing high-dimensional imaging capabilities that extend beyond standard 2D chest X-rays. The 4B-parameter vision-language model now interprets three-dimensional volume representations, including CT scans, MRI, and whole-slide histopathology imaging. This multimodal model can analyze multiple slices or patches to understand the full context of a scan, rather than examining isolated images

3

.

Source: Gadgets 360

Source: Gadgets 360

The performance improvements are substantial. Google reported that MedGemma 1.5 achieved a 14% accuracy gain on MRI-based disease classification compared to the previous version, and a 3% improvement on CT scan classification tasks. The model also demonstrates enhanced capabilities in anatomical localization in chest X-rays and structured data extraction from laboratory reports

1

.

Beyond static imaging, MedGemma 1.5 excels at longitudinal analysis, tracking changes in chest X-ray time series with greater accuracy. The model achieved a 90% score on the EHRQA benchmark for Electronic Health Record Question Answering, representing a 22% jump over the original MedGemma. This ensures the model handles complex lab reports and medical records as effectively as it processes medical imaging analysis

3

.

MedASR Reduces Errors in Clinical Dictation

MedASR addresses a persistent challenge in healthcare documentation: accurately transcribing clinical conversations filled with complex medical terminology. This automated speech recognition model is fine-tuned specifically for medical dictation, designed to handle accents and real-world clinical audio conditions that often trip up general-purpose systems

2

.

Source: AIM

Source: AIM

The performance difference is striking. In internal tests, MedASR recorded a 5.2% word error rate on chest X-ray dictations, compared to 12.5% for general-purpose speech recognition models. This translates to 58% fewer errors on chest X-ray dictations and 82% fewer errors on diverse medical dictation benchmarks when compared to models like Whisper (large-v3)

3

.

Google emphasized that verbal communication remains crucial in healthcare, including medical dictation and live conversations between patients and providers. MedASR is designed to transcribe doctor-patient interactions, clinical notes, and dictated reports, with adaptability across different healthcare environments. The model can be fine-tuned for specific clinical workflows or documentation standards, making it practical for diverse medical settings

1

.

Deployment Options and Developer Support

Both models can be accessed via Google Cloud through Vertex AI, as well as through Hugging Face listings. The 4B-parameter version of MedGemma 1.5 is designed to be compute-efficient and capable of running offline, while a larger 27B-parameter version remains available for text-heavy medical applications. Google's MedGemma GitHub repository provides tutorials for developers looking to integrate these tools

2

.

To accelerate adoption, Google announced the MedGemma Impact Challenge, a hackathon hosted on Kaggle with $100,000 in prizes. The competition encourages developers to build healthcare applications using the models, potentially creating seamless multimodal workflows where clinicians dictate queries processed by MedASR regarding specific CT volumes analyzed by MedGemma 1.5

1

.

Google emphasized that MedGemma is intended as a developer foundation model and should be validated and adapted before use in clinical settings. The permissive license allows both research and commercial use cases, enabling healthcare organizations to customize the models for their specific needs. As healthcare AI adoption accelerates at twice the rate of the broader economy, these tools provide developers with the foundation to build applications that can see, listen, and reason more like clinicians

3

.

Today's Top Stories

TheOutpost.ai

Your Daily Dose of Curated AI News

Don’t drown in AI news. We cut through the noise - filtering, ranking and summarizing the most important AI news, breakthroughs and research daily. Spend less time searching for the latest in AI and get straight to action.

© 2026 Triveous Technologies Private Limited
Instagram logo
LinkedIn logo