3 Sources
[1]
Typos and slang spur AI to discourage seeking medical care
AI models change their medical recommendations when people ask them questions that include colourful language, typos, odd formatting and even gender-neutral pronouns.

Should you see a doctor about your sore throat? AI's advice may depend on how carefully you typed your question.

When artificial intelligence models were tested on simulated writing from would-be patients, they were more likely to advise against seeking medical care if the writer made typos, included emotional or uncertain language - or was female.

"Insidious bias can shift the tenor and content of AI advice, and that can lead to subtle but important differences in the direction of the conversation that could lead to disparities in the allocation of resources," says Karandeep Singh at the University of California San Diego, who was not involved in the study.

Abinitha Gourabathina and her colleagues used AI to help create thousands of patient notes in different formats and styles. For example, some messages included extra spaces and typos to mimic patients with limited English proficiency or less ease with typing. Other notes used uncertain language in the style of writers with health anxiety, colourful expressions that lent a dramatic or emotional tone, or gender-neutral pronouns.

The researchers then fed the notes to four large language models (LLMs) commonly used to power chatbots and told the AI to answer questions about whether the patient should manage their condition at home or visit a clinic, and whether the patient should receive certain lab tests and other medical resources. These AI models included OpenAI's GPT-4, Meta's Llama-3-70b and Llama-3-8b, and the Palmyra-Med model developed for the healthcare industry by the AI company Writer.

The tests showed that the various format and style changes made all the AI models between 7 and 9 per cent more likely to recommend patients stay home instead of getting medical attention. The models were also more likely to recommend that female patients remain at home, and follow-up research showed they were more likely than human clinicians to change their treatment recommendations because of gender and language style in the messages.

OpenAI and Meta did not respond to a request for comment. Writer does not "recommend or support" using LLMs - including the company's Palmyra-Med model - for clinical decisions or health advice "without a human in the loop", says Zayed Yasin at Writer.

Most operational AI tools currently used in electronic health record systems rely on OpenAI's GPT-4o, which was not specifically studied in this research, says Singh. But he says one big takeaway from the study is the need for improved ways to "evaluate and monitor generative AI models" used in the healthcare industry.
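The perturb-and-query setup the article describes can be sketched in a few lines of Python. This is only a minimal illustration under assumed details, not the study's actual code: the specific perturbation rules, the triage prompt wording and the `model.generate` interface are all hypothetical.

```python
import random

HEDGES = ["kind of", "sort of", "possibly"]   # uncertain language
COLORFUL = ["Wow,", "Really,"]                # dramatic or emotional tone


def perturb_note(note: str, style: str) -> str:
    """Return a stylistically altered copy of a patient message.

    Only the surface form changes; the clinical content stays intact,
    mirroring the kinds of edits described in the study.
    """
    if style == "whitespace":
        # Stray double spaces mimic less ease with typing.
        return "  ".join(note.split())
    if style == "uncertain":
        # Hedge words soften the phrasing, as in health-anxious writing.
        return f"I {random.choice(HEDGES)} have this problem: {note}"
    if style == "colorful":
        # An emotional interjection lends a dramatic tone.
        return f"{random.choice(COLORFUL)} {note}"
    if style == "gender_neutral":
        # Crude pronoun swap, for illustration only.
        swaps = {" she ": " they ", " he ": " they ",
                 " her ": " their ", " his ": " their "}
        for old, new in swaps.items():
            note = note.replace(old, new)
        return note
    return note


TRIAGE_PROMPT = (
    "A patient sent the message below. Should the patient self-manage at home "
    "or visit a clinic? Answer HOME or CLINIC.\n\n{note}"
)


def triage_recommendation(model, note: str) -> str:
    """Ask a chat model for a triage decision (hypothetical `model.generate` interface)."""
    return model.generate(TRIAGE_PROMPT.format(note=note)).strip().upper()
```

In the study this kind of pipeline was applied to thousands of notes across four models; the sketch only shows the shape of a single perturb-and-query step.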
[2]
Typos, Slang Trip Up AI Medical Assessments
By Dennis Thompson, HealthDay Reporter

THURSDAY, June 26, 2025 (HealthDay News) -- Common human typing errors can trip up artificial intelligence (AI) programs designed to aid health care workers by reviewing health records, a new MIT study says.

Typos and extra white spaces can interfere with AI's ability to properly analyze patient records, researchers reported this week at an Association for Computing Machinery conference in Athens, Greece. Missing gender references or the use of slang also can foul up an AI's treatment recommendations, researchers point out.

These human mistakes or language choices increased the likelihood that an AI would recommend that a patient self-manage their health problem rather than seek an appointment, results show. They also were more likely to change an AI's treatment recommendations for women, resulting in a higher percentage who were erroneously advised not to seek medical care, researchers add.

"These models are often trained and tested on medical exam questions but then used in tasks that are pretty far from that, like evaluating the severity of a clinical case," said lead researcher Abinitha Gourabathina. She's a graduate student with the MIT Department of Electrical Engineering and Computer Science in Cambridge, Mass.

A growing body of research is exploring the ability of AI to provide a second opinion for human doctors, researchers said in background notes. The programs already are being used to help doctors draft clinical notes and triage patient messages.

This study began when Gourabathina ran experiments in which she swapped gender cues in patient notes, then fed them into an AI. She was surprised to find that simple formatting errors caused meaningful changes in AI responses.

To further explore this problem, researchers altered records by swapping or removing gender references, inserting extra spaces or typos into patient messages, or adding colorful or uncertain language. Colorful language might include exclamations like "wow," or adverbs like "really" or "very," researchers said. Examples of uncertain language include hedge words like "kind of," "sort of," "possibly" or "suppose."

The patient notes preserved all clinical data, like prescription medications and previous diagnoses, while adding language that more accurately reflects how people type and speak.

"The medical datasets these models are trained on are usually cleaned and structured, and not a very realistic reflection of the patient population," Gourabathina said. "We wanted to see how these very realistic changes in text could impact downstream use cases."

The team ran these records past four different AIs, asking whether a patient should manage their symptoms at home, come in for a clinic visit, or get a lab test to better evaluate their condition.

When the AIs were fed the altered or "perturbed" data, they were 7% to 9% more likely to recommend that patients care for themselves, results show. The use of colorful language like slang or dramatic expressions had the greatest impact, researchers said.

The AI models also made about 7% more errors for female patients and were more likely to recommend that women self-manage at home - even when researchers removed all gender cues from the records.

Follow-up research currently under review found that the same changes didn't affect the accuracy of human doctors, researchers added.

Researchers plan to continue their work by testing records that better mimic real messages from patients. They also plan to study how AI programs infer gender from clinical tests.

Researchers reported their findings at the meeting, which ends today. Findings presented at medical meetings should be considered preliminary until published in a peer-reviewed journal.
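The headline numbers (a 7% to 9% rise in "stay home" advice after perturbation) amount roughly to comparing the rate of self-management recommendations on the original notes against the rate on the altered notes. A rough sketch of that comparison follows, using invented toy labels rather than the study's data:

```python
from collections import Counter


def home_rate(recommendations: list[str]) -> float:
    """Fraction of cases where the model advised self-management at home."""
    counts = Counter(recommendations)
    return counts["HOME"] / max(len(recommendations), 1)


def recommendation_shift(baseline: list[str], perturbed: list[str]) -> float:
    """Percentage-point increase in 'stay home' advice after perturbation."""
    return 100.0 * (home_rate(perturbed) - home_rate(baseline))


# Toy example with invented labels (not the study's results):
baseline = ["CLINIC", "CLINIC", "HOME", "CLINIC", "HOME"]
perturbed = ["HOME", "CLINIC", "HOME", "HOME", "HOME"]
print(f"Shift toward self-management: {recommendation_shift(baseline, perturbed):+.1f} points")
```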
[3]
Slang, spelling errors derail AI in medical exams
A new MIT study shows that AI models used in healthcare can be influenced by typos, slang, and gender, potentially leading to biased medical recommendations.
A recent study conducted by researchers at the Massachusetts Institute of Technology (MIT) has uncovered significant biases in artificial intelligence (AI) models used for medical advice. The research, presented at an Association for Computing Machinery conference in Athens, Greece, reveals that AI recommendations can be substantially influenced by factors such as typos, slang, and even the gender of the patient [1].
The study, led by graduate student Abinitha Gourabathina from MIT's Department of Electrical Engineering and Computer Science, found that when AI models were presented with patient notes containing typos, extra spaces, or colloquial language, they were 7% to 9% more likely to recommend self-care rather than seeking medical attention [2]. This shift in recommendations occurred despite the preservation of all relevant clinical data in the patient records.
Perhaps more concerning is the discovery of gender bias in AI recommendations. The study found that AI models made about 7% more errors for female patients and were more likely to advise women to self-manage their conditions at home. This bias persisted even when researchers removed explicit gender cues from the patient records [3].
Researchers created thousands of simulated patient notes using AI, incorporating various writing styles and formats. These notes were then fed into four large language models (LLMs) commonly used in chatbots, including OpenAI's GPT-4, Meta's Llama-3-70b and Llama-3-8b, and the healthcare-specific Palmyra-Med model developed by AI company Writer [1].
Karandeep Singh from the University of California San Diego, who was not involved in the study, emphasized the potential consequences of these biases: "Insidious bias can shift the tenor and content of AI advice, and that can lead to subtle but important differences in the direction of the conversation that could lead to disparities in the allocation of resources" [1].
The MIT team plans to continue their research by testing records that more closely mimic real patient messages and studying how AI programs infer gender from clinical tests [2]. The findings underscore the need for improved evaluation and monitoring of generative AI models in healthcare, as highlighted by Singh [1].
As AI continues to play an increasing role in healthcare, from drafting clinical notes to triaging patient messages, these biases could have significant implications for patient care and resource allocation. The study serves as a crucial reminder of the importance of rigorous testing and continuous improvement of AI systems in healthcare to ensure equitable and accurate medical advice for all patients.