3 Sources
[1]
These medical X-rays are all deepfakes -- and they fool even radiologists
Most radiologists struggle to identify X-ray scans that are generated by artificial intelligence, with fewer than half spotting synthetic images hidden in real medical data, according to research published today in Radiology. Large language models (LLMs) also had a hard time telling real medical images from synthetic ones. The study offers training to help radiologists improve their skills in detecting AI-generated X-rays, and the researchers warn that synthetic data could creep into the scientific literature and medical litigation.

"The results from this study are both disturbing and not very surprising to me," says Elisabeth Bik, a microbiologist and image-integrity specialist based in San Francisco, California. "This raises concerns not only for research integrity, but also for clinical workflows, insurance claims and legal contexts where imaging evidence is used."

In the study, 17 radiologists from 12 research centres were presented with X-ray scans, half of them real and half generated by AI. Without knowing the purpose of the study, participants were asked about the technical quality of the AI images and whether they noticed anything unusual; only 41% raised concerns that AI scans might have infiltrated the data set. The radiologists were then informed that some of the images were AI-generated and asked to discern real scans from those created by ChatGPT. On average, the participants correctly identified the AI and real scans 75% of the time. Importantly, "there was no difference based on the experience of the radiologists", who had between zero and 40 years of professional experience, says study co-author Mickael Tordjman, a radiologist at the Icahn School of Medicine at Mount Sinai in New York.

The research team also investigated whether AI models such as ChatGPT and Gemini might have a more discerning eye than the radiologists, but the models were only 57-85% accurate at teasing apart the real and ChatGPT-generated images.

That AI-generated images are so hard to detect could pose significant challenges for the scientific community. Most obviously, litigation and insurance claims might be compromised by fake images, Tordjman says. There is also the risk that synthetic scans will infiltrate the scientific literature, he adds. Peer reviewers and journals have been inundated with papers containing AI-generated images and medical data, despite policies from publishers that ban or limit their use, and researchers worry that AI-generated content containing inaccuracies misleads both scientists and the public.

Another concern is that the AI models used to read medical imaging data could become distorted by AI-generated data in their training sets. That could cause the models to "latch onto features that are not exactly relevant to real medical cases, but are purely artefacts of generative AI models", says Siwei Lyu, who researches media forensics at the University at Buffalo in New York. At the same time, researchers have found that synthetic data can improve the performance of radiographic AI models. More lifelike AI radiographs could ultimately be good news for the medical community if they fill gaps in AI models' training data, says Curtis Langlotz, a radiologist at Stanford University in California.

The study authors have created an interactive quiz that aims to teach researchers how to discern between AI-generated and real X-ray scans. The quiz covers tell-tale characteristics of AI-generated radiographs, such as overly smooth bones and unnaturally straight spines. As well as training radiologists, the authors note, techniques such as digital watermarking could make it easier to identify real images, as could regulatory frameworks from governmental agencies.
[2]
Deepfake X-rays can deceive radiologists and AI systems
Radiological Society of North America, Mar 24 2026

Neither radiologists nor multimodal large language models (LLMs) are able to easily distinguish artificial intelligence (AI)-generated "deepfake" X-ray images from authentic ones, according to a study published today in Radiology, a journal of the Radiological Society of North America (RSNA). The findings highlight the potential risks associated with AI-generated X-ray images, along with the need for tools and training to protect the integrity of medical images and prepare health care professionals to detect deepfakes. The term "deepfake" refers to a video, photo, image or audio recording that appears real but has been created or manipulated using AI.

"Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present. This creates a high-stakes vulnerability for fraudulent litigation if, for example, a fabricated fracture could be indistinguishable from a real one. There is also a significant cybersecurity risk if hackers were to gain access to a hospital's network and inject synthetic images to manipulate patient diagnoses or cause widespread clinical chaos by undermining the fundamental reliability of the digital medical record," said Mickael Tordjman, M.D., lead study author and post-doctoral fellow at the Icahn School of Medicine at Mount Sinai, New York.

Seventeen radiologists from 12 different centers in six countries (United States, France, Germany, Turkey, United Kingdom and United Arab Emirates) participated in the retrospective study. Their professional experience ranged from 0 to 40 years. Half of the 264 X-ray images in the study were authentic, and the other half were generated by AI. Radiologists were evaluated on two distinct image sets, with no overlap between the datasets. The first dataset included real and ChatGPT-generated images of multiple anatomical regions. The second dataset included chest X-ray images, half authentic and half created by RoentGen, an open-source generative AI diffusion model developed by Stanford Medicine researchers.

When radiologist readers were unaware of the study's true purpose, yet were asked, after ranking the technical quality of each ChatGPT image, whether they noticed anything unusual, only 41% spontaneously identified AI-generated images. After being informed that the dataset contained synthetic images, the radiologists' mean accuracy in differentiating the real and synthetic X-rays was 75%. Individual radiologist performance in detecting the ChatGPT-generated images ranged from 58% to 92%. Similarly, the accuracy of four multimodal LLMs (GPT-4o and GPT-5 from OpenAI, Gemini 2.5 Pro from Google, and Llama 4 Maverick from Meta) ranged from 57% to 85%. Even GPT-4o, the model used to create the deepfakes, was unable to detect all of them, though it identified the most by a considerable margin compared with the Google and Meta LLMs. Radiologist accuracy in detecting the RoentGen synthetic chest X-rays ranged from 62% to 78%, and the LLMs' performance ranged from 52% to 89%.

There was no correlation between a radiologist's years of experience and their accuracy in detecting synthetic X-ray images. However, musculoskeletal radiologists demonstrated significantly higher accuracy than other radiology subspecialists.

The study identified common features of synthetic X-rays. "Deepfake medical images often look too perfect," Dr. Tordjman said. "Bones are overly smooth, spines unnaturally straight, lungs overly symmetrical, blood vessel patterns excessively uniform, and fractures appear unusually clean and consistent, often limited to one side of the bone."

Recommended solutions to clearly distinguish real from fake images and help prevent tampering include implementing advanced digital safeguards, such as invisible watermarks that embed ownership or identity data directly into the images, and automatically attaching technologist-linked cryptographic signatures when the images are captured.

"We are potentially only seeing the tip of the iceberg," Dr. Tordjman said. "The logical next step in this evolution is AI-generation of synthetic 3D images, such as CT and MRI. Establishing educational datasets and detection tools now is critical." The study's authors have published a curated deepfake dataset with interactive quizzes for educational purposes.

Source: Radiological Society of North America
Journal reference: Tordjman, M., et al. (2026). The Rise of Deepfake Medical Imaging: Radiologists' Diagnostic Accuracy in Detecting ChatGPT-generated Radiographs. Radiology. DOI: 10.1148/radiol.252094. https://pubs.rsna.org/doi/10.1148/radiol.252094
[3]
AI-generated medical scans prove nearly indistinguishable from reality
AI-generated X-rays can now appear convincing enough to mislead even expert radiologists, according to a study published March 24 in Radiology, the journal of the Radiological Society of North America. The research found that radiologists were able to correctly identify whether X-ray images were real or synthetic only 75% of the time on average, despite knowing that fake images were included in the dataset.

The study involved 17 radiologists from 12 research centers across six countries: the United States, France, Germany, Turkey, the United Kingdom, and the United Arab Emirates. Participants reviewed 264 X-ray images, split evenly between authentic scans and AI-generated ones. Before learning the study's true aim, only 41% of them independently suspected that some of the images may have been produced by artificial intelligence.

Lead author Mickael Tordjman, a radiologist at the Icahn School of Medicine at Mount Sinai in New York, said the findings show how far synthetic medical imaging has advanced. "Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present," he said.

Performance varied widely among the radiologists, with individual scores ranging from 58% to 92%. The researchers found no meaningful relationship between detection ability and professional experience, which ranged from no experience to 40 years. Musculoskeletal radiologists performed better than other subspecialists. The team also tested four multimodal large language models, including GPT-4o and Gemini 2.5 Pro, which achieved accuracy rates between 57% and 85% when assessing ChatGPT-generated images.

The implications extend well beyond image interpretation. Tordjman said the results point to a serious vulnerability in areas such as legal disputes and hospital cybersecurity, including scenarios in which fake fractures could be used in fraudulent claims or synthetic scans could be inserted into medical systems to influence diagnoses. Outside experts also raised concerns. Elisabeth Bik, a microbiologist and specialist in image integrity, told Nature that the findings were "both disturbing and not very surprising," and said the risks affect research integrity, clinical workflows, insurance claims, and legal proceedings that rely on imaging evidence.

Tordjman noted that some synthetic images still contain subtle warning signs, including bones that appear overly smooth, spines that look unnaturally straight, and blood vessel patterns that seem too uniform. Still, the study suggests that visual inspection alone is no longer a reliable safeguard. The authors called for countermeasures such as invisible watermarking and cryptographic signatures applied at the moment images are captured.

The researchers also warned that the current problem may be only an early stage of a broader challenge. "We are potentially only seeing the tip of the iceberg," Tordjman said, pointing to AI-generated 3D scans such as CT and MRI as a likely next step. He said building training datasets and detection tools now will be essential before those threats become even harder to manage.
A new study reveals that radiologists correctly identify AI-generated X-rays only 75% of the time, even when aware synthetic images are present. The research involved 17 radiologists from six countries and tested both ChatGPT and RoentGen-generated scans. Experts warn of serious implications for fraudulent litigation, hospital cybersecurity, and research integrity as synthetic medical imaging becomes indistinguishable from reality.
Radiologists struggle to distinguish deepfake X-rays from authentic medical scans, achieving only 75% accuracy on average even when explicitly told that artificial intelligence-generated images were mixed into their datasets, according to research published in Radiology by the Radiological Society of North America [1][2]. The study involved 17 radiologists from 12 research centers across six countries (the United States, France, Germany, Turkey, the United Kingdom, and the United Arab Emirates) who reviewed 264 X-ray images split evenly between real scans and AI-generated medical scans [3]. Before learning the study's true purpose, only 41% of participants spontaneously suspected that synthetic medical imaging had infiltrated the dataset [2].
Lead author Mickael Tordjman, a radiologist at the Icahn School of Medicine at Mount Sinai in New York, emphasized the gravity of these findings: "Our study demonstrates that these deepfake X-rays are realistic enough to deceive radiologists, the most highly trained medical image specialists, even when they were aware that AI-generated images were present" [2]. Individual performance ranged dramatically from 58% to 92% accuracy, with no correlation between detection ability and years of professional experience, which spanned from zero to 40 years [1][3]. Musculoskeletal radiologists demonstrated significantly higher accuracy than other subspecialists [2].
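For readers curious how such a null result is typically checked, the sketch below tests for a rank correlation between experience and accuracy. All per-reader numbers are invented placeholders, not the study's data, and the study's actual statistical method is not specified here.

```python
# Hypothetical sketch: test whether deepfake-detection accuracy tracks
# years of experience across readers. The numbers below are invented
# placeholders, not the study's per-reader results.
from scipy.stats import spearmanr

years_experience = [0, 2, 5, 8, 12, 15, 20, 25, 30, 40]
accuracy = [0.82, 0.58, 0.75, 0.92, 0.70, 0.78, 0.65, 0.88, 0.74, 0.81]

rho, p = spearmanr(years_experience, accuracy)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
# A small rho with a large p-value would be consistent with the study's
# finding of no experience-accuracy relationship.
```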
The research team tested whether large language models might outperform human experts in identifying synthetic images. Four multimodal systems, GPT-4o and GPT-5 from OpenAI, Gemini 2.5 Pro from Google, and Llama 4 Maverick from Meta, were evaluated on the same task [2]. Their accuracy ranged from 57% to 85% when assessing ChatGPT-generated images, and from 52% to 89% when evaluating RoentGen synthetic chest X-rays [2]. Notably, even ChatGPT-4o, the model used to create some of the deepfakes, could not accurately detect all of them, though it performed considerably better than competing systems [2].
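The paper's exact prompts are not reproduced in these reports, but posing such a binary question to a multimodal model is straightforward. Below is a hypothetical sketch using the OpenAI Python SDK; the prompt wording and file name are assumptions for illustration, not the study's protocol.

```python
# Hypothetical sketch: ask a multimodal LLM whether a radiograph is real
# or synthetic. Prompt text and file name are illustrative assumptions.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

with open("xray.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Is this radiograph an authentic X-ray or AI-generated? "
                     "Answer with exactly one word: REAL or SYNTHETIC."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```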
The implications extend far beyond diagnostic accuracy. Tordjman warned of a "high-stakes vulnerability for fraudulent litigation if, for example, a fabricated fracture could be indistinguishable from a real one" [2]. Hospital cybersecurity faces new threats if hackers gain access to medical networks and inject synthetic images to manipulate patient diagnoses or undermine the reliability of digital medical records [2][3]. Elisabeth Bik, a microbiologist and image-integrity specialist, described the findings as "both disturbing and not very surprising," noting concerns for research integrity, clinical workflows, insurance claims, and legal contexts where imaging evidence is used [1][3].
Another concern involves AI models trained on corrupted datasets. Siwei Lyu, who researches media forensics at the University at Buffalo, explained that models could "latch onto features that are not exactly relevant to real medical cases, but are purely artefacts of generative AI models" [1]. Peer reviewers and journals have already been inundated with papers containing AI-generated images despite policies that ban or limit their use [1].
The study identified telltale characteristics of synthetic images that can help with detection. "Deepfake medical images often look too perfect," Tordjman explained. "Bones are overly smooth, spines unnaturally straight, lungs overly symmetrical, blood vessel patterns excessively uniform, and fractures appear unusually clean and consistent, often limited to one side of the bone" [2][3]. The study authors created an interactive quiz covering these characteristics to train researchers on discerning between AI-generated and real X-ray scans [1][2].
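These "too perfect" cues suggest that simple image statistics could serve as a crude first-pass screen alongside trained human eyes. The sketch below is a hypothetical smoothness heuristic; the reference value and threshold are invented, and nothing like it was validated in the study.

```python
# Hypothetical first-pass screen for the "overly smooth" look of some
# synthetic radiographs: compare mean local gradient energy against a
# reference derived from known-real scans. Illustrative only.
import numpy as np
from PIL import Image

def gradient_energy(path: str) -> float:
    """Mean gradient magnitude of a grayscale image, scaled to [0, 1]."""
    img = np.asarray(Image.open(path).convert("L"), dtype=np.float32) / 255.0
    gy, gx = np.gradient(img)              # finite-difference gradients
    return float(np.mean(np.hypot(gx, gy)))

REAL_REFERENCE_MEAN = 0.045   # would be estimated from verified-real scans
score = gradient_energy("suspect_xray.png")

# Markedly lower edge energy than real scans *may* hint at unnaturally
# smooth bones; the 0.8 factor is an arbitrary illustrative threshold.
print("flag for human review" if score < 0.8 * REAL_REFERENCE_MEAN
      else "no smoothness flag")
```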
To protect medical imaging integrity, the researchers recommend implementing advanced digital safeguards. Digital watermarking can embed ownership or identity data directly into images, while cryptographic signatures linked to technologists can be automatically attached when images are captured [1][2][3]. Regulatory frameworks from governmental agencies could also help establish standards for identifying authentic medical images [1].
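As one concrete illustration of the signature idea, here is a minimal sketch using Ed25519 keys from the Python cryptography package. Key management, DICOM metadata integration, and the watermarking side are all out of scope, and the file name is an assumption; this is not the workflow proposed by the study authors, only the general technique they name.

```python
# Minimal sketch of a technologist-linked signature applied at capture
# time, using the `cryptography` package. Key distribution, DICOM
# integration, and revocation are deliberately omitted.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()   # held by the technologist/device
public_key = private_key.public_key()        # published for verifiers

with open("capture.dcm", "rb") as f:         # hypothetical captured image
    image_bytes = f.read()

signature = private_key.sign(image_bytes)    # stored alongside the image

# Later, a court, insurer, or journal can verify the bytes are unchanged;
# verify() raises InvalidSignature if the image was altered or replaced.
public_key.verify(signature, image_bytes)
print("signature valid: image bytes unchanged since capture")
```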
Tordjman cautioned that current challenges may represent only the beginning: "We are potentially only seeing the tip of the iceberg. The logical next step in this evolution is AI-generation of synthetic 3D scans (CT and MRI). Establishing educational datasets and detection tools now is critical" [2][3]. As synthetic medical imaging technology advances, the medical community must prepare for increasingly sophisticated deepfakes that could compromise patient care, legal proceedings, and scientific research if left unchecked.