This transcript has been edited for clarity.
Do you ever stop and pay attention to the first diagnostic code patients get when they end up in the hospital or emergency department, and then compare it to what they actually wind up having?
When the First Code Isn't the Right Code
I'm a pediatric hospitalist, and I've lost track of how many times I've seen an ICD-10 code for "cough, unspecified."
Then, hospital billing emails us and says, "Hi, can you please change that to acute and chronic respiratory failure with hypoxia?"
I mean, it makes sense.
Now, "cough, unspecified" may not be technically wrong; it's just too broad. But sometimes the codes themselves are straight-up wrong.
Why Coding Accuracy Matters Beyond Billing
A recent study in BJS put a flashlight -- more like a police security spotlight -- on diagnostic inaccuracy of ICD-10 codes as they relate to true diagnoses.
It matters because these data are used in research and analysis of length of stay, quality metrics, and reimbursement. National organizations, such as CMS, rely on this type of administrative data.
Okay, let's get into this study. Researchers at UCLA examined records of over 1 million patients, and they focused on hernias. They found about 42,000 diagnostic codes for hernias, such as diaphragmatic, ventral, and inguinal, and then they looked further.
The researchers wanted to confirm how many of those patients truly had hernias, so they looked at abdominal imaging data. They used AI to analyze the dictated radiology reports, and they found in the end that only about 36% of patients truly had hernias.
That's a staggering amount of inaccurate coding. If only 36% were confirmed, then nearly 2 in 3 patients had a mismatch between what they actually had and what was stamped in their chart as their reason for admission.
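For the arithmetic-minded, here's a quick back-of-the-envelope check using the figures quoted above (42,000 coded hernias, 36% confirmed on imaging). The exact counts below are illustrative estimates derived from those rounded percentages, not the study's published tallies:

```python
# Sanity check on the UCLA hernia numbers described above.
# 42,000 and 36% come from the transcript; counts are illustrative.
coded_hernias = 42_000          # ICD-10 hernia codes identified
confirmed_rate = 0.36           # fraction confirmed on imaging review

confirmed = round(coded_hernias * confirmed_rate)   # codes backed by imaging
mismatched = coded_hernias - confirmed              # codes without support

print(f"Confirmed: {confirmed}")
print(f"Mismatched: {mismatched} ({mismatched / coded_hernias:.0%})")
```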
When Broad Labels Miss the Mark
If you think about clinical presentations that initially have a broad differential, such as abdominal pain, altered mental status, or fever, you have to wonder how many were given an initial broad label but then were later found to be ovarian torsion, cerebellitis, or sepsis, and the codes weren't changed. Or it could be bowel obstruction, which ends up being constipation.
I ran into a specific study looking at ICD-10 coding for functional seizure disorder, and the authors found only a 44% positive predictive value for that code. More than half of those individuals did not actually have functional seizure disorder.
Were patients taken off antiepileptics or sent for referrals they didn't need? Possibly. Beyond that, it points to a foundational problem with relying on ICD-10 diagnostic coding data.
Why These Errors Happen (and Why It's Not Just Clinicians' Fault)
I feel bad playing the blame game because I get it. It makes sense that a clinician codes for a disease or condition upon initial suspicion. We code for the problem we're considering, whether or not it ultimately turns out to exist.
Groin pain and swelling could be coded as a hernia that's later ruled out, but the code stays in the chart and can affect subsequent visits.
These codes were created for billing and tracking, not clinical accuracy, and I suspect that some EMR systems aren't even designed to flag changes.
Can AI Fix the Coding Gap? Early Evidence Suggests Yes
Maybe AI can help us out. In one study from the Mayo Clinic, researchers created 100 simulated nephrology cases, used ChatGPT, and found up to 99% accuracy in AI correctly assigning ICD-10 codes. Maybe there's a world where these natural language processing models can look through charts, be hyperfocused in a specific area, and improve ICD-10 accuracy. The codes will actually match the diagnosis.
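As a toy sketch of the idea (not the Mayo study's method, which used ChatGPT rather than rules), an automated pass might map note text to candidate ICD-10 codes and let the clinician confirm or reject them. The matching logic and the example phrases below are purely illustrative:

```python
# Toy ICD-10 suggester: simple keyword lookup over a clinical note.
# The codes are real ICD-10-CM codes, but the rule-based matching is
# only an illustration of the concept, not a production NLP system.
KEYWORD_TO_CODE = {
    "inguinal hernia": "K40.90",  # inguinal hernia, not specified as recurrent
    "ventral hernia": "K43.9",    # ventral hernia without obstruction
    "cough": "R05.9",             # cough, unspecified
}

def suggest_codes(note: str) -> list[str]:
    """Return candidate ICD-10 codes for phrases found in the note."""
    note_lower = note.lower()
    return [code for phrase, code in KEYWORD_TO_CODE.items()
            if phrase in note_lower]

print(suggest_codes("Reducible ventral hernia noted on exam; mild cough."))
# → ['K43.9', 'R05.9']
```

A real system would need negation handling ("no evidence of hernia"), context, and clinician review, which is exactly why the human stays in the loop.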
Now, obviously, we have a responsibility here as well. We can't just relinquish all the coding to our computer AI overlords, but hey -- it's a potential solution and a step in the right direction.
Action Steps for Better Coding
In the end, there's some work that we as clinicians can do.
There are a few steps:
* We can try to avoid the anchoring bias with our initial ICD-10 codes.
* We can do our own chart audits.
* We can try to code with more precision, as tempting as "headache, not otherwise specified" is.
* We can educate trainees.
* All while pushing for more clinically trained AI.
I think we've proven that ICD-10 coding should not be the gold standard for administrative diagnostic data, so there's work to be done.
I want to hear what you all think. Does anyone out there have specific innovative ways to address this? Have you ever actually stopped to think about it? I'm curious to also hear what specific institutions have done. Comment below.
Alok S. Patel, MD, is a pediatric hospitalist, television producer, media contributor, and digital health enthusiast. He is a clinical assistant professor for the department of pediatrics at Stanford Children's Health in Palo Alto, California. Patel is a special correspondent for ABC News and regularly appears as an on-camera expert for several news outlets. He hosts The Hospitalist Retort video blog on Medscape.