Why AI Coding Will Keep Failing Until Someone Fixes the Clinical Input Problem

By Nathan Turock, CEO, VerifyMedCodes
LinkedIn: Nathan Turock, MHA, BSBM, PCHA
LinkedIn: VerifyMedCodes.com

The revenue cycle management industry has spent a decade and billions of dollars building better billing engines, smarter clearinghouses, more sophisticated denial management tools, and most recently autonomous AI coders. And yet, 450 million claims are still denied every year in the United States. $125 billion in revenue is still lost annually to poor documentation. 80% of medical bills still contain errors. The problem is not getting better. It is getting more expensive to manage.

The reason is straightforward, and almost nobody is talking about it: we have been solving the wrong problem.

The Layer Nobody Built

Every RCM platform in the US operates on a foundational assumption that the clinical note entering the pipeline accurately and completely represents the care that was delivered. That assumption is systematically wrong.

Clinical notes are unstructured narrative documents. A progress note for a 72-year-old male with CHF, diabetes, and CKD might document dyspnea in the HPI, reference a BNP of 640 in the labs section, list “CHF exacerbation, stable COPD” in the assessment, and increase furosemide in the plan across four separate sections, using abbreviations, with no linkage between the clinical evidence and the diagnoses, and with a potential assertion conflict buried between the HPI and assessment that no billing system catches before the claim goes out.

What happens to that note today? It enters the RCM pipeline as a raw narrative. A human coder or increasingly, an AI coder interprets it as best they can. Dyspnea gets billed alongside CHF even though it is explained by CHF and should be suppressed. The HCC category collapses because CKD stage 3 was documented but not extracted at the specificity required. The assertion conflict between “exacerbation” in the HPI and “stable” in the assessment produces an indefensible code. The claim goes out, gets denied, and costs $30 to rework.

Multiply this across 1.8 billion clinical notes per year, and the math is not complicated.

What Is Missing and Why It Has Not Been Built

The market has built tools at every other layer of the revenue cycle workflow. Ambient scribes convert physician-patient conversations into clinical notes. Autonomous coders assign ICD-10 codes to finished notes. Denial management tools fight rejected claims. Revenue analytics platforms track payer behavior patterns. None of them address the extraction layer: the deterministic conversion of finished narrative documentation into verified, structured, billing-safe clinical state before coding begins.

This layer has not been built commercially for three reasons:

It is genuinely hard. Building a deterministic rules engine that handles 574,000 medical concept mappings, 851 symptom suppression relationships, 8,000+ HCC preservation guard mappings, and 50+ ICD family specificity resolvers takes years of domain-specific engineering. You cannot assemble it from a large language model and a public SNOMED dataset.
The market has been distracted by the coding layer. Autonomous coding companies have raised hundreds of millions of dollars to replace the coder. Nobody has raised money to fix what the coder receives.
PHI creates a false barrier. Most approaches to note processing assume PHI access is required. It is not, if you design the architecture correctly from the start.

The Technical Requirements

What does a proper clinical extraction layer actually require? Based on the failure patterns that drive denials, a production-grade extraction layer needs to handle at minimum:

Assertion detection: identifying whether a diagnosis is present, absent, historical, or hedged within the clinical narrative
Symptom suppression: recognizing when a symptom is explained by a higher-order diagnosis and should not be billed independently
HCC preservation guard: ensuring that specificity drift does not collapse a correctly documented HCC category to an unspecified code
Evidence tethering: linking every extracted code to the exact sentence, lab value, or clinical element in the note that supports it
Assertion conflict resolution: detecting contradictions across note sections before they produce indefensible submitted codes
PHI-free architecture: processing clinical language patterns without transmitting patient-identifying information
Deterministic reproducibility: the same note must produce the same output every time, without sampling variance

The last requirement is the most important and the most underappreciated. Every AI coding system on the market today is probabilistic. It samples from a probability distribution. Run the same note through an AI coder twice and you may get two different codes. In a CMS audit, that is not a defensible architecture. Deterministic rules-based extraction produces the same output for the same input, every time. That is auditable. That is defensible. That is what clinical documentation integrity actually requires.

What This Means for Revenue Cycle Leaders

The practical implications for RCM firm leadership, VP Revenue Cycle executives, and health system CFOs are significant:

Every denied claim that traces back to documentation, including assertion conflicts, overcoded symptoms, specificity failures, and HCC collapse, is preventable upstream. The appeal cost, rework cost, and payment delay are all downstream consequences of a solvable upstream problem.
AI coding accuracy is bounded by input quality. A probabilistic coder receiving structured, validated, evidence-linked clinical state outperforms the same coder receiving raw narrative. The extraction layer improves every downstream tool in the pipeline.
PHI-free architecture changes the compliance calculus entirely. A clinical extraction layer that processes de-identified clinical patterns, not patient records, shortens legal review cycles from months to days and eliminates the HIPAA compliance surface that makes competing tools difficult to deploy.
The audit trail is the new competitive differentiator. As CMS and commercial payers increase scrutiny of AI-generated codes, the ability to show exactly which sentence in the note supported each submitted code is not a nice-to-have. It is a compliance requirement.

What a Fully Staffed Upstream Layer Produces

The financial impact of stabilizing clinical input before billing begins is measurable and significant. For a medium-size RCM firm processing 1 million notes per year at $180 average reimbursement:

The Moment the Market Is At

Waystar’s $1.25 billion acquisition of Iodine Software in October 2025 confirmed that the market pays premium valuations for clinical documentation intelligence. Iodine’s AI-powered CDI platform serves 1,000 enterprise hospitals and health systems. It is an impressive product. It also requires PHI, requires EHR integration, and cannot serve the 1.4 million independent providers and small group practices that represent the majority of the US physician workforce.

The market is converging on the extraction problem from multiple directions. Ambient scribes are building “contextual reasoning engines” that produce billing-aware notes. Autonomous coders are seeking better input quality. RCM platforms are adding documentation intelligence modules. None of them have built the PHI-free, deterministic, SNOMED-mapped, audit-ready extraction layer that sits upstream of all of them.

That layer is the missing piece. The RCM industry built everything around it. We built the missing layer.