Buy or Build? Determining the Right Approach to NLP Adoption in Healthcare

By Calum Yacoubian, MD, Associate Director NLP Healthcare Strategy, Linguamatics, an IQVIA company
Twitter: @Linguamatics

The exponential growth in healthcare data in recent years has presented the industry with an ultimatum – adopt effective technologies to harness and make sense of growing mountains of information or get left behind.

Vendors across the broad domain of healthcare IT, from care management to clinical documentation improvement, are increasingly looking to improve their offerings and customer experience by adopting natural language processing (NLP) techniques to tackle unstructured data.

NLP enables the large-scale mining of text through its ability to “read” natural language. It simulates the human ability to understand the nuances of written language, enabling the analysis of huge amounts of text-based data without fatigue in a consistent, unbiased manner.

Effective NLP goes beyond identifying words and entities: it gives computers the ability to read, understand and interpret the human language in which text is written. It takes the synchronization of many moving parts and techniques for NLP to accurately make sense of the text it is fed, but when it works properly, it can be truly transformative.

As healthcare IT vendors look to incorporate NLP into their platforms, they are confronted with the decision of how best to advance their NLP capabilities: build something themselves or partner with an outside organization that is already proven in the space.

Understanding the problem
Before deciding on the best approach for adding technology that addresses clients’ needs, health IT leaders must first understand the scope of healthcare’s data problems – and the opportunities for addressing those challenges.

The scale of healthcare data is staggering. Healthcare organizations have seen an explosive health data growth rate of 878% since 2016, reaching 8.41 petabytes (PB) on average, according to Dell EMC. Approximately 30% of the world’s data volume is generated by the healthcare industry, and its rate of growth outpaces that of other industries such as manufacturing, financial services, and media & entertainment, according to RBC Capital Markets.

On average, a hospital produces 50 petabytes of data every year, according to the World Economic Forum. To get an idea of the scale, consider that one petabyte of data is equal to 11,000 4K movies, which illustrates just how big “big data” is within healthcare.
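As a rough back-of-the-envelope check, the two figures quoted above (50 petabytes per hospital per year, and 11,000 4K movies per petabyte) can simply be multiplied together. The short Python sketch below does only that; the constant and function names are illustrative and do not come from any of the cited sources.

```python
# Figures as quoted in the article (World Economic Forum figure and the
# 4K-movie comparison); the names below are illustrative only.
PETABYTES_PER_HOSPITAL_PER_YEAR = 50
MOVIES_4K_PER_PETABYTE = 11_000


def movies_equivalent(petabytes, movies_per_petabyte=MOVIES_4K_PER_PETABYTE):
    """Convert a data volume in petabytes into an equivalent count of 4K movies."""
    return petabytes * movies_per_petabyte


yearly_movies = movies_equivalent(PETABYTES_PER_HOSPITAL_PER_YEAR)
print(f"{yearly_movies:,} 4K movies per hospital per year")  # 550,000
```

On those figures, a single hospital's annual data volume is comparable to about 550,000 4K movies.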

Compounding the problem, healthcare data is not only big but also messy: 80% of it is believed to be unstructured. To complicate matters further, within an average healthcare system the data resides in multiple disparate systems. Together, these three factors (volume, lack of structure and fragmentation) go a long way toward explaining why the majority of healthcare data is never reused after its initial creation.

To derive value from this complex pool of data and enable its reuse, the question for healthcare organizations is not whether they need NLP, but how best to deploy it.

Option 1: Build it yourself
Health IT vendors looking to start from scratch with NLP often first consider available options on the open-source market, which include a wide range of components to help build NLP tools. However, vendors taking this approach also require expertise from NLP specialists and linguists.

Assuming those pieces are in place, the primary benefit of the build-it-yourself approach is having complete flexibility to create something that is perfectly suited to the organization’s specific needs. Additionally, the solution can be deployed on premises, to mitigate concerns around data sharing. Though support may be limited, some does exist in the open-source community. And given that the solution is built around open-source components, software costs are low.

However, while software costs are low, the cost of employing the personnel needed to build the NLP pipelines is high. Further, with full control comes full responsibility if problems arise. There is nothing resembling the first-line support or product enhancement schedule that comes with established partners. When it breaks, the buck stops with you, which can be challenging given the other workflows and solutions most healthcare organizations must maintain.

Option 2: Partner with an established healthcare NLP player
The proliferation of healthcare data in recent years has likewise spurred growth in companies that provide NLP, including startups and established players, so there are lots of options to consider for health IT vendors that prefer to partner. With this approach, the obvious benefit is obtaining a fully developed solution that is ready to go, and often has been created to solve a specific problem.

If that problem is the one that the organization is looking to address, an out-of-the-box solution may provide an opportunity to solve it quickly. The partner owns the product and delivers first-line support, removing the need for in-house NLP expertise to get results.

In this scenario, the downside is less transparency: the solution is often essentially a black box, giving health IT vendors little ability to truly understand the “what” and “why” behind what’s happening inside. Additionally, these plug-and-play tools may be ill-suited for use cases beyond those for which they were designed. Finally, costs may be significant. For instance, where the NLP is charged on a “per unit” or volume basis, repeated processing of hundreds of thousands or millions of documents can quickly become cost-prohibitive.
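To see how volume-based pricing compounds, consider a minimal Python sketch. Every number in it is a hypothetical assumption chosen purely for illustration (the per-document price, the document count and the number of reprocessing passes); none reflects any real vendor’s pricing.

```python
def total_cost_cents(documents, price_cents_per_document, passes=1):
    """Total cost, in cents, of processing every document `passes` times.

    Integer cents are used so the arithmetic stays exact.
    """
    return documents * price_cents_per_document * passes


# Hypothetical inputs, for illustration only (not real vendor pricing).
DOCUMENTS = 1_000_000          # assumed corpus size
PRICE_CENTS_PER_DOCUMENT = 5   # assumed price of $0.05 per document
PASSES = 4                     # assumed number of times the corpus is reprocessed

single_pass = total_cost_cents(DOCUMENTS, PRICE_CENTS_PER_DOCUMENT)
repeated = total_cost_cents(DOCUMENTS, PRICE_CENTS_PER_DOCUMENT, PASSES)

print(f"One pass:    ${single_pass / 100:,.2f}")
print(f"Four passes: ${repeated / 100:,.2f}")
```

Under these assumed numbers, a single pass costs $50,000 and four passes cost $200,000. The cost scales linearly with both document volume and the number of times the corpus is reprocessed, which is why re-running queries over a large archive can add up quickly.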

Is it possible to have the “best of both worlds”?
There is an option that combines the best of the build and buy approaches: flexibility and transparency with reliability and robustness. In this hybrid approach, health IT vendors leverage an established NLP expert’s platform, taking advantage of its already proven NLP pipelines, managed roadmap and support, and use that platform to build what matters to them.

By combining their subject matter expertise with a solution that is flexible enough to enable tuning and configuration to their specific enterprise needs, healthcare organizations can focus on solving their customers’ problems, rather than building a new NLP solution. They can do so with the reassurance that the NLP solution is “open box” and auditable – giving them confidence in the results.

The decision to adopt NLP tools may be easy for healthcare IT organizations seeking to manage escalating volumes of data. Healthcare vendors wanting to incorporate NLP into their platforms must thoughtfully evaluate available internal resources and market options to determine whether a build, buy, or hybrid approach best suits their needs and the needs of their customers.