ETL Challenges within Healthcare Business Intelligence

Developing Your ETL (Extract-Transform-Load) Solution

By Brad Benson, CEO Health eFormatics
Twitter: @EMRConversion

Healthcare delivery reforms are putting pressure on healthcare organizations to reduce costs and improve quality and efficiency. ACOs need to reduce readmissions and demonstrate improvements in quality standards in order to avoid CMS penalties. Providers need to better understand performance metrics in order to predict future outcomes. The single most effective way to achieve these results is through business intelligence (BI) and predictive analytics.

However, much of the data required for big data analysis resides in disparate, non-integrated electronic systems. Turning this data into actionable intelligence takes a holistic view of the data, which oftentimes requires ETL (Extract-Transform-Load) to consolidate healthcare data into a data warehouse. This paper explores the challenges and risks involved with ETL, and the best practices to follow when developing your ETL solution.

Data Consolidation Strategy

The key to fully leveraging your BI investment is having a strategic plan in place for bringing the disparate data from isolated systems into a consolidated view. A partial view of your data will not reveal the full story the data has to tell. Bringing all of your financial, clinical, and claims data into a single view allows you to create the dashboards and scorecards you need to turn your data into meaningful information that you can act on. The place to develop this data consolidation strategy is within the ETL process, also known as the Data Delivery process.

Challenges with Data Consolidation

Bringing your data together through an ETL process can often raise unforeseen challenges, and knowing how to deal with them effectively could be the difference between a successful BI implementation and a failed one. Chief among the challenges is not only delivering the data from numerous systems to a consolidated view, but also bringing the data together in a meaningful way.

The key ETL challenges involved with Healthcare BI include:

  • Difficulty in accessing data from numerous systems
  • Quality of the data within the systems
  • Inconsistency of the data across systems

In order for a BI solution to provide meaningful, actionable intelligence, all three of these challenges need to be overcome and addressed within your ETL strategy.

Challenge #1: Difficulty in Accessing Data

An ACO may have 20 or more EHR systems being used throughout the organization. Some may be in-house systems, whereas others may be hosted by the EHR vendor. One challenge is that most organizations do not have sufficient knowledge of the backend databases of each of these EHRs to bring the data together. Because this in-house knowledge is lacking, one solution is to outsource this effort to EHR conversion companies that do have knowledge of most of the various EHR systems on the market.

Also, not all EHR vendors are willing to share the data and allow the data to be extracted from the EHR system, or they may not be willing to provide it at a reasonable price. This is really unfortunate, but it does happen, so organizations need to think about how they will handle these situations when they occur. Organizations need to use whatever leverage they can, including legal action if necessary, to ensure their data is made available to their reporting environment.

Not only do ACOs need to consolidate EHR data, but they need to bring their financial, claims, and operational data together as well in order to leverage BI to its full potential for the organization. All three of these types of information will invariably come from different sources which can be either internal or external to the organization. Most electronic systems have extraction capabilities, but there are many different formats that the data may be received in. Your ETL process needs to be robust enough to deal with all of the potential formats.
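One way to cope with the many formats mentioned above is to route each incoming extract to a reader registered for its format, so that every source ends up as the same kind of record. The following is only a minimal sketch: the format labels ("csv", "pipe", "json"), the function names, and the sample fields are assumptions made for illustration, and real feeds such as HL7 messages or fixed-width files would need their own readers.

```python
import csv
import io
import json


def read_csv(text, delimiter=","):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


# Hypothetical format labels; a real feed would need its own detection logic.
READERS = {
    "csv": read_csv,
    "pipe": lambda text: read_csv(text, delimiter="|"),
    "json": read_json,
}


def extract(text, fmt):
    """Route a raw extract to the reader registered for its format."""
    try:
        reader = READERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return reader(text)


# Two made-up extracts in different formats yield the same record shape.
claims = extract("patient_id|amount\n1001|250.00\n", "pipe")
billing = extract('[{"patient_id": "1001", "amount": "75.50"}]', "json")
```

Supporting a new format then means adding one reader to the registry rather than changing the rest of the pipeline.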

Challenge #2: Data Quality

Even if you are able to access and extract all of the data from the disparate systems, the quality of that data may not be what you thought it was. Oftentimes, first-generation EHRs that appeared to be capturing discrete coded data were actually nothing more than glorified word processors, storing clinical data non-discretely within documents, making that data unusable for reporting purposes.

Also, upon closer inspection of the data, the organization may realize that a lot of the data being entered into the system was not being stored as codified values, but merely as free-text information. Mapping those free-text values across systems so that they can be reported consistently and represented meaningfully across an entire organizational view could be an arduous, resource-intensive process.

Your ETL strategy should do what it can to turn bad data into good data. This is not always easy, but several techniques can help, including algorithmic data mapping and natural language processing, which turn this data into valuable information that can and should be included as part of your BI solution.
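As a rough illustration of what algorithmic data mapping can look like, the sketch below normalizes a free-text entry, looks it up in a synonym table, and falls back to approximate string matching for misspellings. The codes (DX_HTN, DX_T2DM), the synonym list, and the 0.85 cutoff are hypothetical values chosen for the example and do not come from any real terminology; a production mapping would be built and reviewed against the organization's chosen code sets.

```python
import difflib
import re

# Hypothetical internal codes, not taken from any real terminology.
SYNONYMS = {
    "hypertension": "DX_HTN",
    "high blood pressure": "DX_HTN",
    "htn": "DX_HTN",
    "type 2 diabetes": "DX_T2DM",
    "diabetes mellitus type 2": "DX_T2DM",
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def map_free_text(text, cutoff=0.85):
    """Return (code, method); code is None when no confident match exists."""
    key = normalize(text)
    if key in SYNONYMS:
        return SYNONYMS[key], "exact"
    # Fall back to approximate matching to catch simple misspellings.
    close = difflib.get_close_matches(key, list(SYNONYMS), n=1, cutoff=cutoff)
    if close:
        return SYNONYMS[close[0]], "fuzzy"
    return None, "unmapped"
```

Entries that come back as "unmapped" can be queued for manual review, and the reviewed results can be added to the synonym table so the mapping improves over time.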

Challenge #3: Inconsistencies in the Data

Physicians and nurses will oftentimes document in an EHR in very different ways. Because EHRs allow similar types of information to be stored in multiple areas within the system, a challenge occurs when an organization looks to report on this information. What an EHR system gains in flexibility when entering information, it loses in consistency when that data needs to be extracted or reported against. The ETL strategy needs to reconcile these inconsistencies and turn inconsistent data into meaningful information.
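When the same fact can be documented in several areas of one EHR, a simple reconciliation rule is to rank those areas by trust, take the first populated value, and flag any disagreement for review. The sketch below assumes three hypothetical locations for smoking status; the location names and their ordering are illustrative only.

```python
# Hypothetical locations where one EHR might hold smoking status,
# listed from most to least trusted.
PRIORITY = ["social_history_field", "intake_form", "nursing_note"]


def reconcile(record, priority=PRIORITY):
    """Pick the first populated value by priority and flag disagreements."""
    found = {}
    for location in priority:
        value = (record.get(location) or "").strip().lower()
        if value:
            found[location] = value
    if not found:
        return {"value": None, "source": None, "conflict": False}
    # Dicts keep insertion order, so the first key is the most trusted source.
    source = next(iter(found))
    return {
        "value": found[source],
        "source": source,
        "conflict": len(set(found.values())) > 1,
    }


result = reconcile({"intake_form": "Former ", "nursing_note": "current"})
```

Keeping the source and the conflict flag alongside the chosen value lets report consumers see where a figure came from and which records may need clinical review.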

Also, when bringing data together from various EHR systems, there will undoubtedly be vast differences in the information across the EHRs. Not only will each call the same data elements something completely different, but they will also each store information that may or may not have an equivalent in the other EHRs. No two EHRs are the same, but the job of the ETL strategy is to make the data across them as similar as possible in a unified view so that meaningful information can be derived from these vastly different EHR systems.
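The naming differences described above are often handled with a per-system field map that translates each source's element names into one shared schema. The sketch below is a minimal version of that idea; the system labels ("ehr_a", "ehr_b"), the field names, and the unified schema are all hypothetical. Elements with no equivalent in a given source are left as None, and source fields with no mapping are returned separately rather than silently dropped.

```python
# Hypothetical source systems and field names, for illustration only.
FIELD_MAPS = {
    "ehr_a": {"pt_id": "patient_id", "dob": "birth_date", "bp_sys": "systolic_bp"},
    "ehr_b": {"PatientNumber": "patient_id", "BirthDate": "birth_date"},
}
UNIFIED_FIELDS = ["patient_id", "birth_date", "systolic_bp"]


def to_unified(source, record):
    """Rename a source record's fields into the unified schema.

    Returns (unified, unmapped): unified has every schema field (None when
    the source has no equivalent), unmapped holds fields with no mapping.
    """
    mapping = FIELD_MAPS[source]
    unified = {field: None for field in UNIFIED_FIELDS}
    unmapped = {}
    for name, value in record.items():
        target = mapping.get(name)
        if target is None:
            unmapped[name] = value
        else:
            unified[target] = value
    unified["source_system"] = source
    return unified, unmapped


row, leftovers = to_unified(
    "ehr_b",
    {"PatientNumber": "77", "BirthDate": "1960-02-01", "Allergy": "penicillin"},
)
```

Reviewing the unmapped fields periodically is one way to decide whether the unified schema should grow to cover information that only some EHRs capture.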


The first step in overcoming these challenges is simply knowing that they exist, and realizing that you will likely encounter several of them when creating your ETL solution. Making sure that they are not overlooked and are included as part of your ETL strategy will put you in a position to fully leverage the immense power of BI and predictive analytics within your organization.

About the Author:  Brad Benson is the CEO of Health eFormatics, a company that specializes in EMR/EHR data conversions.