In Defense of Dirty Data in Healthcare

By Steve Spearman, Founder and Chief Security Consultant for Health Security Solutions
Twitter: @HIPAASolutions
Host of HIPAA Chat – Register for Aug. 21 Event

Conventional wisdom says that if you’re trying to work with a patient’s health care analytics, you’ve got to use “clean data”. For those who are unfamiliar with the term, “clean data” is personal data your patient has generated on social media, but with all of the irrelevant or contradicting data removed in an attempt to get the most accurate health information possible.

But in this article, John Showalter, MD, chief health information officer at the University of Mississippi Medical Center, speaks up in defense of dirty data in healthcare.

But why would a chief health information officer say that dirty data is better than clean data? Remember that “clean data” is actually code for “edited” data: data with all of the irrelevant and contradicting information removed. The problem is deciding what is relevant data and what isn’t.

For example, suppose you have a patient who regularly tweets about sitting around the house, but also retweets exercise tips almost every week. That could mean that the patient wants to exercise, but has a hard time making a lasting commitment. Clean data would say that she leads a sedentary life, and disregard the exercise tips as a minor contradiction to the data.

But dirty data would consider the exercise tips a relevant part of the data. For example, maybe she would exercise more regularly if she found an exercise regimen that she enjoys. Clean data would ignore the exercise tips, since they contradict the rest of the data.

The article itself goes into more detail about the benefits of using dirty data instead of clean data, but here are Showalter’s six main reasons for defending dirty data in healthcare. Dirty data lets healthcare organizations:

  1. improve the health of a population by improving the health of one individual.
  2. keep valuable data that they would lose if they “clean” it too well.
  3. get more detailed information on patients, even finding “geo-code” information, such as their home address, their work address, and whether they pass a pharmacy on their daily commute.
  4. share data with a local partner, such as a hospital, vendor, or clinic.
  5. use dirty data to supplement their EHRs.
  6. outsource their dirty data to gain high quality predictive analytics.

Read the original article to see how each of these six reasons has helped the University of Mississippi Medical Center put its dirty data in healthcare to good use.

This article was originally published on Health Security Solutions and is republished here with permission.

Steve Spearman hosts HIPAA Chat, a show produced by HITECH Answers airing on our Internet radio station, HealthcareNOWradio.com. Learn more about HIPAA Chat or download podcasts of the show. Find out more about attending the next taping of HIPAA Chat and ask your questions directly to Steve.