Building a Pandemic Insight Engine

By Gerry Miller, CEO, Cloudticity
Twitter: @Cloudticity

5 keys for a unified technology deployment strategy in a public health emergency

The fragmented nature of health care and public health within the U.S. hasn’t helped pandemic response. With the winter season approaching, the situation is only getting worse.

Combatting a new and highly contagious disease presents a wide range of challenging problems. But in many ways, our tools only exacerbate the issue.

Health information technology—and our approach to utilizing it—is holding us back from meeting the challenge effectively.

In the U.S., sharing or transferring data between health systems is clunky at best, largely due to a patchwork system of medical record technologies that don’t always speak the same language. As a result, health authorities don’t have ready access to complete and accurate information. As one national science reporter recently pointed out, America “has a Covid-19 data problem.”

We can better manage pandemic preparation, response, and recovery measures by making better use of data to inform action. And cloud technology lets us allocate nearly unlimited storage in minutes, spin up a virtual supercomputer anytime, and process even the most unpredictable amounts of data.

Developing unified strategies for streamlining data and deploying technology in a public health emergency can create a kind of pandemic insight engine. There are five keys to making it work:

1) Coordination
This first step sounds the most platitudinous, but it is actually the most essential and the most difficult. Coordinating among many agencies and data sources in the health IT arena comes with unique complexities, regulatory constraints, patient privacy safeguards, political sensitivities, and a pervasive reluctance to make quick changes.

Caution rightly rules where real matters of life and death are involved.

Still, the ability to combine information from a critical mass of participating sources is key to generating an operative pool of data to feed insight. That requires active involvement from many constituents who don’t like to share their data.

Getting the necessary organizations to engage and align is no easy task given their divergent areas of focus and competing agendas. Those organizations include hospitals, health information exchanges (HIEs), state agencies and the IT consultants that service those agencies, all the way up to governors’ offices.

But the number one goal, before anything technical can be attempted, is getting everybody pointed in the same direction and on the same page.

2) Data rationalization
The second key is to quickly begin flowing data into a consolidated data hub or “source of truth.” And those data need to be cleaned up and made consistent.

Just like the healthcare industry as a whole, healthcare data are extremely complicated and fragmented. There are standard formats such as HL7v2 and C-CDAs, but they are not necessarily used the same way by everyone, or even consistently within the same organization. And there are different code sets. For example, a positive COVID-19 diagnosis might be recorded with the ICD-10 code U07.1, while the corresponding code in SNOMED, a different coding system, might be 840539006.
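To make the code-set problem concrete, here is a minimal Python sketch of a crosswalk that maps both codes to a single internal label. The table contents beyond the two codes named above, the label covid19_confirmed, and the function name normalize_code are illustrative assumptions; a real data hub would lean on a maintained terminology mapping rather than a hand-built dictionary.

```python
from typing import Optional

# Hypothetical crosswalk: (coding system, code) -> one internal concept label.
# Only the two codes mentioned in the text are included here.
CROSSWALK = {
    ("ICD-10", "U07.1"): "covid19_confirmed",
    ("SNOMED", "840539006"): "covid19_confirmed",
}


def normalize_code(system: str, code: str) -> Optional[str]:
    """Return the internal label for a coded entry, or None if it is unmapped."""
    key = (system.strip().upper(), code.strip().upper())
    return CROSSWALK.get(key)


if __name__ == "__main__":
    # Two feeds reporting the same diagnosis in different coding systems.
    print(normalize_code("ICD-10", "U07.1"))
    print(normalize_code("snomed", "840539006"))
```

Once both feeds resolve to the same label, records from different sources can be counted and correlated together.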

This complexity is compounded by standard data collection errors and deviations: one hospital might record a patient as “Gerald,” and another might record the same patient on a different occasion as “Jerry.” How do we know that they’re the same person?
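One common way to approach the question is to normalize names before comparing records. The Python sketch below is a simplified illustration, assuming a tiny hand-made nickname table and an exact match on last name and birth date; the record fields, the sample patient, and the function names are all hypothetical, and real patient matching typically uses far richer, probabilistic methods.

```python
from dataclasses import dataclass

# Hypothetical nickname table; a real one would be much larger and curated.
NICKNAMES = {"jerry": "gerald", "gerry": "gerald"}


@dataclass(frozen=True)
class PatientRecord:
    first_name: str
    last_name: str
    birth_date: str  # ISO 8601 date string, e.g. "1960-05-17"


def canonical_first_name(name: str) -> str:
    """Lower-case the name and replace a known nickname with its formal form."""
    cleaned = name.strip().lower()
    return NICKNAMES.get(cleaned, cleaned)


def likely_same_person(a: PatientRecord, b: PatientRecord) -> bool:
    """Naive rule: same canonical first name, same last name, same birth date."""
    return (
        canonical_first_name(a.first_name) == canonical_first_name(b.first_name)
        and a.last_name.strip().lower() == b.last_name.strip().lower()
        and a.birth_date == b.birth_date
    )


if __name__ == "__main__":
    # Made-up records for illustration only.
    hospital_a = PatientRecord("Gerald", "Example", "1960-05-17")
    hospital_b = PatientRecord("Jerry", "Example", "1960-05-17")
    print(likely_same_person(hospital_a, hospital_b))
```

A rule this strict will miss matches when a birth date is mistyped, which is one reason cleansed data from an HIE is so valuable.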

Rationalizing and consolidating a tremendous amount of data into a system that can accept such large-scale flow is key, as is putting it into a format that can be correlated.

It would be infeasible to go to hundreds of hospitals and clinics across, say, the state of California and get each one of them to send pre-consolidated and rationalized data. Luckily, most states have health information exchanges (HIEs) that address this issue. A handful of HIEs can provide cleansed data from thousands of sources.

Undoubtedly, there will still be some data wrangling required; for example, patient identifiers can vary even across HIEs. But formulating a few API calls for matching is trivial compared to rationalizing streams upon streams of raw data from scores of health sources.

Correlating the diversity of cleansed data provided by HIEs also dramatically lowers the number of point-to-point data connections required with individual providers. And if coupled with cloud-based services for normalization and de-duplication of health data, it vastly raises the bar for data quality and establishes reliable datasets.

With scalable cloud infrastructure, clean streams of data can be flowing into an operational data lake in a matter of days.

3) Useful intelligence
The third key is to then draw useful intelligence out of those rationalized datasets—for a wide array of targets. What defines “useful” amongst these targets will vary.

Clinicians, for example, might be looking for comorbidities (high blood pressure, diabetes, etc.) and demographics as predictors of outcomes. Such broad insight is invaluable for pandemic strategy, as mortality rates have been markedly higher among vulnerable populations.

A hospital administrator, on the other hand, might simply want to know how many patients are likely to show up in the ER in the next 30 days and whether the current inventory of ventilators and ICU beds is going to suffice. Such specific insight on capacity planning and supply chain is invaluable for operational function, particularly in surge conditions.
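The administrator’s question can be framed as a simple occupancy check. The Python sketch below assumes a forecast of daily ICU admissions already exists and that every patient stays a fixed number of days; both are strong simplifications, and the function names and sample numbers are made up for illustration.

```python
from typing import Sequence


def peak_occupancy(daily_admissions: Sequence[int], stay_days: int) -> int:
    """Peak beds in use, assuming each patient stays exactly stay_days days."""
    if stay_days < 1:
        raise ValueError("stay_days must be at least 1")
    peak = 0
    for day in range(len(daily_admissions)):
        # Patients admitted within the last stay_days days are still in a bed.
        start = max(0, day - stay_days + 1)
        in_beds = sum(daily_admissions[start:day + 1])
        peak = max(peak, in_beds)
    return peak


def beds_suffice(daily_admissions: Sequence[int], stay_days: int,
                 beds_available: int) -> bool:
    """True if the forecast peak fits within the available beds."""
    return peak_occupancy(daily_admissions, stay_days) <= beds_available


if __name__ == "__main__":
    forecast = [2, 3, 4, 1]  # made-up admissions per day
    print(peak_occupancy(forecast, stay_days=2))
    print(beds_suffice(forecast, stay_days=2, beds_available=6))
```

The same check could be run for ventilators or any other constrained resource by swapping in the relevant forecast and inventory.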

Both targets illustrate critical insights that can be drawn from sets within the same engine powered by coordinated and rationalized health data.

4) Prevention mechanisms
Moving forward, enhancing pandemic community management and capacity planning will require feeding more information into the systems that help prevent spread. The key to better COVID-19 information is contact tracing.

With a highly infectious disease, contact tracing requires individuals to give up a little bit of ordinarily private information for the greater good. People are expected to take some degree of responsibility by sharing information that can help stem the tide, prevent exponential spread, and reduce needless deaths.

There are two types of contact tracing: manual and digital.

Most contact tracing efforts thus far in the U.S. are manual. If you get a lab test that’s positive for COVID-19, it triggers a workflow where a designated public health department employee or volunteer will pick up the phone and call you. They’ll explain self-quarantine directives, and have you fill out a form or verbally guide you through listing every person you’ve been in contact with for the past few weeks (without probing for any extraneous details about you and your personal life).

Identified individuals with whom you’ve had contact will then get a call warning them of possible exposure and recommending a test, and the process repeats throughout the network of potential exposure. Information will be recorded in various health applications and databases, but the collection of information is usually via a form or a person-to-person conversation.

The process is covered under HIPAA laws, and the information obtained is regulated for patient privacy protection—your private details are not going to get published in the newspaper from a manual contact tracing interview.

The second type of contact tracing is digital, and generally based on information automatically obtained from an application on individual mobile phones.

Some countries, such as South Korea, mandate digital contact tracing applications in addition to manual tracing. As people move around, records of location and proximity to others are collected and stored via their phones. If an individual tests positive, digital tracing applications automatically access those records and identify mobile phone users whose location and proximity data indicate possible exposure. Those users then get a push notification.
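The lookup step can be sketched in a few lines of Python. This is a deliberately simplified, centralized illustration and not a description of how any particular national app works: the Encounter record, the user identifiers, the 14-day lookback window, and the function name exposed_users are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Set


@dataclass(frozen=True)
class Encounter:
    """One logged proximity event between two phones (hypothetical schema)."""
    user_a: str
    user_b: str
    day: date


def exposed_users(encounters: Iterable[Encounter], positive_user: str,
                  test_date: date, lookback_days: int = 14) -> Set[str]:
    """Users who were near positive_user within the lookback window."""
    window_start = test_date - timedelta(days=lookback_days)
    exposed = set()
    for e in encounters:
        if not (window_start <= e.day <= test_date):
            continue  # encounter falls outside the assumed window
        if e.user_a == positive_user:
            exposed.add(e.user_b)
        elif e.user_b == positive_user:
            exposed.add(e.user_a)
    return exposed


if __name__ == "__main__":
    log = [
        Encounter("u1", "u2", date(2020, 10, 1)),
        Encounter("u3", "u1", date(2020, 10, 5)),
        Encounter("u1", "u4", date(2020, 9, 1)),  # too old to count
    ]
    # The returned users are the ones who would receive a notification.
    print(sorted(exposed_users(log, "u1", date(2020, 10, 10))))
```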

In the U.S., there are deep-seated individual privacy and autonomy concerns around such government-run digital tracing measures, and varying levels of opposition to mandates. But we’ll eventually get some degree of digital tracing capability through opt-in measures facilitated by industry and led by state or regional health authorities.

Front runners include mobile apps like those supported by the Google-Apple Exposure Notifications System API, which fortifies privacy protections and user control. Such applications rely on advertising and education campaigns to try to get people to enable COVID tracing capabilities on their phones. The choice of whether to participate is left to individuals. But the benefit is that if you were exposed, you’ll be notified quickly and automatically on your phone.

There are additional benefits to feeding such information into an insight engine. As recently outlined on NPR’s Morning Edition: “At scale, the data gathered…offers vital information about where transmission is happening in a community. That data can drive policy and even guide individuals in assessing what’s more or less safe to go out and do.”

5) Pandemic postmortem
At some point, this pandemic will end or come under control.

When that time comes, we’d do well to review and address how this all unfolded—before the next pandemic. What did we do wrong? What did we miss? Where were the gaps or missed opportunities in our health technologies and practices?

There’s already been at least one high-profile Public Health Tech Initiative working group launched (with participation by Microsoft, Facebook, CVS Health, and many others) to “study how technology has been deployed—both successfully and unsuccessfully—during the current crisis and develop a set of recommendations for future public health emergencies.” Over time, there will be many more similar efforts.

Pandemic insight engines should not be abandoned once this threat abates. We need to learn from this experience. We can continue collecting and cleansing important health data, and continue using and improving coordinated technology to get ahead of the curve before the next COVID arises.