Power in Data: Value Against Privacy

By Matt Fisher, General Counsel, Carium
Twitter: @matt_r_fisher
Twitter: @cariumcares
Host of Healthcare de Jure – #HCdeJure

Data are the new currency. That saying has become an entrenched cliche for describing the way of the world nowadays, and it is especially true in healthcare. So many see value to be extracted, but what about considerations of privacy and the impact on individuals?

Pulling Back the Curtain

A recent expose from Stat News on the MarketScan database is throwing more light onto the issue. Reading the full article is highly recommended. As a brief summary, though, the database began to be compiled in the early 1980s with the aim of giving companies more insight into healthcare use to enable better control over healthcare costs. The database kept growing as the originating company brought on more customers, which ultimately resulted in incorporating information about roughly 270 million individuals. The data were aggregated and de-identified. According to the individual who came up with the idea for the database and started it, the information was intended to be treated very carefully and held with strong protections. To that end, the database was subject to a number of security controls and efforts to maintain privacy.

At this point, it is important to provide a reminder that HIPAA did not exist in the early 1980s. The landscape was a lot freer in terms of controls and expectations. From that perspective, the reported desire to maintain privacy and security was potentially ahead of its time, though still coming from the business perspective. The world was also less interconnected in the early 1980s, before easy, general internet access and the explosion of data creation. From so many of those perspectives, it was a simpler time before the advent of many technology-related complications.

Evolution of the Database

As the database acquired information about more individuals and data itself became more valuable, interest shifted from using the insights to control costs to creating value from the data itself. As a result, the database has changed hands a number of times and for ever-increasing sale amounts. In fact, the database may be more valuable than the uses that it theoretically enables.

Why is the value of the database increasing? Because the database is seen as enabling the development of artificial intelligence (AI) or machine learning (ML) systems. As explained in the Stat News article, the MarketScan database most recently was used in the attempted development of Watson Health. The need for large quantities of data to train AI and/or ML systems is asserted almost every time a new such tool is proposed or created. When a database with extensive information about so many individuals is available, everyone tries to chase it down and get access. That is likely why, even though Watson Health did not pan out as anticipated, the database was the most attractive asset for resale when Watson Health was recently broken apart.

What About Privacy?

When the database keeps changing hands, what happens to the privacy of all of the individuals whose information is contained in the database? The answer to that question is not really known right now. When the database was originally created, the same concern about privacy probably did not exist. In the early 1980s it would not have been easy to combine the database with other sources of data and potentially (or very likely) re-identify the information to tie it back to particular individuals. The isolated database (if it was truly de-identified and walled off) would have represented only the value that it was originally intended for.

The constant creation and collection of data today change the picture though. Now, data from so many different sources can be combined that truly anonymous or de-identified data are hard to come by (at least according to some assertions). Regardless of the view on whether data can be wholly divorced from identification, privacy risks are unavoidable from the constant movement of such a large database. One of the biggest concerns is that individuals are probably completely in the dark about being included in the database and will never know the extent to which information about them is included.
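
For readers who want to see the mechanics, the combination risk described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are invented, the field names (zip3, birth_year, sex) are hypothetical, and nothing here reflects the actual contents or structure of MarketScan or any other real database. The sketch simply shows that when a "de-identified" record shares a rare combination of ordinary attributes with exactly one person in an outside data source, the two can be tied together.

```python
# Hypothetical, made-up records. No real data or real database schema is implied.
deidentified_claims = [
    {"zip3": "021", "birth_year": 1961, "sex": "F", "diagnosis": "type 2 diabetes"},
    {"zip3": "021", "birth_year": 1975, "sex": "M", "diagnosis": "asthma"},
    {"zip3": "606", "birth_year": 1961, "sex": "F", "diagnosis": "hypertension"},
]

# A second, separately collected source that still carries names.
public_records = [
    {"name": "Person A", "zip3": "021", "birth_year": 1961, "sex": "F"},
    {"name": "Person B", "zip3": "021", "birth_year": 1975, "sex": "M"},
    {"name": "Person C", "zip3": "021", "birth_year": 1975, "sex": "M"},
]

# Attributes that are not names but can still narrow down who someone is.
QUASI_IDENTIFIERS = ("zip3", "birth_year", "sex")


def key(record):
    """Build the combination of attributes used to compare the two sources."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link(claims, outside):
    """Return (name, diagnosis) pairs where a claim matches exactly one person."""
    names_by_key = {}
    for person in outside:
        names_by_key.setdefault(key(person), []).append(person["name"])

    matches = []
    for claim in claims:
        candidates = names_by_key.get(key(claim), [])
        # Only a unique match ties the claim to a single individual.
        if len(candidates) == 1:
            matches.append((candidates[0], claim["diagnosis"]))
    return matches


reidentified = link(deidentified_claims, public_records)
print(reidentified)
```

In this invented example, only the first record is tied to a single person. The second matches two people and the third matches no one, which is why the number and rarity of the shared attributes matter so much to how real the risk is.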

Individual Rights and Inclusion

The growing push from more individuals to retain control over their data or benefit when their data are used raises the question of how large databases should be created and then monetized. Is individual consent necessary? Should some value be given back to individuals since the databases are exploited for financial gain? Answering each of those questions and other ones will not be easy.

Before considering what should be, it is helpful to know what the existing regulatory landscape permits. Since healthcare information is being considered, the first regulation to consider is HIPAA. The Privacy Rule under HIPAA establishes how protected health information (PHI) can be used and disclosed. In particular, the sale of PHI requires consent or authorization of the individual. However, there is a big caveat to that statement. Authorization for sale is only needed for PHI. The Privacy Rule also sets out how PHI can be de-identified. If PHI is de-identified in accordance with the requirements of the Privacy Rule, then the de-identified information is no longer PHI and no longer subject to HIPAA. Once the de-identified information is not subject to HIPAA, then no authorization is needed for the sale of the data or any other use.

What about other privacy laws? Many states are enacting their own privacy laws and the full impact is not really known yet. The impact may not be as great as anticipated though if the data are already subject to HIPAA. In many instances, the state laws do not subject organizations to doubled-up compliance obligations, which means HIPAA still controls and the greater protections from the state law don’t apply. The big point there is to know how the data are being created, namely from the “traditional” healthcare realm or from newer, consumer-focused sources that are not subject to HIPAA.

Leaving aside whether consent or authorization is needed, a fundamental question is whether individuals should also benefit from the ongoing use of data. Inclusion of individuals from that perspective has not occurred to date, but some are pushing for that to occur. To hazard a guess, passing benefit through to individuals should not be expected absent some form of legal or regulatory change. If data can be used broadly and in compliance with applicable requirements, then that use should be expected to continue as is.

Furthering a Discussion

The focus on large databases and other uses of individual information should be how privacy needs to evolve. The rapid expansion of technology in various forms created new interactions and realities with an almost unprecedented speed that is not yet fully understood. An almost real-time debate is starting to occur, but may not be happening quickly enough. Placing attention on the issue is helpful and all should make their voices heard.

Thinking in that direction, what are your thoughts on how data should or can be compiled, manipulated, and carried forward? Let’s have the discussion.

This article was originally published on The Pulse blog and is republished here with permission.
