What We Mean When We Talk About Data

Robert M. Califf, M.D., FDA’s Deputy Commissioner for Medical Products and Tobacco

By Robert M. Califf, M.D. and Rachel Sherman, M.D., M.P.H.

Medical care and biomedical research are in the midst of a data revolution. Networked systems, electronic health records, electronic insurance claims databases, social media, patient registries, and smartphones and other personal devices together constitute an immense new set of sources for data about health and healthcare. In addition, these “real-world” sources can provide data about patients in the setting of their environments, whether at home or at work, and in the social context of their lives. Many researchers are eager to tap into these streams to provide more accurate and nuanced answers to questions about patient health and the safety and effectiveness of medical products, and to do so quickly, efficiently, and at a lower cost than has previously been possible.

But before we can realize the dramatic potential of the healthcare data revolution, a number of practical, logistical, and scientific challenges must be overcome. And one of the first that must be tackled is the issue of terminology.

Defining Terms

Although “data,” “information,” and “evidence” are often used as if they were interchangeable terms, they are not. Data are best understood as raw measurements of some thing or process. By themselves they are meaningless; only when we add critical context about what is being measured and how do they become information. That information can then be analyzed and combined to yield evidence, which, in turn, can be used to guide decision-making. In other words, it’s not enough merely to have data, even in very large amounts. What we need, ultimately, is evidence that can be applied to answering scientific and clinical questions.

So far, so good. But what do we mean when we talk about “real-world data” or “real-world evidence”?

Rachel Sherman, M.D., M.P.H., FDA’s Associate Deputy Commissioner for Medical Products and Tobacco

Clinical research often takes place in highly controlled settings that may not reflect the day-to-day realities of typical patient care or the life of a patient outside of the medical care system. Further, those who enroll in clinical trials are carefully selected according to criteria that may exclude many patients, especially those who have other diseases, are taking other drugs, or cannot travel to the investigation site. In other words, the data gathered from such studies may not actually depict the “real world” that many patients and care providers will experience, and this could lead to important limitations in our understanding of the effectiveness and safety of medical treatments. Clinicians and patients must be able to relate the results of clinical trials to their own professional and personal experiences, yet studies done in controlled environments with certain patient populations excluded may be challenging to generalize. It seems straightforward, then, to think that studies including a much fuller and more diverse range of individuals and clinical circumstances could ultimately lead to better scientific evidence to inform decisions about medical products and healthcare.

But “real-world evidence” has its own issues that must be understood and dealt with carefully. First of all, the vague term “real-world” may imply a closer relationship with the truth, suggesting that the real-world measurement is preferable to one taken in a controlled environment. For example, is “real-world” blood pressure data gathered from an individual’s personal device or health app better (e.g., more reliable and accurate) than a blood pressure measurement from a doctor’s office? It could be, because a patient’s blood pressure might be uncharacteristically elevated during a visit to the physician. But at the same time, do we know enough about the data gathered from the patient’s personal device to use them for generating evidence? How accurate are the readings? Is the patient taking their own blood pressure correctly? What other factors might be affecting it? Already we are being reminded of the complexities of relying on data for purposes other than the ones for which they were originally gathered.

In most cases, “real-world evidence” is thought of as reflecting data already collected, i.e., epidemiologic or cohort data that researchers review and analyze retrospectively. Also of interest is whether randomized trials can be conducted in these “real-world” environments. When comparing treatments, one must always consider the possibility that the treatments were not assigned randomly but instead reflected some relevant patient characteristic, which could itself account for any difference in outcomes. This is, of course, the reason for doing randomized clinical trials.

Better Terms for Complex Subjects

There is little doubt that the new sources of data now being opened to researchers, clinicians, and patients hold enormous potential for improving the quality, safety, and efficiency of medical care. But as we work to understand both the promise and pitfalls of far-reaching technological changes, we need a more functional vocabulary for talking about these complex subjects, one that allows us to think about data, information, and evidence in ways that capture multiple dimensions of quality and fitness for purpose (e.g., for appropriate use in regulatory decision-making). The incorporation of “real-world evidence” (that is, evidence derived from data gathered from actual patient experiences, in all their diversity) in many ways represents an important step toward a fundamentally better understanding of states of disease and health. As we begin to incorporate “real-world data” into our processes for creating scientific evidence, and as we begin to recognize and effectively address the challenges those data pose, we are likely to find that the quality of the answers we receive will depend in large part on whether we can frame the questions in a meaningful way.

This article was originally published on FDA Voice and is reprinted here with permission.