What are the Optimal Data Science and Machine Learning Competencies for Informatics Professionals?

William Hersh, MD, Professor and Chair, OHSU
Blog: Informatics Professor
Twitter: @williamhersh

Exactly 20 years ago, I organized a panel at the American Medical Informatics Association (AMIA) Annual Symposium that attracted so large an audience that the crowd spilled out of the room into the hallway. Entitled “What are the Optimal Computer Science Competencies for Medical Informatics Professionals?”, the panel asked how much knowledge and skill in computer science was required to work professionally in informatics. In the early days of informatics, most informaticians had some programming skills and often contributed to the development of home-grown systems. Some educational programs, such as the one at Stanford University, had required courses in assembly language. (I took an assembler course myself during my informatics fellowship in the late 1980s.)

But as academic informatics systems grew in scope and complexity, they needed more engineering and hardening as they became mission-critical to organizations. At the same time, there was a recognized need for attention to people and organizational issues, especially in complex adaptive settings such as hospitals. Over time, most professional work in informatics has shifted from system building to implementing commercial systems.

With these changes, my evolving view has been that although few informatics professionals perform major computer programming, there is still value to understanding the concepts and thought process of computer science. While plenty of students enter our graduate program at Oregon Health & Science University with programming skills, our program will not turn those without programming skills into seasoned programmers. But I still believe it is important for all informatics professionals to understand the science of computing, even at the present time. This includes some programming to see computing concepts in action.

A couple of decades later, I find myself asking a related question: how much data science and machine learning is required of modern informatics professionals? Clearly, data science, machine learning, artificial intelligence, etc. are very prominent now in the evolution of healthcare and biomedical science. But not everyone needs to be a “deep diver” into data science and machine learning. I often point this out by referring to the data analytics workforce reports from a few years ago that note the need for a five- to ten-fold larger ring of people who identify the needs for, put into practice, and communicate the results of the work of the deep divers¹, ². I also note the observation of data analytics thought leader Tom Davenport, who has written about the importance of the roles of “light quants” or “analytical translators” in data-driven organizations (such as healthcare)³.

Thus, to answer the question in the title of this post, the required competence in data science and machine learning may be analogous to the answer to the computer science question of a couple of decades ago. Clearly, every informatician must have basic data science skills. These include knowing how to gather, wrangle, and carry out basic analysis of data. Informaticians should understand the different approaches to machine learning, even if they do not necessarily understand all of the deep mathematics behind them. And of course they must critically know how to apply data science and machine learning in their everyday professional practice of informatics.


¹ Manyika, J, Chui, M, et al. (2011). Big data: The next frontier for innovation, competition, and productivity, McKinsey Global Institute.
² Anonymous (2014). IDC Reveals Worldwide Big Data and Analytics Predictions for 2015. Framingham, MA, International Data Corporation.
³ Davenport, T (2015). In praise of “light quants” and “analytical translators”. Deloitte Insights.

This post first appeared on The Informatics Professor. Dr. Hersh is a frequent contributing expert to HITECH Answers.