A Surfeit of Software Snafus

William A. Hyman
Professor Emeritus, Biomedical Engineering
Texas A&M University, w-hyman@tamu.edu

We are of course increasingly dependent on software in our professional and personal lives. We are also increasingly familiar with software problems, ranging from malfunctions to cybersecurity holes. In healthcare, software problems can affect EHRs, medical devices, and hospital systems. Sometimes the problem is that the software simply doesn’t do what it is supposed to do, i.e., there is a defect that is independent of anything the user did or did not do. In other cases the issue is usability, or human factors, in which the software might be theoretically capable of doing things correctly but the design has a propensity for “use error”. Here the choice of “use” is deliberate, as opposed to “user”, since user error seems to assign blame in advance, while use error leaves open the question of whether the design, or something else, facilitated the error.

A recent trio of software-related issues illustrates the scope of the problem. With respect to EHRs, AHRQ recently hosted a webinar on Assessing Safety Risks Associated With EHRs. The first part of the presentation concerned design- and use-related CPOE problems, including wrong-patient orders, undetected adverse drug interactions, and ineffective “decision support”. The second part focused on general wrong-patient errors in the use of EHRs. While such errors might suggest the need for more robust chart verification and limits on the number of simultaneously open charts, these measures can also increase frustration in using the systems. In this regard I once proposed a four-step universal error prevention methodology: 1. Adopt a stern stance and face, 2. Push up either sleeve, 3. Form a pointing finger with the exposed hand, 4. Wag the finger at listeners while saying firmly “Be careful”. However, like many methodologies, this doesn’t actually work.
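Purely as an illustration (a hypothetical design sketch of my own, not anything presented in the webinar), the two mitigations mentioned above, a cap on open charts and re-verification of the patient before an order is accepted, might look something like this:

```python
# Hypothetical sketch of two wrong-patient mitigations; real EHRs implement
# these very differently, and the limit chosen here is an assumption.
MAX_OPEN_CHARTS = 2
open_charts: list[str] = []

def open_chart(patient_id: str) -> None:
    # Refuse to open another chart once the cap is reached.
    if len(open_charts) >= MAX_OPEN_CHARTS:
        raise RuntimeError("Close another chart before opening a new one.")
    open_charts.append(patient_id)

def place_order(chart_patient_id: str, confirmed_patient_id: str, order: str) -> str:
    # The order is rejected unless the clinician re-confirms the intended patient.
    if confirmed_patient_id != chart_patient_id:
        raise ValueError("Confirmed patient does not match the open chart.")
    return f"{order} ordered for {chart_patient_id}"
```

Of course, every such guard adds a click or a prompt, which is exactly the frustration trade-off noted above.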

The second problem is a recall of the software in a suite of mass spectrometer medical devices that are used for a wide range of substance detection in patient samples. Software in these devices “may lead the devices to display results that do not match the specimens tested”, i.e., an automated wrong-patient error. The FDA has called this a “software defect”. As a side note, this issue illustrates the popularity of frequent software updates and version numbers, which is sometimes the result of a release-it-now, fix-it-later philosophy. In this case the problem occurred in versions 1.6.1 and 1.6.2 of one package and versions 3.0, 3.0.1, and 3.0.2 of a second package. If version 3 was preceded by a version 2 that was not recalled, this could suggest that version 2 didn’t have the problem and that it was introduced in version 3. Such is progress. Of course it was said that the problem will be corrected with new software. I await with interest whether this will be called an “upgrade”. The multiple three-part version numbers in this recall remind me of an earlier, unrelated medical device software recall in which the FDA argued that there had been so many corrections to the software within a year that, although it did not know of errors in the current version, the manufacturer’s software design process was demonstrably inadequate.
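For illustration only, here is a minimal sketch in Python (with hypothetical package names, since the recall notice identifies the actual products) of the kind of check a laboratory might run to see whether an installed version is on the affected list:

```python
# Hypothetical recall data; the affected version strings are those cited above.
AFFECTED_VERSIONS = {
    "PackageA": {"1.6.1", "1.6.2"},
    "PackageB": {"3.0", "3.0.1", "3.0.2"},
}

def is_recalled(package: str, version: str) -> bool:
    """Return True if the installed package/version combination appears in the recall."""
    return version in AFFECTED_VERSIONS.get(package, set())

print(is_recalled("PackageB", "3.0.1"))  # True: on the affected list
print(is_recalled("PackageB", "2.0"))    # False: a version 2, if it existed, was not listed
```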

A third software issue is more in the realm of research than direct clinical activity, but it involves clinically relevant MRI, which is a computationally intensive technique. Here computationally intensive means the use of software that, for most users, does something the user cannot do, cannot check, and might not understand. A related issue arises in Clinical Decision Support software, where the question is whether the user can understand, and therefore second-guess, how the software reached its decisions. Of particular interest in the MRI case is functional MRI (fMRI), which purports to detect which areas of the brain are active during particular thought or sensory processes. A recent study found that the statistical methods behind 15 years of research, some 40,000 papers, could produce erroneous results up to 70% of the time. The ultimate example here, actually reported in 2009, was the fMRI-calculated “detection” of brain activity in a dead salmon while it was shown a series of pictures of humans engaged in a variety of activities. Somewhat pejoratively, I think, the researchers concluded that the salmon was not actually thinking but that there was instead a computational error. While looking into this story, besides the software issue, I learned the term “high-tech phrenology” and the Britishism “boffin”, which perhaps I should have already known.
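The dead salmon result is usually explained as a multiple comparisons problem: test tens of thousands of voxels at a conventional significance threshold and some will appear “active” by chance alone. The following is a toy simulation in Python (my own illustration, not an actual fMRI analysis), with hypothetical voxel and scan counts:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_voxels = 10_000   # hypothetical number of independent "voxels"
n_scans = 20        # hypothetical number of measurements per voxel

# Pure noise: there is no real signal anywhere, dead-salmon style.
data = rng.normal(loc=0.0, scale=1.0, size=(n_voxels, n_scans))

# Test every voxel for "activity" (mean different from zero).
t_stats, p_values = stats.ttest_1samp(data, popmean=0.0, axis=1)

alpha = 0.05
print("Apparently active voxels, uncorrected:", int(np.sum(p_values < alpha)))
print("Apparently active voxels, Bonferroni-corrected:", int(np.sum(p_values < alpha / n_voxels)))
```

With 10,000 pure-noise voxels, roughly 500 will clear p < 0.05 uncorrected, while the corrected threshold typically flags none, which is the difference between a thinking salmon and a dead one.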

Software is no doubt of great value when it is well designed, thoroughly tested, and used correctly. But this is not always the case, and perhaps it is too often not the case. We should therefore remember the truism “Good software is good, but bad software is terrible”. This would be suitable for needlepoint, if anyone still does needlepoint.