Cyclical Conundrum in Complication Prediction

William A. Hyman
Professor Emeritus, Biomedical Engineering
Texas A&M University, w-hyman@tamu.edu

A popular form of Clinical Decision Support is the prediction of forthcoming complications. I have commented on reports of such systems before, and another example has been posted recently, in this case seeking to predict cardiovascular complications from CT scans. As is common in such systems, predictive power is developed by studying a set of patient data to find those attributes of the patients, and perhaps of the procedures they have undergone, that are associated with the outcome, and then using what was learned to make predictions for new patients. One method is to let the machine “learn” by itself, given of course instructions on how to do this learning. The resulting predictive power will depend in part on the learning structure and on how robust and definitive the learning data set was. In this regard it is important to be able to recognize which new patients are within the domain of the prediction and which are not. It is also important to know how clean the split is between the subgroups of patients. Having been trained on the initial data set, the software will then operate on data from new patients. Machine learning applications often include the idea that learning will continue, or be periodically renewed, as new data becomes available. This of course requires a structured approach to when and where such new learning will occur, and how the new learning will be verified and shared.
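To make the train-then-predict pattern and the domain question concrete, here is a minimal sketch in Python with scikit-learn. Everything in it is an illustrative assumption: the three patient attributes, the synthetic data, and the crude range-based check for whether a new patient falls within the domain the model was trained on. It is not the method of the CT-scan system discussed above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical training set: each row is a patient, each column an
# attribute (say age, a lab value, and a score derived from imaging).
# The label marks whether a complication was later observed.
X_train = rng.normal(loc=[65, 1.0, 0.5], scale=[10, 0.3, 0.2], size=(500, 3))
y_train = (X_train @ [0.04, 1.5, 2.0] + rng.normal(0, 1, 500) > 5.0).astype(int)

model = LogisticRegression().fit(X_train, y_train)

# A crude applicability-domain check: only offer a prediction when the
# new patient's attributes lie within the range seen during training.
lo, hi = X_train.min(axis=0), X_train.max(axis=0)

def predict_risk(x_new):
    if np.any(x_new < lo) or np.any(x_new > hi):
        return None  # outside the learned domain; decline to predict
    return model.predict_proba(x_new.reshape(1, -1))[0, 1]

print(predict_risk(np.array([70, 1.1, 0.6])))  # within the training domain
print(predict_risk(np.array([70, 3.0, 0.6])))  # lab value far outside it -> None
```

Real systems use more sophisticated domain checks than a per-attribute range, but the point is the same: a prediction is only as trustworthy as the resemblance between the new patient and the patients the model learned from.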

A fundamental question here is why you would want this kind of predictive capability. One reason is the belief that targeted intervention can prevent the complications that have been predicted. This assumes that the necessary interventions would not have occurred had the complication not been predicted, i.e., after seeing the alert physicians will do something beyond what they would otherwise have done, which will in turn prevent the complication. All of this implies that without the alert high-risk patients would have received sub-optimal care. It is further assumed that the physicians would not have identified higher-risk patients without the software support. A corollary is that patients who do not trigger the alert will get a lower level of care that does not prevent the targeted complications. If the complication/no-complication split is clean, this would be fine. But as the split gets fuzzy, some patients still at risk of complications will be assigned to the lower-intervention group. Depending on what the interventions are, a patient may or may not want to be in this group. This situation also raises the question of why the higher level of care shouldn’t be given to everyone. One possibility is that the interventions themselves carry risk that the low-probability group shouldn’t be exposed to. A less medically sound reason is that the higher level of care requires resources that the system does not want to provide.
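The fuzzy-split problem can be seen in a few lines of simulation. In this sketch, assumed entirely for illustration, each patient has a predicted complication probability and an alert fires above an arbitrary 0.5 cutoff; because risk is probabilistic rather than a clean split, some patients below the cutoff still go on to complications and land in the lower-intervention group.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical predicted complication probabilities for 1,000 patients.
risk = rng.beta(2, 5, size=1000)

# Outcomes are probabilistic: a patient with risk 0.4 has a 40% chance
# of a complication, so no threshold separates the groups cleanly.
will_complicate = rng.random(1000) < risk

threshold = 0.5  # arbitrary alert cutoff, assumed for illustration
alerted = risk >= threshold

# At-risk patients assigned to the lower-intervention group anyway.
missed = will_complicate & ~alerted
print(f"alerted: {alerted.sum()}, complications among the non-alerted: {missed.sum()}")
```

Moving the threshold only trades one error for the other: lower it and more low-risk patients are exposed to the interventions; raise it and more at-risk patients are left in the lower-intervention group.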

Now let us follow the consequences of the alert and the interventions. If it is true that the interventions will prevent the complications, and the physicians will act on the alert, then patients who were predicted to suffer the complications will not do so. This of course is the desired outcome of the prediction. But if the interventions are effective, the predictions will be found to be wrong because the intervention will have worked. Will this result in an ongoing pride of prevention, or will it lead to increasing disbelief in the prediction? This might create a Cried Wolfism: you told me there was a high risk of complications, but repeatedly these did not occur; therefore, I am no longer going to believe your alerts.

If additional machine learning is then allowed, it would now be found that the original predictors did not lead to the predicted complications, because those complications were prevented. Thus what the software learned originally is no longer true. As a result, the system would update its learning to not alert on those cases for which the interventions have been successful. But now what happens to the treatment given to those patients? If the physicians are no longer alerted, they will presumably stop making the extra effort that the earlier alerts stimulated. Thus patients who previously benefited from the interventions triggered by the alert will no longer have that benefit. If the assumption about the effectiveness of the alert holds true, then some patients would now be out of the higher-intervention group, and they may therefore suffer the complications that the higher intervention would have prevented. Note here that if there were only the A and B versions of the software, version A would trigger complication-preventing interventions while version B would not. Thus the “upgrade” from A to B would increase the risk for some patients while possibly decreasing the risk for others.
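The mechanism can be demonstrated with a small simulation, again entirely synthetic and assumed for illustration: a single risk marker truly drives complications, version A learns it, an effective intervention then prevents complications in alerted patients, and version B is retrained on the post-intervention outcomes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# One hypothetical risk marker; high values truly drive complications.
x = rng.normal(0, 1, (2000, 1))
at_risk = (x[:, 0] + rng.normal(0, 0.5, 2000)) > 1.0

# Version A is trained on natural-history data and learns the marker.
model_a = LogisticRegression().fit(x, at_risk)
alerted = model_a.predict_proba(x)[:, 1] > 0.5

# An effective intervention: alerted patients do not go on to the
# complication, so recorded outcomes no longer reflect underlying risk.
observed = at_risk & ~alerted

# Version B is retrained on the post-intervention outcomes.
model_b = LogisticRegression().fit(x, observed)

# For a patient the marker says is high risk, version A's predicted
# risk is high while version B's drops sharply.
high_risk_patient = np.array([[2.0]])
print("version A risk:", model_a.predict_proba(high_risk_patient)[0, 1])
print("version B risk:", model_b.predict_proba(high_risk_patient)[0, 1])
```

The marker has not changed and neither has the underlying biology; only the recorded outcomes have, because the alerts worked.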

After some period of time, if another round of learning occurred, the patients who went on to complications without the alerts of version B would now be captured, and they would be returned to alert status; their complication rate would then again be reduced by the resulting interventions. This appears to be a cyclical process in which patients are moved into and out of the alert group based on the results of alerts previously given or not given. The underlying problem here is that machine-learned predictive pattern recognition is not based on a fundamental understanding of the risk conditions but is instead based on prior input data sets and outcomes. This is curve fitting rather than medical science.
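Extending the previous sketch across repeated rounds of retraining shows the cycle directly: alerts suppress complications, retraining on the suppressed outcomes suppresses the alerts, complications return, and retraining restores the alerts. As before, the data, the marker, and the perfectly effective intervention are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Each round: new patients arrive, the current model alerts, effective
# interventions prevent complications in alerted patients, and the model
# is then retrained on the outcomes that were actually observed.
model = None
for round_num in range(6):
    x = rng.normal(0, 1, (2000, 1))
    at_risk = (x[:, 0] + rng.normal(0, 0.5, 2000)) > 1.0
    if model is None:
        alerted = np.zeros(2000, dtype=bool)  # no model yet, no alerts
    else:
        alerted = model.predict_proba(x)[:, 1] > 0.5
    observed = at_risk & ~alerted  # intervention prevents the rest
    print(f"round {round_num}: alerts={alerted.sum():4d}, "
          f"complications={observed.sum():4d}")
    model = LogisticRegression().fit(x, observed)
```

Under these assumptions the alert count swings between rounds rather than settling, because each model is fit to outcomes shaped by the previous model's alerts, not to the unchanging underlying risk.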