Making AI More Trustworthy

Introduction to Medical Imaging and AI

The ambiguity in medical imaging can present major challenges for clinicians who are trying to identify disease. For instance, in a chest X-ray, pleural effusion, an abnormal buildup of fluid in the space around the lungs, can look very much like pulmonary infiltrates, which are accumulations of pus or blood. An artificial intelligence model could assist the clinician in X-ray analysis by helping to identify subtle details and boosting the efficiency of the diagnosis process.

The Challenge of AI Predictions

Because so many possible conditions could be present in one image, the clinician would likely want to consider a set of possibilities, rather than only having one AI prediction to evaluate. One promising way to produce a set of possibilities, called conformal classification, is convenient because it can be readily implemented on top of an existing machine-learning model. However, it can produce sets that are impractically large.

Improving Conformal Classification

MIT researchers have now developed a simple and effective improvement that can reduce the size of prediction sets by up to 30 percent while also making predictions more reliable. Having a smaller prediction set may help a clinician zero in on the right diagnosis more efficiently, which could improve and streamline treatment for patients. This method could be useful across a range of classification tasks — say, for identifying the species of an animal in an image from a wildlife park — as it provides a smaller but more accurate set of options.

How It Works

"With fewer classes to consider, the sets of predictions are naturally more informative in that you are choosing between fewer options. In a sense, you are not really sacrificing anything in terms of accuracy for something that is more informative," says Divya Shanmugam PhD ’24, a postdoc at Cornell Tech who conducted this research while she was an MIT graduate student. The researchers applied a technique developed to improve the accuracy of computer vision models called test-time augmentation (TTA). TTA creates multiple augmentations of a single image in a dataset, perhaps by cropping the image, flipping it, zooming in, etc.

Prediction Guarantees

AI assistants deployed for high-stakes tasks, like classifying diseases in medical images, are typically designed to produce a probability score along with each prediction so a user can gauge the model’s confidence. For instance, a model might predict that there is a 20 percent chance an image corresponds to a particular diagnosis, like pleurisy. But it is difficult to trust a model’s predicted confidence because much prior research has shown that these probabilities can be inaccurate. With conformal classification, the model’s prediction is replaced by a set of the most probable diagnoses along with a guarantee that, with a probability chosen in advance, the correct diagnosis is somewhere in the set.
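One common way to build such sets is split conformal prediction, sketched below. It is a generic illustration, not necessarily the exact variant used in the study: the function names are hypothetical, the score is simply one minus the probability the model assigns to the true class, `alpha` is the allowed error rate (so the target coverage is 1 - alpha), and the calibration data are made up.

```python
import numpy as np

def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Return the score threshold for a target coverage of 1 - alpha."""
    n = len(cal_labels)
    # Nonconformity score: one minus the probability given to the true class.
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    # Rank of the quantile, with the usual finite-sample (n + 1) correction.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # Too few calibration points for this alpha: fall back to including every class.
    return 1.0 if k > n else float(scores[k - 1])

def prediction_set(probs, qhat):
    """Indices of all classes whose score is within the threshold."""
    return np.flatnonzero(1.0 - probs <= qhat)

# Made-up calibration data: 4 labeled examples, 2 classes.
cal_probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
cal_labels = np.array([0, 0, 1, 0])

qhat = calibrate_threshold(cal_probs, cal_labels, alpha=0.5)
print(prediction_set(np.array([0.85, 0.15]), qhat))
```

A less accurate or poorly calibrated model pushes the threshold up, which admits more classes into every set. That is the mechanism behind the impractically large sets.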

Maximizing Accuracy

To apply TTA, the researchers hold out some of the labeled image data that would otherwise be used for the conformal classification process. They learn to aggregate the augmentations on these held-out data, automatically augmenting the images in a way that maximizes the accuracy of the underlying model’s predictions. Then they run conformal classification on the model’s new, TTA-transformed predictions. The conformal classifier outputs a smaller set of probable predictions for the same confidence guarantee.
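The two-stage procedure described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the per-augmentation probabilities are synthetic, the aggregation weights are chosen by a simple random search over candidate weight vectors (a stand-in for whatever learning procedure the researchers actually use), and all function names are hypothetical.

```python
import numpy as np

def aggregate(view_probs, w):
    """Weighted average over augmented views; view_probs is (n, n_views, n_classes)."""
    return np.einsum("v,nvc->nc", w, view_probs)

def accuracy(probs, labels):
    return float(np.mean(probs.argmax(axis=1) == labels))

def learn_weights(view_probs, labels, n_candidates=300, seed=0):
    """Pick the candidate weight vector with the highest held-out accuracy."""
    rng = np.random.default_rng(seed)
    n_views = view_probs.shape[1]
    candidates = [np.full(n_views, 1.0 / n_views)]  # plain averaging
    candidates += list(np.eye(n_views))             # each single view alone
    candidates += list(rng.dirichlet(np.ones(n_views), size=n_candidates))
    return max(candidates, key=lambda w: accuracy(aggregate(view_probs, w), labels))

def conformal_threshold(probs, labels, alpha):
    """Split-conformal threshold on the score 1 - p(true class)."""
    n = len(labels)
    scores = np.sort(1.0 - probs[np.arange(n), labels])
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return 1.0 if k > n else float(scores[k - 1])

# Synthetic per-view probabilities: some views carry more signal than others.
rng = np.random.default_rng(0)
n, n_views, n_classes = 400, 4, 5
labels = rng.integers(0, n_classes, size=n)
logits = rng.normal(size=(n, n_views, n_classes))
strength = np.array([2.0, 1.5, 0.5, 0.0])  # signal added to the true class, per view
rows, cols = np.arange(n)[:, None], np.arange(n_views)[None, :]
logits[rows, cols, labels[:, None]] += strength
e = np.exp(logits - logits.max(axis=-1, keepdims=True))
view_probs = e / e.sum(axis=-1, keepdims=True)

# Stage 1: learn the aggregation on one part of the labeled data.
fit, cal = slice(0, 200), slice(200, 400)
w = learn_weights(view_probs[fit], labels[fit])

# Stage 2: calibrate conformal classification on the TTA-aggregated predictions.
cal_probs = aggregate(view_probs[cal], w)
qhat = conformal_threshold(cal_probs, labels[cal], alpha=0.1)
set_sizes = (1.0 - cal_probs <= qhat).sum(axis=1)
print(w, qhat, set_sizes.mean())
```

Keeping the two splits disjoint matters in this sketch: the calibration examples are never used to choose the weights, so the conformal threshold is computed on data the aggregation step has not seen.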

Results and Future Work

Compared to prior work in conformal prediction, their TTA-augmented method reduced prediction set sizes by 10 to 30 percent across experiments on several standard image classification benchmarks. Importantly, the technique achieves this reduction in prediction set size while maintaining the probability guarantee. The researchers also found that, even though they are sacrificing some labeled data that would normally be used for the conformal classification procedure, TTA boosts accuracy enough to outweigh the cost of losing those data.

Conclusion

The researchers have developed a simple and effective improvement to conformal classification that can reduce the size of prediction sets by up to 30 percent while also making predictions more reliable. This method could be useful across a range of classification tasks and has the potential to improve the efficiency and accuracy of disease diagnosis.

FAQs

Q: What is conformal classification?
A: Conformal classification is a technique that produces a set of possible predictions, rather than a single prediction, along with a guarantee that, with a probability chosen in advance, the correct prediction is somewhere in the set.
Q: What is test-time augmentation (TTA)?
A: TTA is a technique that creates multiple augmentations of a single image in a dataset, perhaps by cropping the image, flipping it, or zooming in, then applies the model to each version and aggregates the resulting predictions.
Q: How does the TTA-augmented method improve conformal classification?
A: The TTA-augmented method reduces the size of prediction sets by up to 30 percent while maintaining the probability guarantee, making predictions more reliable and informative.
Q: What are the potential applications of this method?
A: The method could be useful across a range of classification tasks, such as identifying the species of an animal in an image from a wildlife park, and has the potential to improve the efficiency and accuracy of disease diagnosis.