ENRICHing medical imaging training sets enables more efficient machine learning

Abstract

Deep learning (DL) has been applied in proofs of concept across biomedical imaging, spanning modalities and medical specialties. Labeled data are critical to training and testing DL models, but human expert labelers are limited. In addition, DL traditionally requires copious training data, which is computationally expensive to process and iterate over. Consequently, it is useful to prioritize those images that are most likely to improve a model’s performance, a practice known as instance selection. The challenge is determining how best to prioritize. It is natural to prefer straightforward, robust, quantitative metrics as the basis for instance selection. However, in current practice, such metrics are not tailored to, and almost never used for, image datasets.

Document Details

Document Type
Pub Defense Publication
Publication Date
Apr 10, 2023
Source ID
10.1093/jamia/ocad055

Entities

People

  • Erin Chinn
  • Ramy Arnaout
  • Rima Arnaout
  • Rohit Arora

Organizations

  • American Heart Association
  • Beth Israel Deaconess Medical Center
  • Gordon and Betty Moore Foundation
  • National Heart, Lung, and Blood Institute
  • National Institute of Allergy and Infectious Diseases
  • National Institutes of Health
  • United States Department of Defense
  • University of California, San Francisco

Tags

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Instructional Design and Training Evaluation
  • Medical Imaging

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks
  • Biotechnology