Empirical evaluation of language modeling to ascertain cancer outcomes from clinical text reports

Abstract

Longitudinal data on key cancer outcomes for clinical research, such as response to treatment and disease progression, are not captured in standard cancer registry reporting. Manual extraction of such outcomes from unstructured electronic health records is slow and resource-intensive. Natural language processing (NLP) methods can accelerate outcome annotation, but they require substantial labeled data. Transfer learning based on language modeling, particularly with the Transformer architecture, has improved NLP performance. However, NLP model training strategies for extracting cancer outcomes from unstructured text have not been systematically evaluated.

Document Details

Document Type
Pub Defense Publication
Publication Date
Sep 02, 2023
Source ID
10.1186/s12859-023-05439-1

Entities

People

  • Deborah Schrag
  • Eliezer M. Van Allen
  • Haitham A. Elmarakeby
  • Irbaz Bin Riaz
  • Kenneth L. Kehl
  • Pavel S. Trukhanov
  • Vidal M. Arroyo

Organizations

  • Doris Duke Charitable Foundation
  • National Cancer Institute
  • Prostate Cancer Foundation
  • United States Department of Defense

Tags

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments
  • Computational Linguistics
  • Mental Health of Military Veterans with Posttraumatic Stress Disorder (PTSD): Risk Factors, Prevalence, Symptoms, and Treatment

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks
  • Microelectronics