Empirical evaluation of language modeling to ascertain cancer outcomes from clinical text reports

Abstract

Longitudinal data on key cancer outcomes for clinical research, such as response to treatment and disease progression, are not captured in standard cancer registry reporting. Manual extraction of such outcomes from unstructured electronic health records is a slow, resource-intensive process. Natural language processing (NLP) methods can accelerate outcome annotation, but they require substantial labeled data. Transfer learning based on language modeling, particularly using the Transformer architecture, has achieved improvements in NLP performance. However, there has been no systematic evaluation of NLP model training strategies on the extraction of cancer outcomes from unstructured text.

Document Details

Document Type: Pub Defense Publication
Publication Date: Sep 02, 2023
Source ID: 10.1186/s12859-023-05439-1

Entities

People

Deborah Schrag
Eliezer M. Van Allen
Haitham A. Elmarakeby
Irbaz Bin Riaz
Kenneth L. Kehl
Pavel S. Trukhanov
Vidal M. Arroyo

Organizations

Doris Duke Charitable Foundation
National Cancer Institute
Prostate Cancer Foundation
United States Department of Defense
