NEGATION'S NOT SOLVED: GENERALIZABILITY VERSUS OPTIMIZABILITY IN CLINICAL NATURAL LANGUAGE PROCESSING

Abstract

A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been solved. This work contends that an optimizable solution does not equal a generalizable solution. Using four manually annotated corpora of clinical text, we show that negation detection can be optimized in relatively constrained settings, but performance is not reliably generalizable unless in-domain training data is available, in which case fully supervised domain adaptation techniques may prove effective. Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. This indicates the need for future work in domain-adaptive and task-adaptive methods for clinical NLP.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2013
Accession Number
AD1107234

Entities

People

  • Cheryl Clark
  • David Carrell
  • James Masanz
  • Matt Coarr
  • Scott Halgrim
  • Stephen Wu
  • Timothy M Miller

Organizations

  • MITRE Corporation

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Breast Cancer
  • Computer Programs
  • Diseases And Disorders
  • Health Services
  • Information Science
  • Language
  • Machine Learning
  • Medical Personnel
  • Natural Language Processing
  • Natural Languages
  • Neoplasms
  • Ontologies
  • Supervised Machine Learning
  • Test Sets
  • Vascular Diseases
  • X Rays

Readers

  • Computational Linguistics
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Machine Translation