UTD at TREC 2014: Query Expansion for Clinical Decision Support

Abstract

This paper describes the medical information retrieval (MIR) systems that the University of Texas at Dallas (UTD) designed for clinical decision support (CDS) and submitted to TREC 2014. We investigated the impact of various knowledge bases on automatic query expansion across the four officially submitted runs. Each of these systems exploits both Wikipedia and PubMed corpus statistics to automatically extract keywords. Extracted keywords were then expanded by relying on structured medical knowledge bases, such as the Unified Medical Language System (UMLS), the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED-CT), and Wikipedia, as well as unsupervised distributional representations based on Google's Word2Vec deep learning architecture. Our highest-scoring submission achieved an inferred AP score of 0.056 and an inferred NDCG score of 0.205.
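
The two-stage idea in the abstract (pick keywords using corpus statistics, then expand them from a knowledge base) can be illustrated with a minimal sketch. This is not the UTD system: the background documents, the synonym table, the smoothed-IDF scoring, and the function names `idf_scores`, `extract_keywords`, and `expand_query` are all hypothetical stand-ins chosen for illustration. The report's actual statistics come from Wikipedia and PubMed, and its expansions from UMLS, SNOMED-CT, Wikipedia, and Word2Vec.

```python
import math
from collections import Counter

# Toy background corpus standing in for Wikipedia/PubMed statistics (hypothetical data).
BACKGROUND_DOCS = [
    "the patient has chest pain and was given aspirin",
    "the study has reported outcomes for the patient cohort",
    "the chest radiograph was normal and the pain resolved",
    "the trial has compared aspirin and placebo",
]

# Toy synonym table standing in for a knowledge base such as UMLS (hypothetical entries).
SYNONYMS = {
    "dyspnea": ["shortness of breath", "breathlessness"],
    "aspirin": ["acetylsalicylic acid"],
}


def idf_scores(docs):
    """Return a function mapping a term to a smoothed inverse document frequency."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc.split()))  # count each term once per document
    return lambda term: math.log((n + 1) / (df[term] + 1)) + 1.0


def extract_keywords(query, idf, top_k=3):
    """Keep the top_k query terms that are rarest in the background corpus."""
    terms = set(query.lower().split())
    # Sort by descending IDF; break ties alphabetically for a stable result.
    ranked = sorted(terms, key=lambda t: (-idf(t), t))
    return ranked[:top_k]


def expand_query(keywords, synonyms):
    """Append knowledge-base synonyms after each keyword that has any."""
    expanded = []
    for kw in keywords:
        expanded.append(kw)
        expanded.extend(synonyms.get(kw, []))
    return expanded


idf = idf_scores(BACKGROUND_DOCS)
keywords = extract_keywords("the patient has dyspnea and chest pain", idf)
expanded = expand_query(keywords, SYNONYMS)
```

With this toy data, `keywords` is `['dyspnea', 'chest', 'pain']` and `expanded` adds the two hypothetical synonyms for "dyspnea". Common words such as "the" and "has" are dropped because they occur in most background documents.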

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2014
Accession Number
ADA618667

Entities

People

  • Sanda M. Harabagiu
  • Travis Goodwin

Organizations

  • University of Texas at Dallas

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Computing
  • Computer Languages
  • Computing System Architectures
  • Data Science
  • Formal Languages
  • Information Retrieval
  • Information Science
  • Language
  • Lower Extremity
  • Medical Personnel
  • Models
  • Natural Language Processing
  • Nomenclature
  • Physicians
  • Standards
  • Statistics

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Information Retrieval

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval