Cohort Shepherd II: Verifying Cohort Constraints from Hospital Visits

Abstract

This paper describes the updated system created by the University of Texas at Dallas for content-based medical record retrieval submitted to the TREC 2012 Medical Records Track. Our system updates our work from the previous year by building a structured query for each cohort that captures the patient's age, gender, hospital status, and medical assertion information. Further, all keywords that encode any medical phenomena from the query are recursively decomposed before being expanded using knowledge from UMLS, SNOMED, Wikipedia, and PubMed co-occurrences. An initial ranking of hospital visits is then obtained using BM25 relevance on an interpolation of these decomposed keywords. Finally, hospital visits are re-ranked according to the constraints extracted in the structured query. Four runs were submitted, comparing pair-wise combinations of complete vs. shallow keyword decomposition and full vs. negation-only assertion processing. Our highest scoring submission achieved an infNDCG score of 0.426.
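
The two-stage ranking described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the BM25 parameters, the keyword interpolation scheme, or the re-ranking rule, so everything below is an assumption made for illustration. The function names `bm25_scores` and `rerank`, the `k1` and `b` defaults, the toy visits, and the rule "visits satisfying more constraints rank first, BM25 breaks ties" are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document with Okapi BM25 (k1 and b are assumed defaults)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rerank(visits, scores, constraints):
    """Hypothetical re-ranking rule: more satisfied constraints first, then BM25 score."""
    def key(i):
        meta = visits[i]["meta"]
        satisfied = sum(1 for field, value in constraints.items()
                        if meta.get(field) == value)
        return (-satisfied, -scores[i])
    return sorted(range(len(visits)), key=key)


# Toy data (invented for illustration, not taken from the TREC collection).
visits = [
    {"id": "v1", "tokens": ["chest", "pain", "aspirin"],
     "meta": {"gender": "female", "status": "inpatient"}},
    {"id": "v2", "tokens": ["chest", "pain", "chest", "xray"],
     "meta": {"gender": "male", "status": "inpatient"}},
    {"id": "v3", "tokens": ["knee", "fracture"],
     "meta": {"gender": "female", "status": "outpatient"}},
]
query = ["chest", "pain"]
constraints = {"gender": "female", "status": "inpatient"}

scores = bm25_scores(query, [v["tokens"] for v in visits])
initial_order = sorted(range(len(visits)), key=lambda i: -scores[i])
final_order = rerank(visits, scores, constraints)
print([visits[i]["id"] for i in initial_order])
print([visits[i]["id"] for i in final_order])
```

In this toy case the keyword-only ranking places v2 first, while the constraint-aware pass promotes v1 because it is the only visit that satisfies both structured constraints. Keyword decomposition, expansion, and assertion processing are omitted from the sketch.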

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2012
Accession Number
ADA581518

Entities

People

  • Bryan Rink
  • Kirk Roberts
  • Sanda M. Harabagiu
  • Travis Goodwin

Organizations

  • University of Texas at Dallas

Tags

DTIC Thesaurus Topics

  • Acid-Base Equilibrium
  • Breast Cancer
  • Decomposition
  • Detection
  • Emergencies
  • Health Services
  • Hospitals
  • Language
  • Lower Extremity
  • Magnetic Resonance
  • Natural Language Processing
  • Neoplasms
  • Prostate Cancer
  • Standards
  • Supervised Machine Learning
  • Therapy
  • Universities

Fields of Study

  • Engineering

Readers

  • Clinical Trial Research
  • Information Retrieval
  • Trauma or Military Medicine