PRIS at 2012 TREC Medical Track: Query Expansion, Retrieval and Ranking

Abstract

The official datasets are in XML format, so we have to parse them before indexing. We choose Lucene as our tool for indexing and searching, and we select Jakarta Commons Digester (hereafter referred to as Digester) to parse the XML documents. Digester converts each XML document into a Java object, from which we read the fields we need. In addition, we process the "report_text" tag in the XML documents to extract the patient's age and sex, which are very important fields for the search task.
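
The extraction step described above might look roughly like the following minimal sketch. It is not the authors' code. It swaps in the JDK's built-in DOM parser for Digester so that it runs without extra libraries, and the class name `ReportFields`, the method names, and the regular expressions for age and sex are all hypothetical. The phrasing the patterns expect (for example "45-year-old woman") is an assumption about the report text, not something the abstract specifies.

```java
import java.io.StringReader;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: pulls report_text out of a report and guesses age and sex.
final class ReportFields {
    // Assumed phrasings: "45-year-old", "45 year old", "45 yo".
    private static final Pattern AGE = Pattern.compile(
        "\\b(\\d{1,3})[\\s-]*(?:year|yr|y)s?[\\s-]*(?:old|o)\\b",
        Pattern.CASE_INSENSITIVE);

    // Assumed vocabulary; word boundaries keep "male" from matching inside "female".
    private static final Pattern SEX = Pattern.compile(
        "\\b(female|male|woman|man)\\b", Pattern.CASE_INSENSITIVE);

    // Returns the text of the first <report_text> element, or "" if there is none.
    static String extractReportText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("report_text");
            return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse report XML", e);
        }
    }

    // Returns the first age mentioned in the text, if any.
    static Optional<Integer> extractAge(String text) {
        Matcher m = AGE.matcher(text);
        return m.find() ? Optional.of(Integer.parseInt(m.group(1))) : Optional.empty();
    }

    // Returns "female" or "male" for the first matching term, if any.
    static Optional<String> extractSex(String text) {
        Matcher m = SEX.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        String term = m.group(1).toLowerCase();
        boolean female = term.equals("female") || term.equals("woman");
        return Optional.of(female ? "female" : "male");
    }
}
```

The two extracted values could then be stored as separate fields alongside the report text when the document is indexed, so that a query can filter or boost on them.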


Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2012
Accession Number: ADA581494

Entities

People

Jiayue Zhang
Jun Guo
Lin Lin
Runnan Liu
Shudang Diao
Weiran Xu
Yukun Li

Organizations

Beijing University of Posts and Telecommunications
