PRIS at 2012 TREC Medical Track: Query Expansion, Retrieval and Ranking
Abstract
The official datasets are in XML format, so we parse them before indexing. We use Lucene for indexing and searching, and Jakarta Commons Digester (hereafter, Digester) to parse the XML documents. Digester maps each XML document to a Java object, from which we read the fields we need. In addition, we process the "report_text" tag in the XML documents to extract age and sex information, which are very important fields for the search task.
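The extraction step described above can be illustrated with a minimal Java sketch. It is not the authors' implementation: to stay self-contained it uses the JDK's built-in DOM parser in place of Digester, and the class name, method names, and regular expressions are hypothetical. Only the "report_text" tag name comes from the abstract; real reports phrase age and sex in many more ways than these patterns cover.

```java
import java.io.StringReader;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ReportFieldSketch {
    // Hypothetical patterns, assumed for illustration only.
    private static final Pattern AGE =
        Pattern.compile("(\\d{1,3})[- ]?(?:year|yr)s?[- ]?old", Pattern.CASE_INSENSITIVE);
    private static final Pattern SEX =
        Pattern.compile("\\b(male|female|man|woman)\\b", Pattern.CASE_INSENSITIVE);

    // Returns the text of the first <report_text> element, or "" if none exists.
    static String reportText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("report_text");
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    // Returns the first age mentioned in the text, or -1 if no age is found.
    static int age(String text) {
        Matcher m = AGE.matcher(text);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Normalizes the first sex mention to "male" or "female", else "unknown".
    static String sex(String text) {
        Matcher m = SEX.matcher(text);
        if (!m.find()) {
            return "unknown";
        }
        String word = m.group(1).toLowerCase(Locale.ROOT);
        return (word.equals("female") || word.equals("woman")) ? "female" : "male";
    }

    public static void main(String[] args) throws Exception {
        String xml = "<report><report_text>A 45-year-old female with chest pain."
                   + "</report_text></report>";
        String text = reportText(xml);
        System.out.println(age(text) + " " + sex(text));
    }
}
```

In a pipeline like the one described, the extracted age and sex values would presumably be stored as separate fields in the index next to the report text, so that a query can match on them directly.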
Document Details
- Document Type
- Technical Report
- Publication Date
- Nov 01, 2012
- Accession Number
- ADA581494
Entities
People
- Jiayue Zhang
- Jun Guo
- Lin Lin
- Runnan Liu
- Shudang Diao
- Weiran Xu
- Yukun Li
Organizations
- Beijing University of Posts and Telecommunications