Query Expansion for Noisy Legal Documents
Abstract
The vocabulary of the TREC Legal OCR collection is large and noisy. Standard techniques for improving retrieval performance, such as content-based query expansion, are ineffective for such a document collection. In our work, we focused on three things: exploiting metadata using blind relevance feedback, iteratively improving on the reference Boolean run, and studying the effects of using terms from different topic fields for automatic query formulation. This paper describes our methodologies and results.
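As a rough illustration of the general idea behind blind relevance feedback over metadata, the sketch below treats the top-ranked documents of an initial run as if they were relevant and adds their most common metadata terms to the query. It is not the authors' implementation: the function `expand_query`, the `metadata_terms` field, the parameters `top_k` and `n_terms`, and the toy documents are all hypothetical, and the simple frequency-based term selection is an assumption made for brevity.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_k=2, n_terms=2):
    """Blind relevance feedback sketch (hypothetical, not the paper's method).

    Assumes the top_k documents of an initial ranking are relevant and
    appends the n_terms metadata terms that occur in the most of them.
    """
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        # Count each term once per feedback document, skipping terms
        # that are already part of the query.
        counts.update(set(doc["metadata_terms"]) - set(query_terms))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:n_terms]]


# Toy initial ranking; the metadata values are invented for illustration.
ranked_docs = [
    {"id": "d1", "metadata_terms": ["memo", "marketing", "tobacco"]},
    {"id": "d2", "metadata_terms": ["memo", "marketing", "budget"]},
    {"id": "d3", "metadata_terms": ["invoice", "shipping"]},
]

expanded = expand_query(["tobacco", "advertising"], ranked_docs)
print(expanded)
```

Drawing expansion terms from metadata rather than from the OCR text sidesteps the noisy vocabulary that the abstract identifies as the obstacle to content-based expansion.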
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2008
- Accession Number: ADA512690
Entities
People
- Douglas W. Oard
- Lidan Wang
Organizations
- University of Maryland