Lightening the Load of Document Smoothing for Better Language Modeling Retrieval
Abstract
The authors hypothesized that language modeling retrieval would improve if they reduced the need for document smoothing to provide an inverse document frequency (IDF)-like effect. They created inverse collection frequency (ICF)-weighted query models as a tool to partially separate the IDF-like role from document smoothing. Compared to maximum likelihood estimated (MLE) queries, the ICF-weighted queries achieved a 6.4% improvement in mean average precision on description queries. The ICF-weighted queries also performed best with less document smoothing than the MLE queries required. Language modeling retrieval may therefore benefit from a means of incorporating IDF-like behavior separately from document smoothing.
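The contrast between MLE and ICF-weighted query models can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the report's stated formulation: the ICF weight is taken to be log(collection length / collection frequency), documents are smoothed with Dirichlet prior smoothing (parameter `mu`), and documents are scored by the query-model-weighted log-likelihood. The function names and the toy counts are hypothetical, and every query term is assumed to occur in the collection.

```python
import math
from collections import Counter


def mle_query_model(query_terms):
    """Maximum likelihood query model: relative frequency of each query term."""
    counts = Counter(query_terms)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def icf_query_model(query_terms, collection_counts, collection_length):
    """ICF-weighted query model (assumed form): each term's count is scaled by
    log(collection_length / collection_frequency), then renormalized to sum to 1.
    Assumes every query term occurs in the collection, and that at least one
    query term is rarer than the whole collection (so the normalizer is nonzero)."""
    counts = Counter(query_terms)
    raw = {
        w: c * math.log(collection_length / collection_counts[w])
        for w, c in counts.items()
    }
    norm = sum(raw.values())
    return {w: v / norm for w, v in raw.items()}


def dirichlet_score(query_model, doc_counts, doc_length,
                    collection_counts, collection_length, mu):
    """Query-model-weighted log-likelihood of a Dirichlet-smoothed document.
    A smaller mu means less document smoothing."""
    score = 0.0
    for w, q_w in query_model.items():
        p_collection = collection_counts[w] / collection_length
        p_doc = (doc_counts.get(w, 0) + mu * p_collection) / (doc_length + mu)
        score += q_w * math.log(p_doc)
    return score


# Toy collection statistics (hypothetical numbers).
collection_counts = {"the": 1000, "retrieval": 10}
collection_length = 10000
query = ["the", "retrieval"]

mle = mle_query_model(query)
icf = icf_query_model(query, collection_counts, collection_length)
print(mle)  # both terms weighted equally
print(icf)  # the rare term "retrieval" receives most of the weight
```

In this toy case the ICF-weighted model shifts query weight from the common term to the rare one before any document is scored, which is the sense in which an IDF-like effect is supplied by the query model rather than by document smoothing alone.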
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2006
- Accession Number: ADA448634
Entities
People
- James Allan
- Mark D. Smucker
Organizations
- University of Massachusetts Amherst