Lessons Learned from Indexing Close Word Pairs
Abstract
We describe experiments with proximity-aware ranking functions that use indexing of word pairs. Our goal is to evaluate a method of "mild" pruning of proximity information, which would be appropriate for a moderately loaded retrieval system, e.g., an enterprise search engine. We create an index that includes occurrences of close word pairs, where one of the words is frequent. This allows one to efficiently restore relative positional information for all non-stop words within a certain distance. It is also possible to answer phrase queries promptly. We use two functions to evaluate relevance: a modification of a classic proximity-aware function and a logistic function that includes a linear combination of relevance features. Additionally, we use the spam scores provided by the University of Waterloo.
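As a rough illustration of the indexing idea in the abstract, the sketch below records every pair of words that occur within a fixed distance of each other, but only when at least one of the two words is frequent. It is a minimal sketch under stated assumptions, not the report's implementation: the function names (`build_pair_index`, `adjacent_docs`), the `max_dist` and `num_frequent` parameters, and the rule that "frequent" means the top-N words by collection frequency are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def build_pair_index(docs, max_dist=2, num_frequent=1):
    """Index ordered word pairs that occur within max_dist positions.

    docs maps a document id to its list of tokens. A pair is stored only
    if at least one of its words is "frequent". Assumption: frequent means
    one of the num_frequent most common words in the collection.
    """
    counts = Counter(tok for toks in docs.values() for tok in toks)
    frequent = {word for word, _ in counts.most_common(num_frequent)}

    index = defaultdict(list)
    for doc_id, toks in docs.items():
        for i, left in enumerate(toks):
            # Look ahead at most max_dist positions from position i.
            for j in range(i + 1, min(i + max_dist + 1, len(toks))):
                right = toks[j]
                if left in frequent or right in frequent:
                    # Posting: document id plus both word positions.
                    index[(left, right)].append((doc_id, i, j))
    return dict(index), frequent


def adjacent_docs(index, left, right):
    """Return ids of documents where `left` is immediately followed by `right`.

    This answers a two-word phrase query from the pair postings alone.
    It only works for pairs that were indexed, i.e. pairs in which at
    least one word is frequent.
    """
    postings = index.get((left, right), [])
    return sorted({doc_id for doc_id, i, j in postings if j == i + 1})


# Toy collection (hypothetical data, for illustration only).
docs = {
    1: "the cat sat on the mat".split(),
    2: "the dog saw the cat".split(),
}
index, frequent = build_pair_index(docs, max_dist=2, num_frequent=1)
phrase_hits = adjacent_docs(index, "the", "cat")
```

Because each posting keeps both positions, the relative distance between the two words can be read directly from the pair list, which is what makes proximity scoring and two-word phrase lookups cheap in this toy setting. Pairs of two infrequent words, such as ("cat", "sat") here, are deliberately not stored.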
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2010
- Accession Number: ADA547189
Entities
People
- Anna Belova
- Leonid Boytsov