Lessons Learned from Indexing Close Word Pairs

Abstract

We describe experiments with proximity-aware ranking functions that use indexing of word pairs. Our goal is to evaluate a method of "mild" pruning of proximity information, which would be appropriate for a moderately loaded retrieval system, e.g., an enterprise search engine. We create an index that includes occurrences of close word pairs, where one of the words is frequent. This allows one to efficiently restore relative positional information for all non-stop words within a certain distance. It is also possible to answer phrase queries promptly. We use two functions to evaluate relevance: a modification of a classic proximity-aware function and a logistic function that includes a linear combination of relevance features. Additionally, we use the spam scores provided by the University of Waterloo.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2010
Accession Number: ADA547189

Entities

People

Anna Belova
Leonid Boytsov

Lessons Learned from Indexing Close Word Pairs

Abstract

Document Details

Entities

People

Tags

Communities of Interest

DTIC Thesaurus Topics

Fields of Study

Readers