Effective Retrieval with Distributed Collections
Abstract
This paper evaluates the retrieval effectiveness of distributed information retrieval systems in realistic environments. We find that when a large number of collections are available, retrieval effectiveness is significantly worse than that of centralized systems, mainly because typical queries are inadequate for choosing the right collections. We propose two techniques to address the problem: using phrase information in the collection selection index, and query expansion. Both techniques enhance the discriminatory power of typical queries for choosing the right collections and hence significantly improve retrieval results. Query expansion, in particular, brings the effectiveness of searching a large set of distributed collections close to that of searching a centralized collection.
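As a rough illustration of why expansion can help collection selection, the toy sketch below ranks collections by how often the query terms occur in each one. It is not the report's method: the collection names, the document-frequency counts, the expansion terms, and the `rank_collections` helper are all hypothetical, and the scoring rule (a plain sum of document frequencies) is an assumption chosen for brevity.

```python
from collections import Counter

# Hypothetical toy data: for each collection, the number of documents
# that contain each term (document frequency). Not taken from the report.
COLLECTIONS = {
    "medicine": Counter({"virus": 40, "infection": 35, "vaccine": 30, "scan": 5}),
    "computing": Counter({"virus": 45, "software": 50, "malware": 30, "scan": 25}),
    "gardening": Counter({"soil": 40, "plant": 60, "virus": 2}),
}


def rank_collections(query_terms, collections):
    """Rank collection names by the summed document frequency of the query terms.

    This is an assumed, simplified selection score; a Counter returns 0
    for terms a collection does not contain.
    """
    scores = {
        name: sum(df[term] for term in query_terms)
        for name, df in collections.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# A one-word query cannot tell the medical sense from the computing sense.
short_query = ["virus"]
# Hypothetical expansion terms, as an expansion step might add them.
expanded_query = ["virus", "infection", "vaccine"]

print(rank_collections(short_query, COLLECTIONS))
print(rank_collections(expanded_query, COLLECTIONS))
```

With these made-up counts, the one-word query ranks the computing collection first, while the expanded query moves the medicine collection to the top. The added terms carry the discriminating evidence that the original query lacks.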
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 1997
- Accession Number: ADA341194
Entities
People
- Jamie Callan
- Jinxi Xu
Organizations
- University of Massachusetts Amherst