Distributed Multisearch and Resource Selection for the TREC Million Query Track

Abstract

A distributed information retrieval system with resource-selection and result-set merging capability was used to search subsets of the GOV2 document corpus for the 2008 TREC Million Query Track. The GOV2 collection was partitioned into host-name subcollections and distributed to multiple remote machines. The Multisearch demonstration application restricted each search to a fraction of the available subcollections that was predetermined by a resource-selection algorithm. Experiment results from topic-by-topic resource selection and aggregate topic resource selection are compared. The sensitivity of Multisearch retrieval performance to variations in the resource-selection algorithm is discussed.
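
The pipeline summarized in the abstract (partition by host name, select a fraction of the subcollections, search only those, merge the results) can be illustrated with a minimal Python sketch. This is not the report's implementation: the function names, the raw query-term-count scoring, and the tie-breaking rules are all assumptions made for illustration, and the actual Multisearch resource-selection algorithm is not specified here.

```python
import math
from collections import defaultdict
from urllib.parse import urlparse


def partition_by_host(docs):
    """Group (url, text) pairs into subcollections keyed by host name."""
    subcollections = defaultdict(list)
    for url, text in docs:
        subcollections[urlparse(url).hostname].append((url, text))
    return dict(subcollections)


def term_count(text, query_terms):
    """Count occurrences of the query terms in a whitespace-tokenized text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query_terms)


def select_resources(subcollections, query_terms, fraction):
    """Rank hosts by total query-term count and keep the top fraction.

    Assumption: a stand-in scoring rule, not the report's algorithm.
    At least one host is always kept.
    """
    scores = {
        host: sum(term_count(text, query_terms) for _, text in docs)
        for host, docs in subcollections.items()
    }
    keep = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(scores, key=lambda host: (-scores[host], host))
    return ranked[:keep]


def multisearch(subcollections, query, fraction=0.5, k=10):
    """Search only the selected subcollections and merge results by score."""
    terms = query.lower().split()
    merged = []
    for host in select_resources(subcollections, terms, fraction):
        for url, text in subcollections[host]:
            score = term_count(text, terms)
            if score > 0:
                merged.append((score, url))
    # Merge step: a single ranked list across all selected hosts.
    merged.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in merged[:k]]
```

In this sketch, lowering `fraction` reduces the number of hosts searched, at the risk of missing relevant documents held by hosts that were not selected, which is the kind of sensitivity the abstract refers to.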

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2008
Accession Number
ADA512712

Entities

People

  • Chris Fallen
  • Greg Newby
  • Kylie McCormick

Organizations

  • University of Alaska Anchorage

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Arctic Regions
  • Chemical Compounds
  • Dimensionality Reduction
  • Distribution Functions
  • Eigenvectors
  • Information Processing
  • Information Retrieval
  • Natural Language Processing
  • Probability
  • Probability Distribution Functions
  • Probability Distributions
  • Standards
  • Vector Spaces
  • Web Service
  • Word Lists

Readers

  • Computer Networking
  • Information Retrieval
  • Regression Analysis

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Learning Algorithms