Indri at TREC 2004: Terabyte Track

Abstract

This paper provides an overview of experiments carried out at the TREC 2004 Terabyte Track using the Indri search engine. Indri is an efficient, effective distributed search engine. Like INQUERY, it is based on the inference network framework and supports structured queries. Unlike INQUERY, it uses language modeling probabilities within the network, which allows for added flexibility. We describe our approaches to the Terabyte Track, all of which involved automatically constructing structured queries from the title portions of the TREC topics. Our methods use term proximity information and HTML document structure. In addition, we explain a number of optimization procedures for efficient query processing.
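
As a rough illustration of what constructing a structured query from a topic title with term proximity information might look like, the sketch below builds an Indri-style query string from a title. The abstract does not give the actual formulation, so everything here is an assumption: the function name build_proximity_query, the weights 0.8/0.1/0.1, the window size of 8, and the sample title are all hypothetical. Only the operator names (#weight, #combine, #1 for an exact ordered phrase, #uwN for an unordered window) come from the Indri query language. It does not cover the HTML document structure component.

```python
def build_proximity_query(title, w_terms=0.8, w_ordered=0.1,
                          w_unordered=0.1, window=8):
    """Build an Indri-style structured query string from a topic title.

    Hypothetical sketch: the weights and window size are illustrative
    defaults, not values taken from the report.
    """
    terms = title.lower().split()
    if not terms:
        raise ValueError("title must contain at least one term")

    # Bag-of-words component over the individual title terms.
    unigrams = "#combine(" + " ".join(terms) + ")"

    # Adjacent term pairs carry the proximity information.
    pairs = list(zip(terms, terms[1:]))
    if not pairs:
        # A single-term title has no pairs, so fall back to the plain query.
        return unigrams

    # #1(a b): the two terms appear as an exact ordered phrase.
    ordered = "#combine(" + " ".join(
        f"#1({a} {b})" for a, b in pairs) + ")"

    # #uwN(a b): the two terms appear in any order within N positions.
    unordered = "#combine(" + " ".join(
        f"#uw{window}({a} {b})" for a, b in pairs) + ")"

    return (f"#weight({w_terms} {unigrams} "
            f"{w_ordered} {ordered} "
            f"{w_unordered} {unordered})")


if __name__ == "__main__":
    # Hypothetical title, not an actual TREC topic.
    print(build_proximity_query("pearl farming history"))
```

Only adjacent pairs are used in this sketch to keep the query short; other pairings are possible, and the report itself should be consulted for the formulation the authors actually used.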

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2004
Accession Number
ADA460153

Entities

People

  • Donald Metzler
  • Howard Turtle
  • Trevor Strohman
  • W. Bruce Croft

Organizations

  • University of Massachusetts Amherst

Tags

DTIC Thesaurus Topics

  • Computer Science
  • Information Retrieval
  • Language
  • Models
  • Natural Languages
  • Neoplasms
  • Networks
  • Operating Systems
  • Optimization
  • Precision
  • Probability
  • Prostate
  • Prostate Cancer
  • Statistics
  • Terabytes
  • Trees (Data Structures)
  • Vocabulary

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Distributed Systems and Data Platform Development

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Learning Algorithms