Natural Language Information Retrieval: Tipster-2 Final Report

Abstract

We report on the joint GE/NYU natural language information retrieval project as it relates to the Tipster Phase 2 research, conducted initially at NYU and subsequently at GE R&D Center and NYU. The evaluation results discussed here were obtained in connection with the 3rd and 4th Text Retrieval Conferences (TREC-3 and TREC-4). The main thrust of this project is to use natural language processing techniques to enhance the effectiveness of full-text document retrieval. Over the course of the four TREC conferences, we have built a prototype IR system designed around a statistical full-text indexing and search backbone provided by NIST's Prise engine. The original Prise has been modified to handle multi-word phrases, differential term weighting schemes, automatic query expansion, index partitioning and rank merging, and complex documents. Natural language processing is used to preprocess the documents in order to extract content-carrying terms, discover inter-term dependencies, and build a conceptual hierarchy specific to the database domain, and to process users' natural language requests into effective search queries.
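The abstract mentions index partitioning and rank merging without describing how they were done. As a rough illustration only, the sketch below shows one generic way to merge ranked lists returned by separately indexed partitions: scale each partition's scores to [0, 1], then sort the combined results. The min-max normalization, the function names `normalize` and `merge_rankings`, and the sample data are all assumptions made for this example; they are not taken from the report or from the Prise engine.

```python
from typing import Dict, List, Tuple

# A ranking is a list of (doc_id, score) pairs. Hypothetical type for this sketch.
Ranking = List[Tuple[str, float]]


def normalize(ranking: Ranking) -> Ranking:
    """Min-max scale the scores of one partition's ranking to [0, 1]."""
    if not ranking:
        return []
    scores = [score for _, score in ranking]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so treat every document as a top hit.
        return [(doc_id, 1.0) for doc_id, _ in ranking]
    return [(doc_id, (score - lo) / (hi - lo)) for doc_id, score in ranking]


def merge_rankings(partitions: List[Ranking], top_k: int = 10) -> Ranking:
    """Merge per-partition rankings into one list, best first.

    Assumption: raw scores from different partitions are not directly
    comparable, so each partition is normalized before merging.
    """
    merged: Dict[str, float] = {}
    for ranking in partitions:
        for doc_id, score in normalize(ranking):
            # Keep the best normalized score if a document appears twice.
            merged[doc_id] = max(merged.get(doc_id, 0.0), score)
    # Sort by descending score, breaking ties by document id.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))[:top_k]


if __name__ == "__main__":
    partition_a = [("a", 10.0), ("b", 5.0)]
    partition_b = [("c", 4.0), ("d", 2.0), ("e", 3.0)]
    for doc_id, score in merge_rankings([partition_a, partition_b]):
        print(doc_id, score)
```

Min-max scaling is only one option. It discards how strong a partition's best match is in absolute terms, so a system with comparable scores across partitions might merge on raw scores instead.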

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1996
Accession Number
ADA460459

Entities

People

  • Tomek Strzalkowski

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Artificial Intelligence
  • Automatic
  • Base Lines
  • Computational Processes
  • Data Analysis
  • Databases
  • Frequency
  • Hot Spots
  • Information Retrieval
  • Language
  • Natural Language Processing
  • Natural Languages
  • Precision
  • Programming Languages
  • South Africa
  • Test And Evaluation

Fields of Study

  • Computer science

Readers

  • Computational Linguistics

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval