CLIR Experiments at Maryland for TREC-2002: Evidence Combination for Arabic-English Retrieval

Abstract

The experiments reported in this paper focused on techniques for combining evidence in cross-language retrieval, searching Arabic documents using English queries. Evidence from multiple sources of translation knowledge was combined to estimate translation probabilities, and four techniques for estimating query-language term weights from document-language evidence were tried. A new technique that exploits translation probability information was found to outperform a comparable technique in which that information was not used. Comparative results for three variants of Arabic light stemming are also presented. A simple variant of an existing stemming algorithm was found to yield significantly better retrieval effectiveness.
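
The abstract does not give formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that translation tables from several sources are merged by a weighted average, and that a query-language term frequency is estimated either by summing the frequencies of all known translations (no probabilities) or by weighting each translation's frequency by its estimated probability. All function names, the source weights, and the placeholder tokens such as "ar_a" are hypothetical.

```python
from collections import defaultdict


def combine_translation_evidence(sources, weights):
    """Merge several {query_term: {doc_term: prob}} tables.

    Assumption: a weighted average followed by renormalisation, so the
    probabilities for each query term sum to 1.
    """
    combined = defaultdict(lambda: defaultdict(float))
    for table, weight in zip(sources, weights):
        for query_term, translations in table.items():
            for doc_term, prob in translations.items():
                combined[query_term][doc_term] += weight * prob

    result = {}
    for query_term, translations in combined.items():
        total = sum(translations.values())
        if total > 0:
            result[query_term] = {t: p / total for t, p in translations.items()}
    return result


def unweighted_tf(query_term, doc_tf, probs):
    """Treat every known translation as equally valid: sum their frequencies."""
    return sum(doc_tf.get(t, 0) for t in probs.get(query_term, {}))


def weighted_tf(query_term, doc_tf, probs):
    """Weight each translation's frequency by its translation probability."""
    return sum(p * doc_tf.get(t, 0) for t, p in probs.get(query_term, {}).items())


# Hypothetical evidence: two translation sources for one English query term.
dictionary = {"bank": {"ar_a": 0.5, "ar_b": 0.5}}
aligned_text = {"bank": {"ar_a": 0.9, "ar_c": 0.1}}
probs = combine_translation_evidence([dictionary, aligned_text], [0.5, 0.5])

# Hypothetical term frequencies in one Arabic document.
doc_tf = {"ar_a": 2, "ar_b": 4}
print(unweighted_tf("bank", doc_tf, probs))
print(weighted_tf("bank", doc_tf, probs))
```

In this toy case the unweighted estimate counts the less likely translation "ar_b" in full, while the weighted estimate discounts it, which is the kind of difference the comparison described in the abstract is concerned with.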

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2002
Accession Number: ADA457575

Entities

People

  • Douglas W. Oard
  • Kareem Darwish

Organizations

  • University of Maryland

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Automatic
  • Dictionaries
  • Equations
  • Failure Analysis
  • Frequency
  • Information Operations
  • Language
  • Machine Translation
  • Maryland
  • Natural Languages
  • Probability
  • Stemming
  • Text Processing
  • Translations
  • Universities
  • Word Lists

Fields of Study

  • Computer science

Readers

  • Computational Linguistics