TDT-2002 Topic Tracking at Maryland: First Experiments with the Lemur Toolkit

Abstract

The University of Maryland submitted six topic tracking runs for the 2002 Topic Detection and Tracking evaluation. Two runs were produced using the Lemur language modeling toolkit; the remaining four were produced using a separate system coded in Perl. The Lemur runs outperformed the Perl runs on the required condition because term frequency information was better handled. Two of the Perl runs used native Arabic orthography with two-best translation based on a statistical lexicon, with results similar to those obtained with the Arabic-to-English translations provided with the collection.

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 2003
Accession Number
ADA459303

Entities

People

  • Daqing He
  • Douglas W. Oard
  • G. C. Murray
  • Hyuk R. Park
  • Michael Subotin

Organizations

  • University of Maryland

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Automated Speech Recognition
  • Boundaries
  • Computers
  • Continents
  • Detection
  • Equations
  • Failure Analysis
  • Information Retrieval
  • Language
  • Maryland
  • Models
  • Probability
  • Standards
  • Training
  • Translations
  • Universities

Readers

  • Brain and Cognitive Science; Experimental Psychology; Cognitive Neuroscience
  • Computational Linguistics
  • Parallel and Distributed Computing