A Comparison of Query-by-Example Methods for Spoken Term Detection

Abstract

In this paper we examine an alternative interface for phonetic search, namely query-by-example, that avoids the out-of-vocabulary (OOV) issues associated with both standard word-based and phonetic search methods. We develop three methods that compare query lattices derived from example audio against a standard n-gram-based phonetic index, and we analyze factors affecting the performance of these systems. We show that the best systems under this paradigm achieve 77% precision when retrieving utterances from conversational telephone speech and returning 10 results from a single query (performance that is better than a similar dictionary-based approach), suggesting significant utility for applications requiring high precision. We also show that these systems can be further improved using relevance feedback: by incorporating four additional queries, the precision of the best system can be improved by 13.7% relative. Our systems perform well despite high phone recognition error rates (> 40%) and make use of no pronunciation or letter-to-sound resources.
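The abstract reports two kinds of figures: precision over the top 10 returned utterances, and a relative (rather than absolute) improvement. The sketch below shows how such quantities are commonly computed. The function names and the toy utterance IDs are hypothetical and chosen only for illustration; they are not taken from the report, and the numbers are not the report's results.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k returned utterances that are relevant.

    ranked_ids: utterance IDs ordered by system score (best first).
    relevant_ids: set of utterance IDs that truly contain the query term.
    """
    top_k = ranked_ids[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for utt in top_k if utt in relevant_ids)
    return hits / len(top_k)


def relative_improvement(baseline, improved):
    """Relative change of `improved` over `baseline` (0.2 means 20% relative)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline


# Toy data (hypothetical): 10 returned utterances, 8 of them relevant.
ranked = [f"utt{i}" for i in range(1, 11)]
relevant = {f"utt{i}" for i in range(1, 9)}

p10 = precision_at_k(ranked, relevant, k=10)
gain = relative_improvement(0.5, 0.6)
```

Note that a relative improvement is measured against the baseline value, so the same relative gain corresponds to a different absolute gain depending on where the baseline sits.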

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2009
Accession Number
ADA514582

Entities

People

  • Christopher White
  • Timothy J. Hazen
  • Wade Shen

Organizations

  • Massachusetts Institute of Technology

Tags

DTIC Thesaurus Topics

  • Algorithms
  • Automated Speech Recognition
  • Department Of Defense
  • Detection
  • Governments
  • Indexes
  • Information Retrieval
  • Language
  • Networks
  • Neurobehavioral Manifestations
  • Precision
  • Probability
  • Probability Distributions
  • Recognition
  • Standards
  • United States
  • United States Government

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Speech Processing/Speech Recognition