Informedia News-On-Demand: Using Speech Recognition to Create a Digital Video Library

Abstract

In theory, speech recognition technology can make any spoken words in video or audio media usable for text indexing, search and retrieval. This article describes the News-on-Demand application created within the Informedia(TM) Digital Video Library project and discusses how speech recognition is used in transcript creation from video, alignment with closed-captioned transcripts, audio paragraph segmentation and a spoken query interface. Speech recognition accuracy varies dramatically depending on the quality and type of data used. Informal information retrieval tests show that reasonable recall and precision can be obtained with only moderate speech recognition accuracy.

Document Details

Document Type
Technical Report
Publication Date
Mar 19, 1998
Accession Number
ADA350404

Entities

People

  • Alexander G. Hauptmann
  • Howard D. Wactlar
  • Michael J. Witbrock

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Accuracy
  • Automated Speech Recognition
  • Computer Science
  • Databases
  • Digital Video
  • Image Processing
  • Information Retrieval
  • Language
  • Markov Models
  • Models
  • Natural Language Processing
  • Natural Languages
  • Probability
  • Recognition
  • Speech Analysis
  • Video
  • Vocabulary

Fields of Study

  • Computer science

Readers

  • Geospatial Intelligence and Artificial Intelligence Analytics
  • Speech Processing/Speech Recognition
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation