Mathematical Modelling for the Evaluation of Automated Speech Recognition Systems--Research Area 3.3.1 (c)

Abstract

Automated speech recognition (ASR) systems are now found more often as components inside other applications than as standalone applications for transcribing speech word-for-word into text. Statistical pattern recognition techniques allow us to derive better task-specific evaluation measures for embedded applications than word error rate (WER), which is used for transcription. Our approach considered two applications of ASR: a decision-support software system for meetings, in which a summary of a meeting is audited to record all of the decisions that were taken during the meeting, and a specific entity identification task, in which an intelligence analyst identifies triples of "who," "where" and "when" for each event described in transcribed broadcast news. Both of these resemble typical activities of intelligence analysts in OSINT processing and production applications. We assessed two task evaluation measures. The first fixes the input and learns to predict human-subject performance as the transcript for the input varies in accuracy. This measure is well suited to developers of ASR systems who wish to measure the effects of modifications they make to their software during development. The second measure does not hold the input fixed and does not require new human-subject data to be collected for new input.

Document Details

Document Type: Technical Report
Publication Date: Jan 07, 2016
Accession Number: AD1008218

Entities

People

Gerald Penn

Organizations

University of Toronto
