Speech Recognition, Articulatory Feature Detection, and Speech Synthesis in Multiple Languages

Abstract

This document summarizes work completed by General Dynamics under work unit 71840871, Speech Interfaces for Multinational Collaboration, from August 2004 to February 2009 under contract FA8650-04-C-6443. The speech technologies developed during this period include speech recognizers, Articulatory Feature (AF) detectors, and speech synthesizers. Speech recognition systems were developed for 15 different languages, and three methods were investigated for improving their performance: vocal tract length normalization, speaker adaptive training, and recognizer output voting error reduction. English AF detectors were developed using Gaussian mixture models, two-class Multi-Layer Perceptrons (MLPs), fusion MLPs, and multi-class MLPs. The outputs of the AF detectors were used to form the feature set for a speech recognizer. Speech synthesis systems were created for 13 different languages, and the following system modifications were investigated: expanding the label set to include additional contextual factors, changing the minimum description length control factor, and applying speaker clustering and adaptation to create new voices. In addition, two graphical user interfaces were developed for training new voices and synthesizing speech in real time.

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2009
Accession Number: ADA519140

Entities

People

  • Brian M. Ore

Organizations

  • General Dynamics

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Adaptive Training
  • Air Force Research Laboratories
  • Automated Speech Recognition
  • Computer Science
  • Contracts
  • Detection
  • Detectors
  • Graphical User Interface
  • Hidden Markov Models
  • Information Systems
  • Language
  • Markov Models
  • Military Research
  • Probability
  • Recognition
  • Training
  • User Interface

Readers

  • Neural Network Machine Learning
  • Software Engineering
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML