Detailed Phonetic Labeling of Multi-language Database for Spoken Language Processing Applications

Abstract

The main objective of this research was to explore and refine methods for detailed phonetic labeling (English or Russian) and character-level labeling (Mandarin). Much of the work involved new front-end signal processing methods designed to improve acoustic-phonetic representations of speech. This resulted in recognition rates for TIMIT (English) of 74%, among the highest reported in the literature. A complete character recognition system for Mandarin was developed and tested. Character recognition rates as high as 88% were obtained, using approximately 40 training databases. For Russian, a system for automatically converting Russian to morphemes, a kind of base syllable, was created and tested. A suite of tools for front-end processing, automatic forced alignment, and a complete automatic recognition system are described.

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2015
Accession Number
ADA614725

Entities

People

  • Stephen A. Zahorian

Organizations

  • Binghamton University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Automated Speech Recognition
  • Coding
  • Computational Science
  • Computer Programming
  • Computers
  • Databases
  • Feature Extraction
  • Hidden Markov Models
  • Information Science
  • Language
  • Operating Systems
  • Regression Analysis
  • Signal Processing
  • Test Methods
  • Two Dimensional

Readers

  • Computational Linguistics
  • Speech Processing/Speech Recognition
  • Systems Analysis and Design