Understanding Tonal Languages

Abstract

This report gives a detailed summary of research completed under Air Force Research Laboratory (AFRL) grant 56236 between November 17, 2010 and November 16, 2012. The main objective was to study methods for Mandarin syllable recognition. Techniques were explored for both base syllable recognition and lexical tone recognition. The RASC863 database, obtained from the Chinese Linguistic Data Consortium, was used for the experimental work. Base syllable phone recognition (60 phones) was done with a Hidden Markov Model recognizer; the best results obtained were approximately 69%. Human listeners were used to establish a baseline for lexical tone recognition. Tone recognition accuracy for humans ranges from about 55% to about 90%, depending on how much context is given to the listeners. The best tone classification accuracy with a neural network classifier is about 76%. The best tone recognition accuracy obtained with a Hidden Markov Model recognizer is about 71%. In addition to the ASR experiments with Mandarin, basic research was done on improved pitch tracking and on refinement of spectral/temporal features (DCTCs/DCSCs). It was determined that much longer time intervals are preferred for dynamic feature calculations than are typically used with MFCC features. Also, the "best" segment intervals for Mandarin feature calculations are somewhat longer than for English.
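
To make the remark about time intervals for dynamic features concrete, the sketch below shows the standard regression-based delta computation commonly paired with MFCC features, with the window length exposed as a parameter. It is an illustration only and not the report's DCTC/DCSC method, which is not described in this abstract. The function names, the `half_width` parameter, and the 10 ms frame shift are assumptions made for this example.

```python
def delta_features(frames, half_width):
    """Regression-based dynamic (delta) features.

    frames: list of per-frame feature vectors (lists of floats).
    half_width: frames used on each side of the center frame
                (hypothetical parameter name for this sketch).
    """
    if half_width < 1:
        raise ValueError("half_width must be at least 1")
    num_frames = len(frames)
    # Normalizer of the least-squares slope estimate: 2 * sum(n^2).
    denom = 2.0 * sum(n * n for n in range(1, half_width + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, half_width + 1):
                # Clamp indices at the edges by repeating the end frames.
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def window_span_ms(half_width, frame_shift_ms=10.0):
    """Time between the first and last frame centers in the window.

    The 10 ms default frame shift is an assumption, not a value from the report.
    """
    return 2 * half_width * frame_shift_ms


# A one-dimensional ramp: each frame rises by 1.0, so the slope is 1.0 per frame.
ramp = [[float(i)] for i in range(10)]
short_window = delta_features(ramp, half_width=2)
long_window = delta_features(ramp, half_width=4)
```

Increasing `half_width` lengthens the interval over which each dynamic feature is estimated, which is the kind of knob the abstract's finding refers to; the specific interval lengths the report found best are not given here.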

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2013
Accession Number
ADA584180

Entities

People

  • Stephen A. Zahorian

Organizations

  • Binghamton University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Air Force
  • Air Force Research Laboratories
  • Automated Speech Recognition
  • Computational Science
  • Databases
  • Electrical Engineering
  • Feature Extraction
  • Frequency Response
  • Hidden Markov Models
  • Language
  • Machine Learning
  • Markov Models
  • Military Research
  • Neural Networks
  • Recognition
  • Signal Processing
  • Time Intervals

Readers

  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation