Investigation on Mandarin Broadcast News Speech Recognition

Abstract

This paper describes the authors' efforts to develop a competitive Mandarin broadcast news speech recognizer. They incorporate the most popular speech technologies into their system and, more importantly, present two novel algorithms: one for smoothing pitch features and one for segmenting Chinese characters into word units. In addition, they propose borrowing the principle of point-wise mutual information to create a Chinese word lexicon automatically. Their final system achieved a 6.0% character error rate (CER) on dev04 and a 16.0% CER on eval04. Although it uses simpler acoustic models, less training data, and a simpler decoding architecture than other state-of-the-art systems, it is equally competitive.
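
The abstract does not spell out how point-wise mutual information (PMI) is applied, so the following is only a minimal sketch of the general principle, not the authors' procedure. It scores adjacent character pairs by PMI(a, b) = log2(P(ab) / (P(a) P(b))); pairs that co-occur more often than chance score higher and are plausible lexicon candidates. The function name `adjacent_pair_pmi`, the toy corpus, and the restriction to two-character units are all assumptions made for illustration.

```python
import math
from collections import Counter


def adjacent_pair_pmi(sentences):
    """Return a dict mapping each adjacent character pair to its PMI (base 2).

    Hypothetical helper for illustration; the report's actual lexicon
    construction is not described in the abstract.
    """
    char_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        char_counts.update(sentence)                      # unigram counts
        pair_counts.update(zip(sentence, sentence[1:]))   # adjacent-pair counts

    total_chars = sum(char_counts.values())
    total_pairs = sum(pair_counts.values())

    pmi = {}
    for (a, b), n in pair_counts.items():
        p_ab = n / total_pairs
        p_a = char_counts[a] / total_chars
        p_b = char_counts[b] / total_chars
        pmi[a + b] = math.log2(p_ab / (p_a * p_b))
    return pmi


# Toy corpus (an assumption, not data from the report).
corpus = ["北京新闻", "北京天气", "新闻天气"]
scores = adjacent_pair_pmi(corpus)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

On this toy corpus the recurring pair 北京 scores higher than the incidental pair 京新, which is the behavior a PMI-based lexicon builder would rely on. A practical system would also need a score threshold and a way to handle words longer than two characters, neither of which is specified here.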

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2006
Accession Number
ADA450339

Entities

People

  • Mei-yuh Hwang
  • Takahiro Shinozaki
  • Wen Wang
  • Xin Lei

Organizations

  • University of Washington

Tags

DTIC Thesaurus Topics

  • Adaptive Training
  • Algorithms
  • Applied Computer Science
  • Artificial Intelligence
  • Automated Speech Recognition
  • Computer Vision
  • Decoding
  • Electrical Engineering
  • Engineering
  • Language
  • Laughter
  • Machine Translation
  • National Security
  • Recognition
  • Test Sets
  • Training
  • Word Lists

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Economics
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation