Acoustic-Phonetic Modeling of Non-Native Speech for Language Identification

Abstract

The aim of this paper is to investigate to what extent non-native speech degrades language identification (LID) performance and to improve it through acoustic adaptation. Our reference LID system is based on a phonotactic approach: it uses language-independent acoustic models and language-specific phone-based bigram language models. Experiments are conducted on the SQALE test database, which contains recordings from native English, French and German speakers, and on the MIST database, which contains non-native speech in the same languages uttered by Dutch speakers. Using 5 seconds of telephone-quality speech, the language identification error rate is 10% for native speech and 28% for non-native speech, a substantial increase in the non-native case. We improve non-native language identification by adapting the acoustic models to the non-native speech.
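
As an illustration of the phonotactic idea described in the abstract, the following minimal Python sketch scores a decoded phone sequence against one phone bigram model per language and picks the best-scoring language. It is not the authors' implementation: the function names (train_bigram, score, identify), the add-alpha smoothing, the toy phone symbols and the language labels lang_A and lang_B are all assumptions made for this example, and the phone decoding step with language-independent acoustic models is assumed to have already taken place.

```python
import math
from collections import Counter


def train_bigram(sequences, phone_set, alpha=1.0):
    """Return a log-probability function for an add-alpha smoothed phone bigram.

    Hypothetical helper: the smoothing scheme is an assumption for this
    sketch, not a detail taken from the report.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the utterance start
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab_size = len(phone_set)

    def log_prob(prev, cur):
        num = pair_counts[(prev, cur)] + alpha
        den = context_counts[prev] + alpha * vocab_size
        return math.log(num / den)

    return log_prob


def score(seq, log_prob):
    """Sum the bigram log-probabilities over a decoded phone sequence."""
    padded = ["<s>"] + list(seq)
    return sum(log_prob(p, c) for p, c in zip(padded, padded[1:]))


def identify(seq, models):
    """Pick the language whose bigram model gives the highest score."""
    return max(models, key=lambda lang: score(seq, models[lang]))


# Toy phone inventory and training sequences (invented for illustration).
PHONES = {"a", "b", "c"}
models = {
    "lang_A": train_bigram([["a", "b", "a", "b"], ["b", "a", "b", "a"]], PHONES),
    "lang_B": train_bigram([["a", "a", "c", "c"], ["c", "c", "a", "a"]], PHONES),
}

guess_1 = identify(["a", "b", "a"], models)
guess_2 = identify(["c", "c", "a"], models)
```

In this sketch the same phone inventory is shared by all languages, so only the bigram statistics differ between models, which mirrors the split between language-independent acoustic models and language-specific phonotactic models stated in the abstract.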

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 2000
Accession Number
ADP010379

Entities

People

  • C. Barras
  • E. Bilinski
  • E. Geoffrois
  • M. Adda-Decker
  • R. Wanneroy

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Databases
  • Decoding
  • Foreign Languages
  • Identification
  • Image Processing
  • Language
  • Materials
  • Message Processing
  • Probability
  • Recognition
  • Sequences
  • Technical Information Centers
  • Test And Evaluation
  • Training

Fields of Study

  • Linguistics

Readers

  • Computational Linguistics
  • Speech Processing/Speech Recognition