Vowel System Modeling: A Complement to Phonetic Modeling in Language Identification

Abstract

Most Automatic Language Identification systems are based on phonotactic approaches. However, it is increasingly evident that taking other features (phonetic, phonological, prosodic, etc.) into account will improve performance. This paper presents an unsupervised phonetic approach that takes into account phonological cues related to the structure of vocalic and consonantal systems. In this approach, unsupervised vowel/non-vowel detection is used to model the vocalic and consonantal systems separately with Gaussian Mixture Models, which are initialized with a data-driven variant of the LBG algorithm: the LBG-Rissanen algorithm. On 5 languages from the OGI MLTS corpus, in a closed-set identification task, the system reaches 85% correct identification on 45-second utterances from male speakers. Using vowel system modeling as a complement to unsupervised phonetic modeling raises this performance to 91% while still requiring no labeled data.
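As a rough illustration of the closed-set decision described in the abstract, the sketch below scores an utterance against one diagonal-covariance Gaussian Mixture Model per language and picks the best-scoring language. It is not the authors' implementation: the function names, the toy two-language models, and in particular the weighted-sum fusion of the vowel-system score with the phonetic score (the `alpha` parameter) are assumptions made for the example, since the abstract does not state how the two models are combined. Vowel/non-vowel detection and LBG-Rissanen initialization are not shown.

```python
import math

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)

def identify(vowel_frames, all_frames, vowel_models, phonetic_models, alpha=0.5):
    """Closed-set decision: return the language with the highest fused score.

    The linear fusion weight `alpha` is a hypothetical choice for this sketch.
    """
    scores = {}
    for lang in vowel_models:
        v = gmm_avg_loglik(vowel_frames, *vowel_models[lang])
        p = gmm_avg_loglik(all_frames, *phonetic_models[lang])
        scores[lang] = alpha * v + (1 - alpha) * p
    return max(scores, key=scores.get), scores

# Toy models (weights, means, variances) for two made-up languages.
toy_vowel = {
    "A": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "B": ([1.0], [[3.0, 3.0]], [[1.0, 1.0]]),
}
toy_phonetic = {
    "A": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "B": ([0.5, 0.5], [[3.0, 3.0], [4.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
frames = [[0.1, -0.2], [0.3, 0.0]]  # stand-in for acoustic feature vectors
best, scores = identify(frames, frames, toy_vowel, toy_phonetic)
print(best, scores)
```

In a real system the vowel frames would come from the vowel/non-vowel detector and the models would be trained per language; here both inputs reuse the same toy frames only to keep the example short.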

Document Details

Document Type: Technical Report
Publication Date: Aug 01, 2000
Accession Number: ADP010396

Entities

People

Francois Pellegrino
Jerome Farinas
Regine Andre-Obrecht
