Bottle-Neck Feature Extraction Structures for Multilingual Training and Porting (Pub Version, Open Access)

Abstract

Stacked-Bottle-Neck (SBN) feature extraction is a crucial part of modern automatic speech recognition (ASR) systems. The SBN network traditionally contains a hidden layer between the bottle-neck (BN) layer and the output layer. Recently, we have observed that an SBN architecture without this hidden layer (i.e. with a direct connection from the BN layer to the output layer) performs better for a single language but fails in scenarios where a network pre-trained in a multilingual fashion is ported to a target language. In this paper, we describe two strategies that allow the direct-connection SBN network to benefit from pre-training with a multilingual net: (1) pre-training the multilingual net with the hidden layer, which is then discarded before porting to the target language, and (2) using only the direct-connection SBN with triphone targets both in multilingual pre-training and in porting to the target language. The results are reported on IARPA-BABEL limited language pack (LLP) data.
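
Strategy (1) can be pictured as a change in layer topology between pre-training and porting. The sketch below is illustrative only and is not taken from the report: it models a single bottle-neck network (not the full stacked pair), and the layer names, the helper functions `build_topology` and `port_to_target`, and all layer sizes are assumptions chosen for the example.

```python
# Illustrative sketch only: names and sizes are assumptions, not values from the report.

def build_topology(n_inputs, n_bn, n_targets, n_hidden=1500, with_post_bn_hidden=True):
    """Return an ordered list of (layer_name, size) pairs for one bottle-neck net."""
    layers = [("input", n_inputs), ("hidden_pre_bn", n_hidden), ("bn", n_bn)]
    if with_post_bn_hidden:
        # Traditional layout: a hidden layer sits between the BN and output layers.
        layers.append(("hidden_post_bn", n_hidden))
    layers.append(("output", n_targets))
    return layers


def port_to_target(multilingual_layers, n_target_outputs):
    """Strategy (1): drop the post-BN hidden layer and attach a new output layer."""
    kept = [(name, size) for name, size in multilingual_layers
            if name not in ("hidden_post_bn", "output")]
    # The target-language output layer connects directly to the BN layer.
    kept.append(("output", n_target_outputs))
    return kept


# Hypothetical sizes: multilingual net with many targets, ported to one language.
multilingual = build_topology(n_inputs=100, n_bn=80, n_targets=5000)
ported = port_to_target(multilingual, n_target_outputs=120)
print([name for name, _ in ported])
```

In this picture, everything up to and including the BN layer is carried over from the multilingual net, while the layers after it are replaced for the target language.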

Document Details

Document Type
Technical Report
Publication Date
May 03, 2016
Accession Number
AD1040156

Entities

People

  • Frantisek Grezl
  • Martin Karafiat

Organizations

  • Brno University of Technology

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Automated Speech Recognition
  • Coefficients
  • Computer Languages
  • Computer Science
  • Computer Vision
  • Computers
  • Czech Republic
  • Department Of Defense
  • Extraction
  • Feature Extraction
  • Hierarchies
  • Language
  • Natural Language Processing
  • Neural Networks
  • Recognition
  • Topology
  • Training

Fields of Study

  • Computer science

Readers

  • Marksmanship and Weaponry
  • Parallel and Distributed Computing
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation
  • AI & ML - Neural Networks