Code-switched English Pronunciation Modeling for Swahili Spoken Term Detection (Pub Version, Open Access)

Abstract

We investigate modeling strategies for English code-switched words as found in a Swahili spoken term detection (STD) system. Code switching, where speakers switch languages within a conversation, occurs frequently in multilingual environments and typically degrades STD performance. The analysis is performed in the context of the IARPA Babel program, which focuses on rapid STD system development for under-resourced languages. Our results show that approaches that specifically target the modeling of code-switched words significantly improve the detection performance of these words.

Document Details

Document Type
Technical Report
Publication Date
May 03, 2016
Accession Number
AD1040155

Entities

People

  • Charl Van Heerden
  • Daniel Van Niekerk
  • Marelie Davel
  • Neil Kleynhans
  • Rich Schwartz
  • Stavros Tsakalidis
  • William Hartmann

Organizations

  • North-West University

Tags

DTIC Thesaurus Topics

  • Automated Speech Recognition
  • Computer Science
  • Computers
  • Department Of Defense
  • Detection
  • Dictionaries
  • Electronic Mail
  • Identification
  • Language
  • Natural Language Processing
  • Sequences
  • South Africa
  • Standards
  • Switching
  • Test And Evaluation
  • Training
  • Word Lists

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Computational Modeling and Simulation
  • Speech Processing/Speech Recognition