Pocketsphinx: A Free, Real-Time Continuous Speech Recognition System for Hand-Held Devices

Abstract

The availability of real-time continuous speech recognition on mobile and embedded devices has opened up a wide range of research opportunities in human-computer interactive applications. Unfortunately, most of the work in this area to date has been confined to proprietary software, or has focused on limited domains with constrained grammars. In this paper, we present a preliminary case study on the porting and optimization of CMU SPHINX-II, a popular open-source large vocabulary continuous speech recognition (LVCSR) system, to hand-held devices. The resulting system runs at an average of 0.87 times real-time on a 206 MHz device, 8.03 times faster than the baseline system. To our knowledge, this is the first hand-held LVCSR system available under an open-source license.
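
The speed figures in the abstract can be related with a little arithmetic. The sketch below is illustrative only: it assumes that "times real-time" means processing time divided by audio duration, and that "8.03 times faster" means the baseline took 8.03 times as long per second of audio. The function names are hypothetical and are not part of Pocketsphinx.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def implied_baseline_factor(optimized_factor: float, speedup: float) -> float:
    """Baseline real-time factor, assuming speedup = baseline / optimized."""
    return optimized_factor * speedup


# Figures reported in the abstract.
reported_factor = 0.87
reported_speedup = 8.03

# Under the assumptions above, the baseline ran at roughly 7 times real-time.
baseline = implied_baseline_factor(reported_factor, reported_speedup)
print(f"implied baseline: {baseline:.2f} x real-time")

# Example: 8.7 s of processing for 10 s of audio gives the reported factor.
print(f"example factor: {real_time_factor(8.7, 10.0):.2f}")
```

Under these assumptions, the baseline system would have needed about 7 seconds of processing for each second of speech, while the optimized system needs less than one.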

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2006
Accession Number
ADA456834

Entities

People

  • Alan W. Black
  • Alex I. Rudnicky
  • Arthur Chan
  • David Huggins-Daines
  • Mohit Kumar
  • Mosur Ravishankar

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Abstracts
  • Algorithms
  • Automated Speech Recognition
  • Compilers
  • Computations
  • Computer Programming
  • Computer Programs
  • Computers
  • Floating Point Operations
  • Grammars
  • Language
  • Linguistics
  • Natural Language Processing
  • Operating Systems
  • Optimization
  • Recognition
  • Vocabulary

Readers

  • Database Systems and Applications
  • Distributed Systems and Data Platform Development
  • Speech Processing/Speech Recognition

Technology Areas

  • AI & ML