Real-Time Speaker Detection for User-Device Binding

Abstract

This thesis explores the accuracy and utility of the Modular Audio Recognition Framework (MARF), a framework for recognizing a speaker by his or her voice. Accuracy was tested against the MIT Mobile Speaker corpus along three axes: 1) number of training sets per speaker, 2) testing sample length, and 3) environmental noise. Testing showed that the number of training samples per speaker had little impact on performance. It also showed that MARF was successful with testing samples as short as 1000 ms. Finally, testing revealed that MARF had difficulty with testing samples containing significant environmental noise. An application of MARF, namely a referentially transparent calling service, is described. Use of this service is considered for both military and civilian applications, specifically by a Marine platoon or a disaster-response team. Limitations of the service and how it might benefit from advances in hardware are outlined.

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 2010
Accession Number
ADA536427

Entities

People

  • Mark J. Bergem

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Biomedical
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Authentication
  • Automated Speech Recognition
  • Cellular Networks
  • Communication Systems
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • Digital Signal Processing
  • Feature Extraction
  • Mobile Communications
  • Mobile Devices
  • Mobile Phones
  • Network Science
  • Operating Systems
  • Pattern Recognition
  • Smartphones

Readers

  • Distributed Systems and Data Platform Development
  • Speech Processing/Speech Recognition