Dialect Identification

Abstract

The objective of this work was to develop and evaluate a capability to automatically determine the dialect spoken in samples of recorded speech. Language identification (LID) software automatically determines the language spoken in samples of recorded speech. ITTI had developed several LID programs when the testing phase of this effort began, the two most important being a "speaker dependent" and a "speaker independent" version. Both of these programs served as baseline systems. In broad terms, this effort had two objectives: testing the pre-existing LID algorithms on a dialect identification (DID) task, and subsequently developing a baseline system to improve its DID performance. The testing was specifically required to assess the effect on DID performance (i.e., accuracy) of operating parameters known to be important in tactical applications of speech-related automatic recognition algorithms, including speech segment duration, signal-to-noise ratio (SNR), bandwidth, amount of available dialect sample data, and spectral tilt variations such as those introduced by various communications channels. At the outset of the effort, ITTI was directed to consider wholly new approaches to DID, apart from the techniques previously found useful for automatic LID and incorporated in the pre-existing LID programs. Priority was to be given to the findings of academic dialectologists, by examining the technical dialectology literature in hopes of finding additional, new approaches to automatic DID. A literature survey, together with an analysis of the potential of its findings for automatic DID, was therefore also a high-priority objective of this effort.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 1999
Accession Number
ADA365980

Entities

People

  • Alan Higgins
  • Jack Porter
  • K. P. Li
  • Peter Benson

Tags

Communities of Interest

  • C4I
  • Human Systems

DTIC Thesaurus Topics

  • Accuracy
  • Air Force
  • Air Force Research Laboratories
  • Command And Control
  • Data Sets
  • Dimensionality Reduction
  • Feature Extraction
  • Frequency
  • Geography
  • Hispanics
  • Identification
  • Information Systems
  • Language
  • Literature Surveys
  • Recognition
  • Signal Processing
  • United States

Readers

  • Speech Processing/Speech Recognition
  • Systems Analysis and Design