2-D Processing of Speech for Multi-Pitch Analysis

Abstract

This paper introduces a two-dimensional (2-D) processing approach for the analysis of multi-pitch speech sounds. Our framework invokes the short-space 2-D Fourier transform magnitude of a narrowband spectrogram, mapping harmonically related signal components to multiple concentrated entities in a new 2-D space. First, localized time-frequency regions of the spectrogram are analyzed to extract pitch candidates. These candidates are then combined across multiple regions to obtain separate pitch estimates for each speech-signal component at a single point in time. We refer to this as multi-region analysis (MRA). Because MRA explicitly accounts for pitch dynamics within localized time segments, the separability it provides is distinct from that obtained with the short-time autocorrelation methods typically employed in state-of-the-art multi-pitch tracking algorithms. We illustrate the feasibility of MRA for multi-pitch estimation on mixtures of synthetic and real speech.
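
The core operation named in the abstract, a 2-D Fourier transform magnitude taken over a localized region of a narrowband spectrogram, can be sketched as follows. This is only an illustrative sketch and not the report's implementation: it uses a single synthetic source with constant pitch rather than a multi-pitch mixture, and the sampling rate, window lengths, patch size, zero-padding, pitch search range, and the function names `narrowband_spectrogram` and `patch_pitch_estimate` are all assumptions chosen for the example. The harmonic lines inside the patch form a pattern that is periodic along frequency, so the 2-D transform concentrates it into a peak whose position encodes the harmonic spacing.

```python
import numpy as np

def narrowband_spectrogram(x, win_len=256, hop=8, nfft=512):
    """Magnitude STFT with a long (narrowband) Hamming window; shape (freq, time)."""
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    return np.abs(np.fft.rfft(frames, n=nfft, axis=0))

def patch_pitch_estimate(spec, fs, nfft, f_bins, t_frames,
                         pitch_range=(80.0, 400.0), pad=512):
    """Estimate one pitch from the 2-D Fourier magnitude of a spectrogram patch."""
    patch = spec[f_bins, t_frames]
    patch = patch - patch.mean()                      # suppress the DC term
    n_f, n_t = patch.shape
    window2d = np.outer(np.hamming(n_f), np.hamming(n_t))
    # Zero-pad along frequency for a finer grid of harmonic-spacing candidates.
    mag = np.abs(np.fft.fft2(patch * window2d, s=(pad, n_t)))

    bin_hz = fs / nfft                                # width of one spectrogram bin
    f_min, f_max = pitch_range
    # A pitch f0 gives a harmonic spacing of f0 / bin_hz bins, which maps to
    # row index pad * bin_hz / f0 of the 2-D transform.
    k_min = int(np.ceil(pad * bin_hz / f_max))
    k_max = int(np.floor(pad * bin_hz / f_min))
    region = mag[k_min : k_max + 1, :]
    row, _ = np.unravel_index(np.argmax(region), region.shape)
    return bin_hz * pad / (k_min + row)

# Synthetic harmonic source with constant pitch (assumed values for illustration).
fs, f0_true = 8000, 200.0
t = np.arange(int(0.5 * fs)) / fs
x = sum(np.cos(2 * np.pi * k * f0_true * t) for k in range(1, 20))

spec = narrowband_spectrogram(x)
# One localized region: bins 16..79 (250 Hz to 1250 Hz), frames 100..131.
f0_est = patch_pitch_estimate(spec, fs, 512, slice(16, 80), slice(100, 132))
print(f"estimated pitch: {f0_est:.1f} Hz")
```

For this stationary input the estimate should land near the true 200 Hz, limited by the coarse grid of the zero-padded transform. With a changing pitch the peak would move off the zero time-modulation axis, and a mixture of two talkers would be expected to produce more than one peak per region; combining such candidates across regions is the multi-region step the abstract describes, which this sketch does not attempt.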


Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2009
Accession Number
ADA519574

Entities

People

  • Thomas F. Quatieri
  • Tianyu T. Wang

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Abstracts
  • Autocorrelation
  • Bandwidth
  • Clustering
  • Contracts
  • Department Of Defense
  • Dynamics
  • Frequency
  • Frequency Bands
  • Governments
  • Hidden Markov Models
  • Markov Models
  • Measurement
  • Narrowband
  • Two Dimensional
  • United States
  • United States Government

Fields of Study

  • Engineering

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Parasitology and Pharmacology of Malaria.
  • Speech Processing/Speech Recognition.

Technology Areas

  • Space
  • Space - Space Objects