The Mixer Corpus of Multilingual, Multichannel Speaker Recognition Data

Abstract

This paper describes efforts to create corpora to support and evaluate systems that perform speaker recognition where channel and language may vary. Beyond the ongoing evaluation of speaker recognition systems, these corpora are aimed at the bilingual and cross-channel dimensions. We report on specific data collection efforts at the Linguistic Data Consortium and the research ongoing at the US Federal Bureau of Investigation and MIT Lincoln Laboratory. We cover the design and requirements, the collections, and the final properties of the corpus, integrating discussions of data preparation, research, technology development, and evaluation on a grand scale.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2004
Accession Number
ADA523534

Entities

People

  • Christopher Cieri
  • David A. B. Miller
  • Hirotaka Nakasone
  • Joseph P. Campbell
  • Kevin Walker

Organizations

  • University of Pennsylvania

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Attrition
  • Automatic
  • Computers
  • Consortiums
  • English Language
  • Foreign Languages
  • Governments
  • Identification
  • Language
  • Microphones
  • Mobile Phones
  • Multichannel
  • Recognition
  • Recording Systems
  • Test And Evaluation
  • United States
  • United States Government

Readers

  • Distributed Systems and Data Platform Development
  • Speech Processing/Speech Recognition
  • Technical Research and Report Writing

Technology Areas

  • AI & ML
  • AI & ML - Machine Translation