Collection of Spontaneous Speech for the ATIS Domain and Comparative Analyses of Data Collected at MIT and TI

Abstract

As part of our development of a spoken language system in the ATIS domain, we have begun a small-scale effort to collect spontaneous speech data. Our procedure differs from the one used at Texas Instruments (TI) in many respects, the most important being its reliance on an existing system, rather than a wizard, to participate in data collection. Over the past few months, we have collected over 3,600 spontaneously generated sentences from 100 subjects. This paper documents our data collection process and compares our data with those collected at TI. The advantages and disadvantages of this method of data collection are also discussed.

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1991
Accession Number
ADA460594

Entities

People

  • Joseph Polifroni
  • Stephanie Seneff
  • Victor W. Zue

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Cyber

DTIC Thesaurus Topics

  • Computer Science
  • Computers
  • Contractors
  • Data Sets
  • Databases
  • Demographic Cohorts
  • Department Of Defense
  • Errors
  • Governments
  • Human-Machine Interaction
  • Information Security
  • Language
  • Natural Languages
  • Test And Evaluation
  • Test Sets
  • Training
  • United States

Readers

  • Clinical Trial Research
  • Computational Linguistics
  • Systems Analysis and Design