Multimillion Word Data Bases: A Preliminary Report. Volume 2.

Abstract

Cumulative statistics are provided on word distribution and word type for a three million word data base. Consonant clusters, word length, and letter frequencies are given for the traditional natural language portion of the vocabulary. Volume one presents comparable statistics for three different one million word data bases, as well as the statistics for a two million word corpus.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 1974
Accession Number
AD0777210

Entities

People

  • Paul H. Klingbiel

Organizations

  • Defense Technical Information Center

Tags

DTIC Thesaurus Topics

  • Consonants
  • Data Science
  • Databases
  • Frequency
  • Information Science
  • Language
  • Linguistics
  • Natural Languages
  • Statistics
  • Vocabulary
  • Words (Language)

Readers

  • Government Contracting/Procurement
  • Speech Processing/Speech Recognition