TWO DICTIONARY TRANSCRIPTS AND PROGRAMS FOR PROCESSING THEM. VOLUME I. THE ENCODING SCHEME, PARSENT AND CONIX.

Abstract

The encoding scheme used to transcribe Webster's Seventh New Collegiate Dictionary (W7) and The New Merriam-Webster Pocket Dictionary (MPD) onto magnetic tape is described in full detail. Each dictionary is available in two versions. The 'unparsed' transcripts are typographically faithful encodings of the printed text. The 'parsed' transcripts are the output obtained by processing the unparsed transcripts with PARSENT, a program that analyzes each dictionary entry into parts such as the pronunciation, the part-of-speech (or functional) label, the etymology, sense division markers (or dividers), definitions, verbal illustrations, usage notes, and synonymy paragraphs. The formats of both the unparsed and parsed transcripts are also described, as is the format of the concordance index obtained for each dictionary by processing its parsed transcript with the CONIX program. (Author)

Document Details

Document Type: Technical Report
Publication Date: Jun 15, 1969
Accession Number: AD0691098

Entities

People

  • James Paris
  • John Olney
  • Richard Reichert

Organizations

  • System Development Corporation

Tags

DTIC Thesaurus Topics

  • Coding
  • Dictionaries
  • Magnetic Tape
  • Tapes

Readers

  • Computational Linguistics