ON CHARACTER SET REDUCTION,

Abstract

Character sets of languages can be reduced by representing several different characters by one in case this does not cause ambiguities. For finite-state languages the best reductions can always be found, whereas for context-free languages the problem is, in general, unsolvable. Easily invertible reductions are obtained, however, if the language is embedded into a finite-state language. A method for this is suggested which uses only information on the possible adjacent character pairs. The parsing problem of the reduced language can be returned to that of the original. The implications of this to precedence grammars are discussed. Finally, results of some programs for character set reductions are demonstrated. (Author)

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 1965
Accession Number
AD0473759

Entities

People

  • Reino Kurki-suonio

Organizations

  • Carnegie Institute of Technology

Tags

DTIC Thesaurus Topics

  • Ambiguity
  • Grammars
  • Language
  • Linguistics
  • Personality

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Mathematical Modeling and Probability Theory.