NATURAL LANGUAGE IN COMPUTER FORM

Abstract

This Memorandum describes a scheme for recording text in computer-usable form in such a way that all meaningful typographical distinctions are represented in a standard way. Provision is made for texts in different languages and different alphabets and for subsidiary material such as parallel translations and comments of interest to users and librarians. The basic set of encoding conventions is indefinitely extensible to accommodate new kinds of material. Very large bodies of data require special facilities, and these have been provided by embedding the text encoding scheme in a general file maintenance system. Computer programs are described which simplify conversion of text from these various sources into the standard format. The final section discusses the problem of printing text which has been recorded in the standard format and describes a flexible program for doing this.

Document Details

Document Type
Technical Report
Publication Date
Feb 01, 1965
Accession Number
AD0456948

Entities

People

  • Martin Kay
  • Theodore Ziehe

Organizations

  • RAND Corporation

Tags

DTIC Thesaurus Topics

  • Coding
  • Computer Programming
  • Computer Programs
  • Computers
  • Data Processing
  • Decoding
  • Government Procurement
  • Hard Copy
  • Language
  • Linguistics
  • Magnetic Tape
  • Materials
  • Natural Languages
  • Plastic Explosives
  • Punched Cards
  • Standards
  • Typewriters

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Library and Information Science
  • Systems Analysis and Design