NATURAL LANGUAGE IN COMPUTER FORM
Abstract
This Memorandum describes a scheme for recording text in computer-usable form so that all meaningful typographical distinctions are represented in a standard way. Provision is made for texts in different languages and different alphabets and for subsidiary material such as parallel translations and comments of interest to users and librarians. The basic set of encoding conventions is indefinitely extensible to accommodate new kinds of material. Very large bodies of data require special facilities, and these have been provided by embedding the text encoding scheme in a general file maintenance system. Computer programs are described that simplify conversion of text from various sources into the standard format. The final section discusses the problem of printing text that has been recorded in the standard format and describes a flexible program for doing this.
Document Details
- Document Type: Technical Report
- Publication Date: Feb 01, 1965
- Accession Number: AD0456948
Entities
People
- Martin Kay
- Theodore Ziehe
Organizations
- RAND Corporation