RAND Corporation Data in Systran. Volume 2.

Abstract

Presents some empirical linguistic findings based on a million-word Russian corpus with syntactic annotations. The corpus, consisting of Russian texts in mathematics, physics, cybernetics, astrobotany and physiology, was produced by the Rand Corp., Santa Monica, California, and converted for use by SYSTRAN language-analysis processing procedures. Since all syntagmas are explicitly marked in the Rand data base, little or no contextual reference is necessary to establish the semosyntactic relationships that may be utilized as the most essential components of an automatic parser for S+T text. Volume II deals with text statistics, the bulk of which consists of high-frequency wordlists in descending frequency order as well as in alphabetical order, for both individual and combined subject matters. (Modified author abstract)

Document Details

Document Type: Technical Report
Publication Date: Aug 01, 1973
Accession Number: AD0769560

Entities

People

Ludek A. Kozlik
Peter P. Toma
