Scientific and Technical Information. Series 2. Information Processes and Systems. Number 6, 1969 (Selected Portions)
Abstract
This report describes the methods and results of a study to develop an information retrieval (IR) language for automatic systems that handle polytechnical documents. The descriptor dictionary includes both general and special terms, expressed as words and phrases, which improves recall and precision. It comprises a classified index and a lexico-semantic index, as well as tables of generic and specific relations. The dictionary contains 5,542 descriptors and 3,073 keywords. Document indexing proceeds in two steps: analysis of the document content and its description in natural-language words, followed by formation of the search pattern using the descriptor dictionary. Techniques are described for analyzing a document from different conceptual aspects, which constitute the elements of a formalized model of its condensed content. Conversion into the IR language uses words from both the title and the text of a document. The essentials of the statistical technique used to evaluate retrieval efficiency are set forth. Tests on a multi-subject collection indicated that a system could operate at 85 percent recall and 7 percent relevance, with a standard deviation of 25 percent. (Author)
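The abstract reports recall and relevance figures with a standard deviation but does not give the formulas used. As a minimal sketch only, the following assumes that "relevance" corresponds to what is now usually called precision, and that the figures are averaged over test queries. The function names and document identifiers are hypothetical and are not taken from the report.

```python
from statistics import mean, pstdev


def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    recall    = relevant documents retrieved / all relevant documents
    precision = relevant documents retrieved / all retrieved documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


def summarize(queries):
    """Average recall and precision over (retrieved, relevant) pairs.

    Uses the population standard deviation; whether the report used the
    population or the sample form is not stated, so this is an assumption.
    """
    pairs = [recall_precision(ret, rel) for ret, rel in queries]
    recalls = [r for r, _ in pairs]
    precisions = [p for _, p in pairs]
    return {
        "mean_recall": mean(recalls),
        "sd_recall": pstdev(recalls),
        "mean_precision": mean(precisions),
        "sd_precision": pstdev(precisions),
    }


# Hypothetical test queries: (documents retrieved, documents judged relevant)
example_queries = [
    (["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"]),
    (["d7"], ["d7"]),
]
summary = summarize(example_queries)
```

Reporting a spread alongside the mean, as the abstract does, shows how much retrieval quality varies from one query to the next on a multi-subject collection.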
Document Details
- Document Type: Technical Report
- Publication Date: Feb 17, 1971
- Accession Number: AD0724977
Entities
People
- Yu. I. Shemakin
Organizations
- National Air and Space Intelligence Center