Science and Technology Text Mining: Origins of Database Tomography and Multi-Word Phrase Clustering
Abstract
This report first describes the motivations for co-word analysis in support of research policy formulation and research implementation evaluation. It compares co-word analysis with other co-occurrence techniques such as co-citation and co-nomination analyses. It then traces the origins of co-word analysis in computational linguistics, describes in detail the development of co-word analysis for research evaluation, and concludes by presenting a new approach to co-word analysis for research evaluation (Database Tomography). The report shows that this new approach, which requires no index or key words but works on the text directly, is a useful tool for scanning large bodies of text. It can identify pervasive thrust areas and their interrelationships, and it serves as a starting point for further in-depth analysis of the text. Its value increases as the text grows in size and as the breadth of topical areas it covers exceeds the expertise of a moderate number of expert panels. A single-link clustering example is shown that represents the first use of multi-word technical phrases in modern clustering. (75 refs.)
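To make the idea of co-word analysis on raw text concrete, the following is a minimal Python sketch, not the report's Database Tomography procedure. It assumes that phrases are plain word n-grams, that co-occurrence is counted per document, and that single-link clustering joins any two phrases whose co-occurrence count reaches a threshold. The function names (`phrases`, `cooccurrence`, `single_link_clusters`) and the sample texts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def phrases(text, n=2):
    """Return the set of n-word phrases (word n-grams) found in one text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def cooccurrence(docs, n=2):
    """Count, for each phrase pair, how many documents contain both phrases."""
    pair_counts = Counter()
    for doc in docs:
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(phrases(doc, n)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def single_link_clusters(pair_counts, threshold):
    """Group phrases joined by at least one link with count >= threshold.

    Implemented with a small union-find structure. Phrases with no
    qualifying link are left out of the result.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), count in pair_counts.items():
        if count >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in clusters.values())


# Assumed sample texts, for illustration only.
docs = [
    "text mining research evaluation",
    "text mining research policy",
    "neural network training",
]
counts = cooccurrence(docs)
print(single_link_clusters(counts, threshold=2))
```

With a threshold of 2, only the two phrases that co-occur in both of the first two texts are linked. Lowering the threshold to 1 chains every phrase in those texts into one cluster, which illustrates the chaining behavior characteristic of single-link clustering.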
Document Details
- Document Type: Technical Report
- Publication Date: Aug 15, 2003
- Accession Number: ADA416268
Entities
People
- Ronald Neil Kostoff
Organizations
- Office of Naval Research