Science and Technology Text Mining: Comparative Analysis of the Research Impact Assessment Literature and the Journal of the American Chemical Society

Abstract

This report shows how Database Tomography can be used to derive technical intelligence from the published literature. Database Tomography is a patented system for analyzing large volumes of computerized text. It includes algorithms for extracting multi-word phrase frequencies and for performing phrase proximity analyses. Phrase frequency analysis identifies the pervasive themes of a database; phrase proximity analysis identifies the relationships among those themes and between the themes and their sub-themes. One potential application is to obtain the thrusts and interrelationships of a technical field from the papers published within that field. This report applies Database Tomography to both the non-technical field of Research Impact Assessment (RIA) and the technical field of Chemistry. A database of relevant RIA articles was analyzed to produce the characteristics and key features of the RIA field. The analysis identifies the journals that carry the most RIA papers, the institutions most active in RIA, the keywords most often specified by authors, and the authors whose works are cited most frequently. The pervasive themes of RIA are identified through multi-word phrase analyses of the database, and the relationships among the pervasive themes and between the pervasive themes and sub-themes are identified through phrase proximity analyses. A similar process was applied to Chemistry, except that the database was limited to one year's issues of the Journal of the American Chemical Society. Wherever possible, the RIA and Chemistry results were compared. Finally, the conceptual use of Database Tomography to help identify promising research directions is discussed. (11 refs.)
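
The two analyses named in the abstract can be illustrated with a minimal Python sketch. This is not the patented Database Tomography algorithm, whose details are not given in this record; it is a generic n-gram counter and fixed-window co-occurrence counter. The function names, the tokenization rule, and the `window` parameter are all assumptions introduced for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenization: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_frequencies(tokens, n=2):
    # Count every contiguous n-word phrase; the most frequent phrases
    # stand in for the "pervasive themes" of the text.
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def phrase_proximity(tokens, theme, window=5):
    # Count the words found within `window` tokens on either side of each
    # occurrence of the theme phrase (hypothetical stand-in for a
    # phrase proximity analysis).
    theme_words = theme.lower().split()
    k = len(theme_words)
    neighbors = Counter()
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] == theme_words:
            lo = max(0, i - window)
            hi = min(len(tokens), i + k + window)
            neighbors.update(tokens[lo:i])
            neighbors.update(tokens[i + k:hi])
    return neighbors
```

In this sketch, calling `phrase_frequencies(tokens, 2).most_common(10)` on a tokenized corpus would list candidate themes, and `phrase_proximity(tokens, "peer review")` would list the words that tend to appear near one chosen theme. A practical analysis would also need stop-word filtering and phrase-level rather than word-level neighbors, which are omitted here for brevity.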

Document Details

Document Type
Technical Report
Publication Date
Aug 15, 2003
Accession Number
ADA416267

Entities

People

  • Darrel R. Toothman
  • Henry J. Eberhart
  • Robert Pellenbarg
  • Ronald Neil Kostoff

Organizations

  • Office of Naval Research

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Chemical Engineering
  • Chemistry
  • Composite Materials
  • Computer Science
  • Crystal Structure
  • Data Mining
  • Databases
  • Geography
  • Health Services
  • Information Processing
  • Information Science
  • Magnetic Resonance
  • Materials
  • Materials Science
  • Molecular Dynamics
  • Technical Intelligence

Readers

  • Aerospace Engineering
  • Computational Linguistics
  • Technical Research and Report Writing