An Integrated Suite of Text and Data Mining Tools - Phase II

Abstract

This report summarizes the results of a three-year SBIR project to develop an integrated suite of text and data mining tools. The goal of the project is to provide tools that help analysts find connections between requirements (as expressed in requirements documents or databases) and open-source research literature. An overall approach is outlined, and a step-by-step overview of the work is presented. The tool suite includes parsers for text data sources, metadata extraction, record combining, entity extraction, data normalization, sub- and cross-dataset analysis, multi-field analysis and visualizations, feature selection, XML importers, and indirect link analysis. A set of recommendations for expanding the use of the tools is presented.

Document Details

Document Type
Technical Report
Publication Date
Aug 30, 2005
Accession Number
ADA437173

Entities

People

  • Alan L. Porter
  • Brian S. Minsk
  • Paul R. Frey

Tags

Communities of Interest

  • Autonomy
  • Biomedical
  • C4I
  • Engineered Resilient Systems
  • Ground and Sea Platforms
  • Space
  • Weapons Technologies

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Automata Theory
  • Cognitive Science
  • Computational Science
  • Computer Languages
  • Computer Programming
  • Computer Vision
  • Computers
  • Data Mining
  • Data Processing
  • Information Processing
  • Information Science
  • Information Systems
  • Knowledge Management
  • Natural Language Processing
  • Ontologies
  • Operating Systems

Fields of Study

  • Computer science
  • Engineering

Readers

  • Computational Linguistics
  • Database Systems and Applications
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval