The Application of Information Mining Technology to the Army Injury and Health Outcomes Database

Abstract

In the first half of the grant period we faced many challenges, principally administrative in nature and related mostly to delays in procuring equipment and software and in awarding contracts. A significant amount of training was accomplished, and while the learning curve was steep, several individuals received sufficient basic instruction to at least initiate the various processes needed to accomplish the objectives of the proposal. In addition, a special on-site block of instruction was arranged with SAS Institute. This instruction, known as a "pilot," was divided into two parts: a Warehouse Administrator (WA) pilot and an Enterprise Miner (EM) pilot. Each pilot consisted of 5 full days of training and hands-on assistance. The WA pilot was designed to build the TAIHOD data warehouse and to provide instruction to the designated TAIHOD Warehouse Administrator (Ms. Yore). The EM pilot was designed to walk several members of the TAIHOD staff (Ms. Yore, Mr. Schneider, and LTC Amoroso) through a data mining exercise using data from the warehouse in the TAIHOD environment. To get the full benefit of the mining pilot, it was necessary to select a topic suitable for demonstrating the various idiosyncrasies and capabilities of the software. We chose a practical, though arguably esoteric, exercise to examine data quality. Specifically, we attempted to identify patterns of gender misclassification across the various components of the database. This exercise had the potential utility of allowing us to find patterns of error in the database that, if corrected, could greatly improve the precision of certain analyses where there is ambiguity in the process of matching records. It also involved the use of multiple components of the database, including text (names). Several documents attached to this report provide substantial detail on this effort, which is still very much in progress.

Document Details

Document Type
Technical Report
Publication Date
Oct 01, 2001
Accession Number
ADA395566

Entities

People

  • Paul J. Amoroso

Organizations

  • United States Army Research Institute of Environmental Medicine

Tags

Communities of Interest

  • Biomedical
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Army Personnel
  • Computer Programming
  • Computer Science
  • Computers
  • Data Mining
  • Data Sets
  • Databases
  • Enlisted Personnel
  • Information Retrieval
  • Information Science
  • Knowledge Management
  • Military Operations
  • Network Science
  • Predictive Modeling
  • Statistics
  • Surveys

Readers

  • Distributed Systems and Data Platform Development
  • Instructional Design and Training Evaluation
  • Systems Analysis and Design

Technology Areas

  • AI & ML