Focusing on the Data in Data Mining: Lessons From Recent Experience

Abstract

The use of data mining is growing rapidly. The number of data mining consultants and the number of commercial tools available to the "non-expert" user are also increasing quickly. It is becoming easier than ever to collect datasets and apply data mining tools to them. As more and more non-experts seek to exploit this technology to help with their business, it becomes increasingly important that they understand the underlying assumptions and biases of these tools. There are a number of factors to consider before applying data mining to a database. In particular, there are important issues regarding the data which should be examined before proceeding with the data mining process. While these issues may be well known to the data mining expert, the non-expert is often unaware of their importance. In this paper, we focus on three specific issues and illustrate each with examples taken from our recent experiences. For each issue, we provide insight into how it might be problematic and suggest techniques for approaching such situations.

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 1997
Accession Number
AD1108108

Entities

People

  • Earl Harris
  • Eric Bloedorn
  • Neal J. Rothleder

Organizations

  • MITRE Corporation

Tags

Communities of Interest

  • Air Platforms
  • Autonomy
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Accident Investigations
  • Accidents
  • Aircrafts
  • Airplanes
  • Algorithms
  • Artificial Intelligence
  • Aviation Safety
  • Classification
  • Computer Languages
  • Computer Science
  • Data Mining
  • Databases
  • Inspection
  • Law Enforcement
  • Machine Learning
  • Natural Languages
  • Supervised Machine Learning
  • Unsupervised Machine Learning
  • Vector Spaces
  • Vehicles

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Economics
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy