Fabricating Synthetic Data in Support of Training for Domestic Terrorist Activity Data Mining Research

Abstract

Data mining is a mature technology, widespread in both government and industry. The proliferation of data storage in public and private sectors has provided more information than can be expediently processed. Data mining provides a means to extract meaningful conclusions from this growing store of data. In the interests of countering criminal and terrorist activity, data mining has become a focus of law enforcement and government agencies. The use of databases containing information on persons may conflict with privacy rights and laws. Gathering public awareness of government data mining programs and databases has been accompanied with concern and investigation of these programs. Following a review of data mining and privacy issues, in 2008 the National Research Council (NRC) recommended any training in development of data mining programs involving personal data be conducted using synthesized data. This thesis seeks to present an underlying discussion of these issues, to include data mining use, a simple data synthesis model for analysis to support the validity of the NRC recommendation, and the associated difficulties encountered in the process. Included is an analysis of the inherent difficulty in creating realistic and useful data.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Sep 01, 2010
Accession Number
ADA531562

Entities

People

  • Stephen Lavelle

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Cyber
  • Energy and Power Technologies
  • Ground and Sea Platforms
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Biometric Security
  • Communication Systems
  • Computational Science
  • Computer Programming
  • Computer Programs
  • Computers
  • Data Analysis
  • Data Mining
  • Databases
  • Electronic Mail
  • Governments
  • Information Systems
  • International Trade
  • Law
  • National Security
  • Network Science
  • United States

Readers

  • Database Systems and Applications
  • Government and Public Administration Law.
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - DoD AI Strategy