A Preliminary Test for Structure in Large, High-Dimensional Data Sets

Abstract

We present a natural preliminary test for the presence of structure (nontrivial dependence) in a data set, and give some examples of its use. The procedure consists of sphering the data to remove correlations, then binning or discretizing the data, and finally, studying the cell counts in the resulting contingency table. If this procedure detects structure, we can then use more computationally intensive methods to determine the nature of this structure.
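The three steps named in the abstract (sphering, binning, and examining cell counts) can be illustrated with a minimal sketch. The abstract does not specify the binning scheme or the statistic applied to the cell counts, so everything concrete here is an assumption made for illustration: the median split of each coordinate, the Pearson-type statistic against equal expected counts, and the hypothetical function names `sphere`, `cell_counts`, and `pearson_statistic`. It is not the report's own implementation.

```python
import numpy as np

def sphere(x):
    """Center the data and rescale so the sample covariance is the identity."""
    centered = x - x.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Symmetric inverse square root of the sample covariance matrix.
    whitener = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return centered @ whitener

def cell_counts(z):
    """Assumed binning: split each coordinate at its median, giving 2**d cells."""
    d = z.shape[1]
    bits = (z > np.median(z, axis=0)).astype(int)
    # Encode each row's pattern of above/below-median flags as a cell index.
    cell_index = bits @ (2 ** np.arange(d))
    return np.bincount(cell_index, minlength=2 ** d)

def pearson_statistic(counts):
    """Assumed statistic: squared deviations from equal expected cell counts."""
    expected = counts.sum() / counts.size
    return float(((counts - expected) ** 2 / expected).sum())

rng = np.random.default_rng(0)

# Unstructured data: independent normal coordinates.
noise = rng.normal(size=(2000, 3))

# Structured data: two well-separated clusters (a simple form of dependence
# that remains after correlations are removed).
signs = rng.choice([-1.0, 1.0], size=(2000, 1))
clustered = rng.normal(size=(2000, 3)) + 3.0 * signs

for name, data in [("noise", noise), ("clustered", clustered)]:
    counts = cell_counts(sphere(data))
    print(name, counts, pearson_statistic(counts))
```

Cell counts that depart strongly from equality suggest dependence that sphering did not remove. Calibrating the statistic (for example, by simulation under a null model) is left open here, since the abstract does not describe how the report does it.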

Document Details

Document Type
Technical Report
Publication Date
Sep 05, 1991
Accession Number
ADA240832

Entities

People

  • Cheolyong Park
  • Fred W. Huffer

Organizations

  • Stanford University

Tags

DTIC Thesaurus Topics

  • Cell Count
  • Computer Programs
  • Computers
  • Coordinate Systems
  • Data Sets
  • Distribution Functions
  • Frequency
  • Military Research
  • Normal Distribution
  • Numbers
  • Observation
  • Random Variables
  • Sampling
  • Skewness
  • Statistical Sampling
  • Statistics
  • United States

Readers

  • Artificial Intelligence
  • Distributed Systems and Data Platform Development
  • Regression Analysis