The Harmonic Sieve: A Novel Application of Fourier Analysis to Machine Learning Theory and Practice.

Abstract

This thesis presents new positive results--both theoretical and empirical--in machine learning. The primary learning-theoretic contribution is the Harmonic Sieve, the first efficient algorithm for learning the well-studied class of Disjunctive Normal Form (DNF) expressions (learning is accomplished within the Probably Approximately Correct model with respect to the uniform distribution using membership queries). Of particular interest is the novel use of Fourier methods within the algorithm. Specifically, all prior Fourier-based learning algorithms focused on finding large Fourier coefficients of the function to be learned (the target). The Harmonic Sieve departs from this paradigm; it instead learns by finding large coefficients of certain functions other than the target. The robustness of this new Fourier technique is illustrated by applying it to prove learnability of noisy DNF expressions, of a circuit class that is even more expressive than DNF, and of an interesting class of geometric concepts. Empirically, the thesis demonstrates the significant practical potential of a classification-learning algorithm closely related to the Harmonic Sieve. The Boosting-based Perceptron (BBP) learning algorithm produces classifiers that are nonlinear perceptrons (weighted thresholds over higher-order features). On several previously-studied machine learning benchmarks, the BBP algorithm produces classifiers that achieve accuracies essentially equivalent to or even better than the best previously-reported classifiers. Additionally, the perceptrons produced by the BBP algorithm tend to be relatively intelligible, an important feature in many machine learning applications. In a related vein, BBP and the Harmonic Sieve are applied successfully to the problem of rule extraction, that is, the problem of approximating an unintelligible classifier by a more intelligible function.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Aug 23, 1995
Accession Number
ADA303368

Entities

People

  • Jeffrey C. Jackson

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Ground and Sea Platforms
  • Human Systems

DTIC Thesaurus Topics

  • Algorithms
  • Amino Acids
  • Artificial Intelligence
  • Computational Science
  • Computer Programming
  • Computer Science
  • Data Sets
  • Fourier Analysis
  • Information Processing
  • Information Science
  • Machine Learning
  • Neural Networks
  • Probability
  • Probability Distributions
  • Random Variables
  • Recognition
  • Test Sets

Fields of Study

  • Computer science

Readers

  • Approximation Theory.
  • Neural Network Machine Learning.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks