Actively Learning Specific Function Properties with Applications to Statistical Inference

Abstract

Active learning techniques have previously been shown to be extremely effective for learning a target function over an entire parameter space based on a limited set of observations. However, in many cases, only a specific property of the target function needs to be learned. For instance, when discovering the boundary of a region, such as the locations in which wireless network strength is above some operable level, we are interested in learning only a level-set of the target function. While techniques that learn the entire target function over the parameter space can be used to detect specific properties of the target function (e.g. level-sets), methods that learn only the required properties can be significantly more efficient, especially as the dimensionality of the parameter space increases. These active learning algorithms have a natural application in many statistical inference techniques. For example, given a set of data and a physical model of the data, which is a function of several variables, a scientist is often interested in determining the ranges of the variables that are statistically supported by the data. We show that many frequentist statistical inference techniques can be reduced to a level-set detection problem, or a similar search for a property of the target function, and hence benefit from active learning algorithms that target specific properties. Using these active learning algorithms significantly decreases the number of experiments required to accurately detect the boundaries of the desired 1 − α confidence regions. Moreover, since computing the model of the data given the input parameters may be expensive (either computationally or monetarily), such algorithms can facilitate analyses that were previously infeasible. We demonstrate the use of several statistical inference techniques combined with active learning algorithms on several cosmological data sets.
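
The level-set idea described in the abstract can be illustrated with a small sketch. The abstract does not specify the report's algorithms, so everything here is an assumption made for illustration: a one-dimensional target `sin(3x)`, a threshold of 0.5, a zero-mean Gaussian process surrogate with a fixed RBF kernel, and a straddle-style score (1.96 times the predictive standard deviation, minus the distance of the predictive mean from the threshold) for choosing the next point to evaluate. The function names are hypothetical.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP (assumed surrogate)."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_new, x_obs)
    mean = Ks @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def straddle_search(f, threshold, grid, n_init=3, n_iter=15):
    """Pick evaluation points that are both uncertain and near the threshold."""
    x_obs = np.linspace(grid[0], grid[-1], n_init)
    y_obs = f(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        # Straddle-style score: high where the surrogate is unsure
        # and its mean is close to the level of interest.
        score = 1.96 * std - np.abs(mean - threshold)
        # Never re-evaluate a point that has already been sampled.
        taken = np.isclose(grid[:, None], x_obs[None, :]).any(axis=1)
        score[taken] = -np.inf
        x_next = grid[np.argmax(score)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, f(x_next))
    mean, _ = gp_posterior(x_obs, y_obs, grid)
    return x_obs, mean

# Hypothetical target function and threshold, chosen only for the demo.
def target(x):
    return np.sin(3.0 * x)

grid = np.linspace(0.0, 2.0, 201)
threshold = 0.5
x_obs, mean = straddle_search(target, threshold, grid)

# Fraction of grid points classified on the correct side of the threshold.
agreement = np.mean((mean > threshold) == (target(grid) > threshold))
print(f"evaluations used: {len(x_obs)}, classification agreement: {agreement:.3f}")
```

The point of the sketch is the contrast drawn in the abstract: the search spends its evaluations near the threshold crossing rather than fitting the target everywhere, so the region above the threshold is located with far fewer evaluations than the 201 grid points.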

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 2007
Accession Number
ADA480995

Entities

People

  • Brent Bryan

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Energy and Power Technologies
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Bayesian Networks
  • Computational Fluid Dynamics
  • Computational Science
  • Data Mining
  • Data Science
  • Game Theory
  • Information Processing
  • Information Science
  • Machine Learning
  • Monte Carlo Method
  • Network Science
  • Neural Networks
  • Statistical Algorithms
  • Statistical Analysis
  • Statistical Inference
  • Surveys
  • Two Dimensional

Fields of Study

  • Computer science

Readers

  • Agent-Based Social Robotics and Mobile-Assisted Learning in Virtual Environments.
  • Computer Vision.
  • Statistical inference.

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks
  • Space
  • Space - Space Objects