Antisampling for Estimation: An Overview

Abstract

We survey a new way to get quick estimates of the values of simple statistics (such as count, mean, standard deviation, maximum, median, and mode frequency) on a large data set. This approach is a comprehensive attempt (apparently the first) to estimate statistics without any sampling, by reasoning about various sets that contain a population of interest. Our antisampling techniques have connections to those of sampling (and have duals in many cases), but they have different advantages and disadvantages, making antisampling sometimes preferable to sampling, sometimes not. In particular, they can only be efficient when the data is in a computer, and they exploit computer science ideas such as production systems and database theory. Antisampling also requires the overhead of constructing an auxiliary structure, a database abstract. Tests on sample data show similar or better performance than simple random sampling. We also discuss more complex methods of sampling and their disadvantages.
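
As a rough illustration of what reasoning about sets that contain a population of interest can look like, the sketch below bounds statistics of a population known to lie inside two supersets whose statistics were computed in advance. It is not taken from the report: the names SetStats and bound_intersection, the choice of stored statistics, and all numbers are hypothetical, and the rules shown are only elementary set-containment bounds.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SetStats:
    """Precomputed statistics for one superset (a hypothetical abstract entry)."""
    count: int
    minimum: float
    maximum: float


def bound_intersection(a: SetStats, b: SetStats) -> dict:
    """Bound statistics of a population contained in both supersets a and b.

    No data items are read; only the stored superset statistics are used.
    """
    # Every member lies in both sets, so its value is at least the larger
    # minimum and at most the smaller maximum.
    low = max(a.minimum, b.minimum)
    high = min(a.maximum, b.maximum)
    # The population cannot be larger than the smaller superset.
    max_count = min(a.count, b.count)

    value_range: Optional[Tuple[float, float]]
    if low > high:
        # The value ranges do not overlap, so the population must be empty.
        return {"count": (0, 0), "value_range": None}
    value_range = (low, high)
    # For a nonempty population, the mean, median, minimum, and maximum
    # all fall inside value_range.
    return {"count": (0, max_count), "value_range": value_range}


# Hypothetical supersets with made-up statistics.
set_a = SetStats(count=500, minimum=20.0, maximum=300.0)
set_b = SetStats(count=120, minimum=150.0, maximum=400.0)
result = bound_intersection(set_a, set_b)
print(result)
```

With these made-up inputs the population has at most 120 members, and any mean or median must lie between 150.0 and 300.0. The bounds are loose but cost only a lookup of stored statistics, which is the kind of trade-off against sampling that the abstract describes.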

Document Details

Document Type
Technical Report
Publication Date
Oct 01, 1984
Accession Number
ADA148545

Entities

People

  • N. C. Rowe

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Ground and Sea Platforms

DTIC Thesaurus Topics

  • Computer Science
  • Computers
  • Data Analysis
  • Data Science
  • Data Sets
  • Databases
  • Information Science
  • Network Science
  • New York
  • Order Statistics
  • Probability
  • Probability Distributions
  • Statistical Algorithms
  • Statistical Analysis
  • Statistical Sampling
  • Statistics
  • Surveys

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Regression Analysis
  • Systems Analysis and Design