Assessing the Calibration of Naive Bayes Posterior Estimates

Abstract

In this paper, we give evidence that the posterior estimates of Naive Bayes approach zero or one exponentially fast as document length increases. While exponential change may be expected as new bits of information are added, adding new words does not always correspond to adding new information. Largely as a result of the independence assumption, the estimates grow too quickly. We investigate one parametric family that attempts to downweight the growth rate. The parameters of this family are estimated using a maximum likelihood scheme, and the results are evaluated.
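
The effect described in the abstract can be illustrated with a small sketch. It is not taken from the report: the per-word likelihood ratios in `LOG_RATIO` are made up, and `damped_posterior` with its single scale parameter `alpha` is a hypothetical stand-in for a downweighting scheme, not necessarily the parametric family the report investigates. The sketch repeats the same three words `k` times, which adds no new information, yet the Naive Bayes odds are multiplied by the same factor on every repetition.

```python
import math

def nb_log_odds(doc, log_ratio, prior_log_odds=0.0):
    """Naive Bayes log-odds of class 1 vs. class 0: prior plus one term per token."""
    return prior_log_odds + sum(log_ratio.get(w, 0.0) for w in doc)

def posterior(log_odds):
    """Logistic function, written to avoid overflow for large |log_odds|."""
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

def damped_posterior(doc, log_ratio, alpha):
    """Hypothetical damping (an assumption): scale the summed evidence by alpha in (0, 1]."""
    return posterior(alpha * nb_log_odds(doc, log_ratio))

# Made-up per-word log-likelihood ratios log P(w | c1) / P(w | c0).
LOG_RATIO = {"goal": math.log(3.0), "match": math.log(2.0), "vote": math.log(0.5)}

base = ["goal", "match", "vote"]   # net odds for one copy: 3 * 2 * 0.5 = 3
for k in (1, 2, 4, 8):
    doc = base * k                 # repeated text, no new information
    p = posterior(nb_log_odds(doc, LOG_RATIO))
    p_damped = damped_posterior(doc, LOG_RATIO, alpha=0.25)
    print(f"copies={k} words={len(doc)} posterior={p:.6f} damped={p_damped:.6f}")
```

With these numbers the odds after `k` copies are 3^k, so the undamped posterior moves from about 0.75 to 0.9 to 0.988 and the gap to one shrinks exponentially in document length. A maximum likelihood fit of `alpha` on held-out data would be one way to choose the damping, but that step is omitted here.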

Document Details

Document Type
Technical Report
Publication Date
Sep 12, 2000
Accession Number
ADA385120

Entities

People

  • Paul N. Bennett

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Calibration
  • Classification
  • Computer Languages
  • Computer Science
  • Data Sets
  • Dimensionality Reduction
  • Equations
  • Feature Selection
  • Information Science
  • Learning
  • Machine Learning
  • Probability
  • Reliability
  • Supervised Machine Learning
  • Test And Evaluation
  • Training
  • Universities

Fields of Study

  • Mathematics

Readers

  • Computational Modeling and Simulation
  • Statistical Inference