Assessing the Calibration of Naive Bayes Posterior Estimates
Abstract
In this paper, we give evidence that the posterior estimates of Naive Bayes go to zero or one exponentially with document length. While exponential change may be expected as new bits of information are added, adding new words does not always correspond to adding new information. Essentially as a result of the independence assumption Naive Bayes makes, the estimates grow too quickly. We investigate one parametric family that attempts to downweight the growth rate. The parameters of this family are estimated using a maximum likelihood scheme, and the results are evaluated.
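The saturation effect described in the abstract can be illustrated with a minimal sketch. It is not taken from the report: the two-class setup, the function name `posterior_positive`, and the per-word log-likelihood ratios in `LLR` are all hypothetical values chosen for illustration. Under the independence assumption, every word adds its log-likelihood ratio to the log-odds, so repeating the same text (which adds words but no new information) scales the log-odds linearly and pushes the posterior toward zero or one.

```python
import math

def posterior_positive(words, log_prior_odds, log_likelihood_ratio):
    """Two-class Naive Bayes posterior P(+ | words).

    Each word contributes its log-likelihood ratio additively, which is
    what the word-independence assumption implies.
    """
    log_odds = log_prior_odds + sum(log_likelihood_ratio[w] for w in words)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical per-word evidence: log P(w | +) - log P(w | -).
LLR = {"goal": 0.4, "match": 0.3, "vote": -0.5}

# A short document whose words sum to a net log-odds of about 0.6.
doc = ["goal", "match", "goal", "vote"]

# Repeating the document adds length but no new information, yet the
# posterior moves steadily toward 1.
for copies in (1, 2, 5, 10, 20):
    p = posterior_positive(doc * copies, 0.0, LLR)
    print(f"{copies * len(doc):3d} words -> P(+ | doc) = {p:.6f}")
```

Because the log-odds here is proportional to the number of copies, the odds P(+)/P(-) grow exponentially in document length, which is the behavior the abstract attributes to Naive Bayes. The parametric family the report studies for damping this growth is not specified in the abstract, so it is not sketched here.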
Document Details
- Document Type: Technical Report
- Publication Date: Sep 12, 2000
- Accession Number: ADA385120
Entities
People
- Paul N. Bennett
Organizations
- Carnegie Mellon University