Coverage Adjusted Entropy Estimation
Abstract
Data on "neural coding" have frequently been analyzed using information-theoretic measures. These formulations involve the fundamental and generally difficult statistical problem of estimating entropy. We briefly review several methods that have been advanced to estimate entropy and highlight one, the coverage-adjusted entropy estimator (CAE), due to Chao and Shen, that appeared recently in the environmental statistics literature. This method begins with the elementary Horvitz-Thompson estimator, developed for sampling from a finite population, and adjusts for the potential new species that have not yet been observed in the sample; in a spike train, these correspond to the patterns or "words" that have not yet been observed. The adjustment is due to I.J. Good and is called the Good-Turing coverage estimate. We provide a new empirical regularization derivation of the coverage-adjusted probability estimator, which shrinks the MLE. We prove that the CAE is consistent and first-order optimal, with rate O_P(1/log n), in the class of distributions with finite entropy variance, and that, within the class of distributions with finite qth moment of the log-likelihood, the Good-Turing coverage estimate and the total probability of unobserved words converge at rate O_P(1/(log n)^q). We then provide a simulation study of the estimator with standard distributions and with examples from neuronal data, where observations are dependent. The results show that, with a minor modification, the CAE performs much better than the MLE and is better than the Best Upper Bound estimator, due to Paninski, when the number of possible words m is unknown or infinite.
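The construction described in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly cited form of the Chao-Shen estimator: the Good-Turing coverage estimate is taken as C = 1 - f1/n, where f1 is the number of words seen exactly once in n observations; the MLE probabilities are shrunk by C; and each term of the entropy sum is divided by the Horvitz-Thompson inclusion probability 1 - (1 - p)^n. The function names (good_turing_coverage, mle_entropy, coverage_adjusted_entropy) are hypothetical and chosen for illustration only. The "minor modification" mentioned in the abstract is not specified there and is not reproduced here, and the sketch treats the observations as a plain list of words without addressing dependence.

```python
import math
from collections import Counter


def good_turing_coverage(counts):
    """Good-Turing coverage estimate C = 1 - f1/n (assumed standard form)."""
    n = sum(counts.values())
    f1 = sum(1 for c in counts.values() if c == 1)  # words seen exactly once
    return 1.0 - f1 / n


def mle_entropy(samples):
    """Plug-in (MLE) entropy in nats, for comparison."""
    counts = Counter(samples)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def coverage_adjusted_entropy(samples):
    """Coverage-adjusted entropy estimate in nats (illustrative sketch)."""
    counts = Counter(samples)
    n = sum(counts.values())
    f1 = sum(1 for c in counts.values() if c == 1)
    if n == 0 or f1 == n:
        # Estimated coverage is zero (or there is no data), so the adjusted
        # probabilities vanish. This sketch declines to return an estimate.
        raise ValueError("coverage estimate is zero; estimator undefined")
    coverage = good_turing_coverage(counts)
    h = 0.0
    for c in counts.values():
        p = coverage * c / n               # MLE shrunk by estimated coverage
        inclusion = 1.0 - (1.0 - p) ** n   # chance the word appears in n draws
        h -= p * math.log(p) / inclusion   # Horvitz-Thompson weighted term
    return h
```

For example, coverage_adjusted_entropy(["a", "a", "b", "b"]) has no singletons, so the coverage estimate is 1 and only the Horvitz-Thompson weighting changes the result relative to mle_entropy on the same data.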
Document Details
- Document Type: Technical Report
- Publication Date: Jun 05, 2007
- Accession Number: ADA472999
Entities
People
- Bin Yu
- Robert E. Kass
- Vincent Q. Vu
Organizations
- University of California, Berkeley