Geometric k-nearest neighbor estimation of entropy and mutual information

Abstract

Nonparametric estimation of mutual information is used in a wide range of scientific problems to quantify dependence between variables. The k-nearest neighbor (knn) methods are consistent, and therefore expected to work well for large sample sizes. These methods use geometrically regular local volume elements. This practice allows maximum localization of the volume elements, but can also induce a bias due to a poor description of the local geometry of the underlying probability measure. We introduce a new class of knn estimators that we call geometric knn estimators (g-knn), which use more complex local volume elements to better model the local geometry of the probability measures. As an example of this class of estimators, we develop a g-knn estimator of entropy and mutual information based on elliptical volume elements, capturing the local stretching and compression common to a wide range of dynamical system attractors. A series of numerical examples in which the thickness of the underlying distribution and the sample sizes are varied suggests that local geometry is a source of problems for knn methods such as the Kraskov-Stögbauer-Grassberger estimator when local geometric effects cannot be removed by global preprocessing of the data. The g-knn method performs well despite the manipulation of the local geometry. In addition, the examples suggest that the g-knn estimators can be of particular relevance to applications in which the system is large, but the data size is limited.
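
For orientation, the kind of baseline the abstract contrasts against can be sketched in a few lines. The block below is a minimal illustration of the classical Kozachenko-Leonenko knn entropy estimate, which uses spherical (geometrically regular) volume elements; it is not the g-knn estimator of the paper, and it does not implement elliptical volume elements. The names knn_entropy and digamma_int, the default k=4, and the brute-force neighbor search are assumptions made for this sketch only.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy(points, k=4):
    """Kozachenko-Leonenko entropy estimate (in nats) with spherical volumes.

    points: sequence of equal-length coordinate tuples, assumed distinct.
    k: which nearest neighbor sets the radius of each local ball.
    """
    n = len(points)
    d = len(points[0])
    # Log-volume of the unit ball in d dimensions (Euclidean norm).
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    log_radius_sum = 0.0
    for i, p in enumerate(points):
        # Brute-force search: distances from p to every other sample.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        log_radius_sum += math.log(dists[k - 1])  # distance to k-th neighbor
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_radius_sum / n


if __name__ == "__main__":
    random.seed(1)
    sample = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(500)]
    print("knn estimate:", knn_entropy(sample, k=4))
    print("analytic value for a 2-D standard normal:", math.log(2 * math.pi * math.e))
```

Because every local ball has the same shape regardless of how the data are stretched nearby, a sketch like this is the setting in which the abstract expects bias to appear for thin distributions; the brute-force search is quadratic in the sample size and is only meant for small illustrative samples.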

Document Details

Document Type
Pub Defense Publication
Publication Date
Mar 01, 2018
Source ID
10.1063/1.5011683

Entities

People

  • Erik M. Bollt
  • Jie Sun
  • Warren M. Lord

Organizations

  • Army Research Office
  • Clarkson University
  • Office of Naval Research

Tags

Readers

  • Computer Networking
  • Finite Element Method (FEM) for solving Partial Differential Equations (PDEs)
  • Statistical inference