On the Computation of the Relative Entropy of Probabilistic Automata

Abstract

We present an exhaustive analysis of the problem of computing the relative entropy of two probabilistic automata. We show that the problem of computing the relative entropy of unambiguous probabilistic automata can be formulated as a shortest-distance problem over an appropriate semiring, give efficient exact and approximate algorithms for its computation in that case, and report the results of experiments demonstrating the practicality of our algorithms for very large weighted automata. We also prove that the computation of the relative entropy of arbitrary probabilistic automata is PSPACE-complete. The relative entropy is used in a variety of machine learning algorithms and applications to measure the discrepancy of two distributions. We examine the use of the symmetrized relative entropy in machine learning algorithms and show that, contrarily to what is suggested by a number of publications, the symmetrized relative entropy is neither positive definite symmetric nor negative definite symmetric, which limits its use and application in kernel methods. In particular, the convergence of training for learning algorithms is not guaranteed when the symmetrized relative entropy is used directly as a kernel, or as the operand of an exponential as in the case of Gaussian Kernels. Finally, we show that our algorithm for the computation of the entropy of an unambiguous probabilistic automaton can be generalized to the computation of the norm of an unambiguous probabilistic automaton by using a monoid morphism. In particular, this yields efficient algorithms for the computation of the L(sub p)-norm of a probabilistic automaton.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jan 17, 2007
Accession Number
ADA606160

Entities

People

  • Ashish Rastogi
  • Corinna Cortes
  • Mehryar Mohri
  • Michael Riles

Organizations

  • New York University

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Automata Theory
  • Computational Science
  • Computer Languages
  • Computer Science
  • Convergence
  • Formal Languages
  • Information Processing
  • Language
  • Machine Learning
  • Machines
  • Markov Models
  • Natural Language Processing
  • Probabilistic Models
  • Probability
  • Probability Distributions
  • Supervised Machine Learning

Fields of Study

  • Computer science

Readers

  • Approximation Theory.
  • Graph Algorithms and Convex Optimization.

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms