When are Overcomplete Representations Identifiable? Uniqueness of Tensor Decompositions Under Expansion Constraints

Abstract

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime. While general overcomplete admixtures are not identifiable, we establish generic identifiability under a constraint, referred to as topic persistence. Our sufficient conditions for identifiability involve a novel set of expansion conditions on the population structure (i.e. the topic-word matrix) of the persistent topic model. Specifically, we require the existence of a perfect matching from hidden variables to higher order observed variables, and can thus, incorporate overcomplete models. In particular, we establish that random models are identifiable w.h.p. in the overcomplete regime. Moreover, our analysis allows for general (nondegenerate) distributions for modeling the topic proportions. Our proof techniques incorporate a novel class of tensor decompositions which fall in between the well-known candecomp/parafac (CP) and Tucker decompositions, and provide novel conditions for unique tensor decomposition.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: Jun 16, 2013
Accession Number: ADA604842

Entities

People

Anima Anandkumar
Daniel Hsu
Majid Janzamin
Sham Kakade

Organizations

University of California, Irvine

When are Overcomplete Representations Identifiable? Uniqueness of Tensor Decompositions Under Expansion Constraints

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Fields of Study

Readers