Optimal Mixture Models in IR

Abstract

We explore the use of Optimal Mixture Models to represent topics. We analyze two broad classes of mixture models: set-based and weighted. We provide an original proof that estimating set-based models is NP-hard and therefore not feasible. We argue that weighted models are superior to set-based models and that their solution can be estimated by a simple gradient descent technique. We demonstrate that Optimal Mixture Models can be successfully applied to the task of document retrieval. Our experiments show that weighted mixtures outperform a simple language modeling baseline. We also observe that weighted mixtures are more robust than other approaches to estimating topical models.
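
As a rough illustration of what estimating a weighted mixture by gradient descent might look like, the sketch below fits mixture weights over a fixed set of component word distributions so that the mixture approximates a target distribution. This record does not state the report's objective function or update rule, so everything here is an assumption made for illustration: the cross-entropy objective, the softmax parametrization that keeps the weights on the probability simplex, and the names `fit_mixture_weights`, `cross_entropy`, `steps`, and `lr` are all hypothetical. Component distributions are assumed to be smoothed, so every word has nonzero probability under the mixture.

```python
import math


def softmax(z):
    """Map unconstrained scores to positive weights that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, components, weights):
    """Cross-entropy between the target and the weighted mixture."""
    loss = 0.0
    for w, t in enumerate(target):
        if t > 0.0:
            mix = sum(lam * comp[w] for lam, comp in zip(weights, components))
            loss -= t * math.log(mix)
    return loss


def fit_mixture_weights(target, components, steps=5000, lr=0.5):
    """Plain gradient descent on softmax scores (illustrative assumption,
    not necessarily the procedure used in the report)."""
    z = [0.0] * len(components)  # start from uniform weights
    for _ in range(steps):
        lam = softmax(z)
        # Gradient of the cross-entropy with respect to each weight.
        grad_lam = [0.0] * len(components)
        for w, t in enumerate(target):
            if t == 0.0:
                continue
            mix = sum(l * c[w] for l, c in zip(lam, components))
            for k, c in enumerate(components):
                grad_lam[k] -= t * c[w] / mix
        # Chain rule through the softmax: dL/dz_j = lam_j * (g_j - sum_k lam_k g_k).
        avg = sum(l * g for l, g in zip(lam, grad_lam))
        z = [zj - lr * lj * (gj - avg) for zj, lj, gj in zip(z, lam, grad_lam)]
    return softmax(z)


if __name__ == "__main__":
    # Three toy component distributions over a three-word vocabulary.
    components = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7], [0.3, 0.4, 0.3]]
    # Target built as a known mixture of the components.
    true_weights = [0.5, 0.3, 0.2]
    target = [sum(w * c[i] for w, c in zip(true_weights, components)) for i in range(3)]
    print(fit_mixture_weights(target, components))
```

The softmax parametrization is one simple way to keep the weights nonnegative and normalized without a projection step; a projected-gradient or exponentiated-gradient update would be an equally plausible choice.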

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2005
Accession Number: ADA440363

Entities

People

Victor Lavrenko

Organizations

University of Massachusetts Amherst
