Continuous Time Dynamic Topic Models

Abstract

In this paper, we develop the continuous time dynamic topic model (cDTM). The cDTM is a dynamic topic model that uses Brownian motion to model the latent topics through a sequential collection of documents, where a "topic" is a pattern of word use that we expect to evolve over the course of the collection. We derive an efficient variational approximate inference algorithm that takes advantage of the sparsity of observations in text, a property that lets us easily handle many time points. In contrast to the cDTM, the original discrete-time dynamic topic model (dDTM) requires that time be discretized. Moreover, the complexity of variational inference for the dDTM grows quickly as time granularity increases, a drawback that limits fine-grained discretization. We demonstrate the cDTM on two news corpora, reporting both predictive perplexity and results on the novel task of time stamp prediction.
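
The role of Brownian motion in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the report's inference algorithm: it assumes each word's natural parameter in a topic drifts as independent Brownian motion, so the change over a time gap is Gaussian with variance proportional to the gap, and that the word distribution is obtained by a softmax over those parameters. The function names, the `variance` parameter, and the standard-normal initialization are hypothetical choices made for illustration.

```python
import math
import random


def softmax(xs):
    """Map real-valued parameters to a probability distribution."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_topic(timestamps, vocab_size, variance, seed=0):
    """Simulate one topic's word distribution at arbitrary time stamps.

    Assumption (illustrative): the increment of each parameter over a
    gap dt is Gaussian with mean 0 and variance `variance * dt`, so no
    fixed time grid is needed.
    """
    rng = random.Random(seed)
    # Hypothetical initial state: standard-normal parameters.
    beta = [rng.gauss(0.0, 1.0) for _ in range(vocab_size)]
    prev_t = timestamps[0]
    trajectory = []
    for t in timestamps:
        dt = t - prev_t
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        std = math.sqrt(variance * dt)
        # Larger gaps between documents allow larger drift.
        beta = [b + rng.gauss(0.0, std) for b in beta]
        trajectory.append(softmax(beta))
        prev_t = t
    return trajectory


if __name__ == "__main__":
    # Irregularly spaced time stamps, e.g. document arrival times in days.
    for dist in simulate_topic([0.0, 0.5, 0.5, 3.0], vocab_size=5, variance=0.1):
        print([round(p, 3) for p in dist])
```

Because the drift depends only on the elapsed time between consecutive observations, the time stamps can be irregular and arbitrarily fine, which is the contrast with a discretized model that the abstract draws.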

Document Details

Document Type: Technical Report
Publication Date: Jun 20, 2008
Accession Number: ADA633298

Entities

People

  • Chong Wang
  • David Heckerman
  • David M. Blei

Organizations

  • Princeton University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Bayesian Networks
  • Brownian Motion
  • Computational Complexity
  • Computational Science
  • Computer Science
  • Data Science
  • Data Sets
  • Filtration
  • Information Retrieval
  • Information Science
  • Kalman Filtering
  • Kalman Filters
  • Machine Learning
  • Natural Language Processing
  • Observation
  • Probability

Fields of Study

  • Computer science

Readers

  • Educational Psychology
  • Neural Network Machine Learning
  • Statistical inference

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Information Retrieval
  • AI & ML - Machine Learning Algorithms