Scalable Topic Modeling: Online Learning, Diagnostics, and Recommendation

Abstract

The main activity of my research group is to build and develop the probabilistic pipeline. When solving problems with data, we take the following steps. 1. We make assumptions about our data, embedding it in a probability model containing hidden and observed random variables. 2. Given observations, we use inference algorithms to estimate the conditional distribution of the hidden variables. This is the central statistical and computational problem. 3. With the results of inference, we use our model to form predictions about the future, explore the data, or otherwise apply what we learned to solve a problem. 4. We criticize our model, understand where it went right and wrong, and repeat the process to revise it.

Open PDF

Document Details

Document Type: Technical Report
Publication Date: Mar 01, 2017
Accession Number: AD1028715

Entities

People

David M. Blei

Organizations

Columbia University

Scalable Topic Modeling: Online Learning, Diagnostics, and Recommendation

Abstract

Document Details

Entities

People

Organizations

Tags

Communities of Interest

DTIC Thesaurus Topics

Fields of Study

Readers

Technology Areas