Scalable Topic Modeling: Online Learning, Diagnostics, and Recommendation

Abstract

The main activity of my research group is to build and develop the probabilistic pipeline. When solving problems with data, we take the following steps.

1. We make assumptions about our data, embedding it in a probability model containing hidden and observed random variables.
2. Given observations, we use inference algorithms to estimate the conditional distribution of the hidden variables. This is the central statistical and computational problem.
3. With the results of inference, we use our model to form predictions about the future, explore the data, or otherwise apply what we learned to solve a problem.
4. We criticize our model, understand where it went right and wrong, and repeat the process to revise it.
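The four steps above can be illustrated with a minimal sketch. It is not taken from the report: the Beta-Bernoulli model, the function names, the toy data, and the "number of switches" test statistic are all assumptions chosen only to make the model, infer, predict, criticize loop concrete on a problem small enough to solve in closed form.

```python
import random


def infer_posterior(data, a=1.0, b=1.0):
    """Steps 1-2: assume theta ~ Beta(a, b) and x_i ~ Bernoulli(theta),
    then return the parameters of the conjugate Beta posterior."""
    successes = sum(data)
    return a + successes, b + len(data) - successes


def predict_next(a_post, b_post):
    """Step 3: posterior predictive probability that the next x is 1."""
    return a_post / (a_post + b_post)


def count_switches(seq):
    """Test statistic for criticism: adjacent positions that differ."""
    return sum(1 for x, y in zip(seq, seq[1:]) if x != y)


def posterior_predictive_pvalue(data, a_post, b_post, n_rep=2000, seed=0):
    """Step 4: simulate replicated datasets from the fitted model and
    report how often they show as few switches as the observed data."""
    rng = random.Random(seed)
    observed = count_switches(data)
    hits = 0
    for _ in range(n_rep):
        theta = rng.betavariate(a_post, b_post)  # draw from the posterior
        rep = [1 if rng.random() < theta else 0 for _ in data]
        if count_switches(rep) <= observed:
            hits += 1
    return hits / n_rep


# Hypothetical toy data: ten 1s followed by ten 0s. The streak violates
# the independence assumption built into the model.
data = [1] * 10 + [0] * 10

a_post, b_post = infer_posterior(data)
p_next = predict_next(a_post, b_post)
p_value = posterior_predictive_pvalue(data, a_post, b_post)

print("posterior: Beta(%.0f, %.0f)" % (a_post, b_post))
print("P(next = 1):", p_next)
print("posterior predictive p-value for switches:", p_value)
```

In this sketch a very small p-value signals that the fitted model almost never reproduces the streakiness of the observed sequence, which is the kind of finding that would send the analyst back to step 1 to revise the model, for example by allowing dependence between neighboring observations.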

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2017
Accession Number
AD1028715

Entities

People

  • David M. Blei

Organizations

  • Columbia University

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Artificial Intelligence Software
  • Bayesian Networks
  • Data Analysis
  • Data Mining
  • Data Science
  • Deep Learning
  • Distance Learning
  • Factor Analysis
  • Information Processing
  • Information Science
  • Information Systems
  • Machine Learning
  • Probability
  • Social Media
  • Statistics

Fields of Study

  • Computer science

Readers

  • Computational Fluid Dynamics (CFD)
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance
  • Theoretical Analysis

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference