Enforced Sparse Non-Negative Matrix Factorization

Abstract

Non-negative matrix factorization (NMF) is a dimensionality reduction algorithm for data that can be represented as an undirected bipartite graph. It has become a common method for generating topic models of text data because it is known to produce good results, despite its relative simplicity of implementation and ease of computation. One challenge with applying the NMF to large datasets is that intermediate matrix products often become dense, thus stressing the memory and compute elements of the underlying system. In this article, we investigate a simple but powerful modification of the alternating least squares method of determining the NMF of a sparse matrix that enforces the generation of sparse intermediate and output matrices. This method enables the application of NMF to large datasets through improved memory and compute performance. Further, we demonstrate, empirically, that this method of enforcing sparsity in the NMF either preserves or improves both the accuracy of the resulting topic model and the convergence rate of the underlying algorithm.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jan 23, 2016
Accession Number
AD1033864

Entities

People

  • Brendan Gavin
  • Jeremy Kepner
  • Vijay Gadepally

Organizations

  • MIT Lincoln Laboratory

Tags

Communities of Interest

  • Biomedical

DTIC Thesaurus Topics

  • Abstracts
  • Accuracy
  • Algorithms
  • Artificial Intelligence Software
  • Bayesian Networks
  • Clustering
  • Computational Science
  • Computations
  • Data Sets
  • Governments
  • Heuristic Methods
  • Machine Learning
  • Natural Language Processing
  • Sparse Matrix
  • Text Mining
  • United States
  • United States Government

Fields of Study

  • Computer science

Readers

  • Graph Algorithms and Convex Optimization.
  • Neural Network Machine Learning.