Topic Similarity Networks: Visual Analytics for Large Document Sets

Abstract

We investigate ways to improve the interpretability of LDA topic models by better analyzing and visualizing their outputs. We focus on what we refer to as topic similarity networks: graphs in which nodes represent latent topics in text collections and links represent similarity among topics. We describe efficient and effective approaches to both building and labeling such networks. Visualizations of topic models based on these networks are shown to be a powerful means of exploring, characterizing, and summarizing large collections of unstructured text documents. They help to tease out non-obvious connections among different sets of documents and provide insight into how topics form larger themes. We demonstrate the efficacy and practicality of these approaches through two case studies: 1) NSF grants for basic research spanning a 14-year period and 2) the entire English portion of Wikipedia.
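
To make the idea of a topic similarity network concrete, here is a minimal sketch in Python. It is not the report's method: the abstract does not say which similarity measure or linking rule is used, so cosine similarity and a fixed threshold are assumptions, as are the function names `cosine` and `build_topic_network` and the toy topic-word distributions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_topic_network(topic_word, threshold=0.5):
    """Link every pair of topics whose similarity meets the threshold.

    topic_word maps a topic name to its word distribution (one weight
    per vocabulary word). Returns {(topic_a, topic_b): similarity}.
    The threshold rule is an assumption made for illustration.
    """
    edges = {}
    for (name_a, dist_a), (name_b, dist_b) in combinations(topic_word.items(), 2):
        sim = cosine(dist_a, dist_b)
        if sim >= threshold:
            edges[(name_a, name_b)] = sim
    return edges


# Toy distributions over a four-word vocabulary (made-up values,
# not output from any real LDA model).
toy_topics = {
    "topic_0": [0.70, 0.20, 0.05, 0.05],
    "topic_1": [0.60, 0.30, 0.05, 0.05],
    "topic_2": [0.05, 0.05, 0.20, 0.70],
}

network = build_topic_network(toy_topics, threshold=0.5)
for (a, b), sim in network.items():
    print(f"{a} -- {b}  (similarity {sim:.2f})")
```

In this toy input only topic_0 and topic_1 put most of their weight on the same words, so they are the only pair linked; topic_2 remains an isolated node.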

Document Details

Document Type: Technical Report
Publication Date: Jul 14, 2014
Accession Number: AD1123718

Entities

People

Arun S. Maiya
Robert M. Rolfe

Organizations

Institute for Defense Analyses
