Applications of Latent Variable Models in Modeling Influence and Decision Making

Abstract

The past 20 years have seen an avalanche of digital information that is overwhelming people in industry, government, and academia. This avalanche is two-sided: while the past decade has seen an onslaught of digitized records, as governments, publishers, and researchers race to make their records digital, the electronic and software tools for computationally analyzing this data have quickly evolved to face this challenge. Many of these challenges revolve around recurring patterns, including the presence of text, bits of information about pairs of items, and sequential observations. In this work we present several data-analysis methods that address these challenges by taking advantage of these recurring patterns. We begin with a method for identifying influential documents in a collection that evolves over time. We demonstrate that by encoding our assumptions about influential documents in a statistical model of the changes in textual themes, we are able to provide an alternative bibliometric whose results are consistent with, yet different from, traditional metrics of influence such as citation counts. We then introduce a model for measuring the relationships between pairs of countries over time. We demonstrate that this model is able to learn meaningful relationships between countries that are extraordinarily consistent across different human labels. We next address limitations in existing models of legislative voting. In one extension we predict legislators' votes by using the text of the bills they are voting on combined with individual legislators' past voting behavior. We then introduce a method for inferring these lawmakers' positions on specific issues. A recurring theme in the methods we present is that by using a small set of statistical primitives, we are able to apply known (or mildly adapted) methods to new problems. Several advances in statistical modeling over the past few decades make the development and discussion of our models easier.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2013
Accession Number
ADA583723

Entities

People

  • Sean M. Gerrish

Organizations

  • Princeton University

Tags

Communities of Interest

  • Autonomy
  • C4I
  • Ground and Sea Platforms
  • Human Systems
  • Weapons Technologies

DTIC Thesaurus Topics

  • Civil Rights
  • Cognitive Science
  • Computational Linguistics
  • Computational Science
  • Data Mining
  • Gaussian Distributions
  • Information Processing
  • Information Retrieval
  • Information Science
  • Machine Learning
  • Monte Carlo Method
  • Natural Language Processing
  • Network Science
  • Personnel Management
  • Probabilistic Models
  • Probability Distributions
  • Recreation

Fields of Study

  • Computer science

Readers

  • Computational Modeling and Simulation
  • Educational Psychology
  • Neural Network Machine Learning

Technology Areas

  • AI & ML
  • Microelectronics