An Improved Algorithm for Unsupervised Decomposition of a Multi-Author Document

Abstract

This paper addresses the problem of unsupervised decomposition of a multi-author text document: identifying the sentences written by each author when the number of authors is unknown. An approach, BayesAD, is developed for solving this problem: apply a Bayesian segmentation algorithm, followed by a segment-clustering algorithm. Results are presented from an empirical comparison between BayesAD and AK, a modified version of an approach published by Akiva and Koppel in 2013. BayesAD exhibited greater accuracy than AK in all experiments. However, BayesAD has a parameter that must be set, and its value had a non-trivial impact on accuracy. Developing an effective method for eliminating this need would be a fruitful direction for future work. When controlling for topic, the accuracies of BayesAD and AK were, in all but one case, worse than that of a baseline approach in which one author was assumed to have written all sentences in the input text document. Hence, room for improved solutions exists.
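To make the single-author baseline concrete, the following is a minimal sketch of how such a comparison could be scored. It is not taken from the report: the abstract does not define the accuracy measure, so the sketch assumes accuracy is the fraction of sentences labeled correctly under the best one-to-one mapping from predicted clusters to true authors. The function names `best_mapping_accuracy` and `single_author_baseline` and the toy labels are hypothetical.

```python
from itertools import permutations


def best_mapping_accuracy(true_authors, predicted):
    """Fraction of sentences labeled correctly under the best one-to-one
    mapping from predicted cluster ids to true author names.

    Assumed metric for illustration only; brute force, so suitable
    only for a handful of clusters. Empty input is not handled.
    """
    authors = sorted(set(true_authors))
    clusters = sorted(set(predicted))
    # Pad with None so that surplus clusters can be left unmatched.
    padded = authors + [None] * max(0, len(clusters) - len(authors))
    best = 0
    for perm in permutations(padded, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(mapping[p] == t for t, p in zip(true_authors, predicted))
        best = max(best, correct)
    return best / len(true_authors)


def single_author_baseline(true_authors):
    """Accuracy obtained by attributing every sentence to one author."""
    return best_mapping_accuracy(true_authors, [0] * len(true_authors))


# Toy example: five sentences, two true authors (hypothetical data).
truth = ["A", "A", "A", "B", "B"]
clusters = [0, 0, 1, 1, 1]  # hypothetical output of a decomposition method
baseline_acc = single_author_baseline(truth)
method_acc = best_mapping_accuracy(truth, clusters)
```

Under this assumed metric, the baseline score equals the share of sentences written by the most prolific author, so it becomes harder to beat as one author dominates the document.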

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2014
Accession Number: AD1107700

Entities

People

  • Chris Giannella

Organizations

  • MITRE Corporation

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Neural Networks
  • Bayesian Networks
  • Computational Linguistics
  • Computational Science
  • Dynamic Programming
  • Eigenvectors
  • Generative Models
  • Image Processing
  • Image Segmentation
  • Information Processing
  • Information Retrieval
  • Information Science
  • Information Systems
  • Language
  • Linguistics
  • Machine Learning
  • Natural Language Processing
  • Natural Languages
  • Neural Networks
  • Probability
  • Probability Distributions
  • Standards

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design
  • Computational Linguistics
  • Computer Vision

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Machine Translation