Novel Topic Authorship Attribution

Abstract

Using statistical models to predict authorship (so-called author-attribution models) is a long-established practice. Several recent authorship attribution studies have indicated that topic-specific cues influence author-attribution machine learning models. An authorship attribution evaluation methodology should anticipate the arrival of new topics rather than ignore it; a model that relies heavily on topic cues will be unreliable in deployment settings where novel topics are common. To handle novel topics, we create author and topic vectors and attempt to project out the topic influences from each document. Although our experiments did not validate our assumptions, they do point to a possible problem with a common assumption in authorship attribution research.
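
The abstract does not say how the author and topic vectors are built or which projection is used. As a rough illustration only, "projecting out" a topic can be read as removing the component of a document vector that lies along a topic vector. The sketch below assumes a single topic vector and an ordinary orthogonal projection; the names `dot`, `project_out`, `doc_vec`, and `topic_vec` are hypothetical and do not come from the report.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_out(doc, topic):
    """Return `doc` with its component along `topic` removed.

    Assumption: topic influence is modelled as one direction in the
    feature space, and it is removed by orthogonal projection.
    """
    norm_sq = dot(topic, topic)
    if norm_sq == 0.0:
        # A zero topic vector defines no direction; leave the document as is.
        return list(doc)
    scale = dot(doc, topic) / norm_sq
    return [d - scale * t for d, t in zip(doc, topic)]


# Toy feature vectors (hypothetical values, not data from the report).
doc_vec = [3.0, 4.0, 0.0]
topic_vec = [1.0, 0.0, 0.0]

residual = project_out(doc_vec, topic_vec)
print(residual)
```

After the projection, the residual vector is orthogonal to the topic vector, so a classifier trained on residuals cannot use that one direction as a cue. Whether this removes topic influence in practice is exactly the kind of assumption the report says its experiments did not validate.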

Document Details

Document Type
Technical Report
Publication Date
Mar 01, 2011
Accession Number
ADA543929

Entities

People

  • Randale J. Honaker

Organizations

  • Naval Postgraduate School

Tags

Communities of Interest

  • Autonomy
  • Ground and Sea Platforms

DTIC Thesaurus Topics

  • Applied Mathematics
  • Data Sets
  • Dimensionality Reduction
  • Electronic Mail
  • Factor Analysis
  • Information Science
  • Linear Algebra
  • Machine Learning
  • Natural Language Processing
  • Network Science
  • Online Communications
  • Probability
  • Probability Distributions
  • Statistical Analysis
  • Supervised Machine Learning
  • Test Sets
  • Vector Spaces

Fields of Study

  • Computer science

Readers

  • Business Analytics
  • Manufacturing Engineering
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation