Individual Profiling Using Text Analysis

Abstract

Author profiling is the task of determining the attributes of a set of authors. This report presents the design and results of our approach to using data from the PAN 2015 Author Profiling Shared Task to predict personal attributes, as per the project brief. Four corpora, each in a different language, were provided. Each corpus consisted of collections of tweets for a number of Twitter users whose gender, age and personality scores are known. The task was to construct a system capable of inferring the same attributes for as-yet-unseen authors. Our system uses two sets of text-based features, n-grams and topic models, in conjunction with Support Vector Machines to predict gender, age and personality scores. We ran our system on each dataset and obtained results indicating that n-grams and topic models are effective features across a number of languages.
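
The n-gram feature step mentioned in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the choice of character trigrams, the lowercasing, the relative-frequency weighting, and all function names and sample tweets below are assumptions made for illustration only. The resulting per-author vectors are the kind of input a Support Vector Machine could be trained on.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def author_profile(tweets, n=3):
    """Pool one author's tweets into relative n-gram frequencies.

    Assumption: character n-grams over lowercased text; the report
    does not state which n-gram type or weighting it used.
    """
    counts = Counter()
    for tweet in tweets:
        counts.update(char_ngrams(tweet.lower(), n))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def to_vector(profile, vocabulary):
    """Project a profile onto a fixed, ordered vocabulary."""
    return [profile.get(gram, 0.0) for gram in vocabulary]


# Hypothetical toy data: author id -> list of tweets.
authors = {
    "user_a": ["good morning", "good night"],
    "user_b": ["morning run"],
}

profiles = {name: author_profile(tweets) for name, tweets in authors.items()}

# A shared vocabulary keeps every author's vector the same length.
vocabulary = sorted(set().union(*profiles.values()))
vectors = {name: to_vector(p, vocabulary) for name, p in profiles.items()}
```

Pooling all of an author's tweets before counting reflects that the labels (gender, age, personality scores) apply to the author rather than to individual tweets.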

Document Details

Document Type
Technical Report
Publication Date
Apr 15, 2016
Accession Number
AD1009417

Entities

People

  • Mark Stevenson

Organizations

  • University of Sheffield

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Air Force Research Laboratories
  • Computer Science
  • Contracts
  • Electronic Mail
  • Feature Extraction
  • Language
  • Machine Learning
  • Media
  • Network Science
  • Online Communications
  • Psychology
  • Social Media
  • Social Networking Services
  • Social Networks
  • Supervised Machine Learning
  • Training

Fields of Study

  • Computer science

Readers

  • Information Retrieval
  • Neural Network Machine Learning
  • Organizational Psychology

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Machine Translation