Discriminating Gender on Twitter

Abstract

Accurate prediction of demographic attributes from social media and other informal online content is valuable for marketing, personalization, and legal investigation. This paper describes the construction of a large, multilingual dataset labeled with gender, and investigates statistical models for determining the gender of uncharacterized Twitter users.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2011
Accession Number: AD1108485

Entities

People

  • George Kim
  • Guido Zarrella
  • John D. Burger
  • John Henderson

Organizations

  • MITRE Corporation

Tags

Communities of Interest

  • Autonomy

DTIC Thesaurus Topics

  • Computational Linguistics
  • Computational Science
  • Data Mining
  • Information Science
  • Language
  • Linguistics
  • Machine Learning
  • Motor Skills
  • Natural Languages
  • Online Communications
  • Social Media
  • Supervised Machine Learning
  • Test Sets
  • Websites

Readers

  • Database Systems and Applications
  • Information Retrieval
  • Organizational Psychology