Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
Abstract
We address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report tagging results nearing 90% accuracy. The data and tools have been made available to the research community with the goal of enabling richer text analysis of Twitter and related social media data sets.
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2010
- Accession Number: ADA547371
Entities
People
- Brendan T. O'Connor
- Dani Yogatama
- Daniel Mills
- Dipanjan Das
- Jacob Eisenstein
- Jeffrey Flanigan
- Kevin Gimpel
- Michael Heilman
- Nathan Schneider
- Noah A. Smith
Organizations
- Carnegie Mellon University