Implementation and Performance Exploration of a Cross-Genre Part of Speech Tagging Methodology to Determine Dialog Act Tags in the Chat Domain
Abstract
Internet Relay Chat is a popular means of communication. Because chat data does not follow established grammatical rules, traditional machine learning algorithms perform poorly in tasks such as part-of-speech and dialog-act tagging, yet the volume of data created makes human analysis impractical. We present a cross-genre part-of-speech tagging methodology and analyze its effectiveness in determining the dialog-act classes of chat posts. Previous methods for determining part-of-speech tags focused on accuracy, were computationally expensive, and required human verification. We show that part-of-speech tags produced by our cross-genre maximum likelihood estimation tagger perform virtually identically to hand-tagged parts of speech in this task, and that accurate part-of-speech tags are not required for acceptable automatic dialog-act determination. Furthermore, we show that a simple naive Bayes classifier matches the performance of a carefully trained neural network in a fraction of the time.
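The pipeline the abstract describes can be illustrated with a minimal Python sketch. It is not the report's implementation, and this record does not give the actual procedure. The sketch assumes that maximum likelihood estimation tagging means giving each word the tag it most often carried in an out-of-genre training corpus, and that the naive Bayes classifier treats a post's part-of-speech tags as a bag of features with add-one smoothing. All function names, the `UNK` fallback tag, and the toy tags and dialog-act labels are hypothetical.

```python
from collections import Counter, defaultdict
import math


def train_mle_tagger(tagged_sentences):
    """Map each word to the tag it most often carries in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_post(tagger, tokens, unknown_tag="UNK"):
    """Tag a chat post; words unseen in training get a placeholder tag (assumption)."""
    return [tagger.get(token.lower(), unknown_tag) for token in tokens]


def train_naive_bayes(samples):
    """samples: iterable of (list_of_pos_tags, dialog_act_label) pairs."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for tags, label in samples:
        label_counts[label] += 1
        for tag in tags:
            feature_counts[label][tag] += 1
            vocab.add(tag)
    return label_counts, feature_counts, vocab


def classify(model, tags):
    """Return the label with the highest log posterior under add-one smoothing."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(feature_counts[label].values()) + len(vocab)
        for tag in tags:
            score += math.log((feature_counts[label][tag] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy out-of-genre training data (hypothetical words and tags).
tagged = [
    [("hi", "UH"), ("all", "DT")],
    [("where", "WRB"), ("are", "VBP"), ("you", "PRP")],
    [("you", "PRP"), ("are", "VBP"), ("here", "RB")],
]
tagger = train_mle_tagger(tagged)

# Toy dialog-act training data over POS-tag features (hypothetical labels).
model = train_naive_bayes([
    (["UH", "DT"], "Greet"),
    (["WRB", "VBP", "PRP"], "whQuestion"),
    (["PRP", "VBP", "RB"], "Statement"),
])

predicted = classify(model, tag_post(tagger, ["where", "are", "you"]))
```

Both stages reduce to counting and table lookups, which fits the abstract's claim that a simple classifier runs in a fraction of the time a neural network needs to train.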
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2010
- Accession Number: ADA531452
Entities
People
- J. R. Hitt
Organizations
- Naval Postgraduate School