Automatic Text Categorization Applied to E-Mail
Abstract
The author developed an automatic text categorization approach and investigated its application to categorizing email. The approach is derived from an instance-based learning method that explores the conditional probabilities of particular words. Its effectiveness was evaluated on collections drawn from a set of emails and assigned a numerical score based on precision and recall: precision was 65% and recall was 17%. The author's experiments indicated that automatic categorization of incoming email at the client level is possible, but is difficult without a standardized corpus. Word frequency is valuable, but should be used in combination with other methods, such as phrase extraction, for a higher level of performance.
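For readers unfamiliar with the two measures quoted in the abstract, the following is a minimal sketch of how precision and recall are conventionally computed from classification counts. The function names and the sample counts are hypothetical and are not taken from the report. The abstract does not say how the two figures were combined into a single score; the F1 measure (harmonic mean) shown here is one common choice and is an assumption, not the author's stated method.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of items assigned to a category that belong in it."""
    assigned = true_pos + false_pos
    return true_pos / assigned if assigned else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of items belonging to a category that were assigned to it."""
    relevant = true_pos + false_neg
    return true_pos / relevant if relevant else 0.0


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (an assumed combined score)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Hypothetical counts for one category, for illustration only.
    tp, fp, fn = 3, 1, 5
    print(precision(tp, fp), recall(tp, fn))

    # The precision and recall reported in the abstract, combined with F1.
    print(round(f1(0.65, 0.17), 2))
```

Under this assumed measure, the reported figures combine to an F1 of about 0.27, which reflects how low recall pulls the combined score down even when precision is moderate.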
Document Details
- Document Type: Technical Report
- Publication Date: Sep 01, 2002
- Accession Number: ADA406989
Entities
People
- Scott R. Hall
Organizations
- Naval Postgraduate School