Extraction of Key Words from News Stories

Abstract

In this work, we consider the task of extracting key words, namely key-players, key-locations, key-nouns and key-verbs, from news stories. We cast this as a classification problem in which we assign an appropriate label to each word in a news story. We considered three statistical models: the naive Bayes model, the hidden Markov model and the maximum entropy model. We also experimented with various features. Our results indicate that a maximum entropy model that ignores contextual features and uses only word-based features, combined with stopword removal and stemming, yields the best performance. We also found that extracting key-verbs and key-nouns is a much harder problem than extracting key-players and key-locations.
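
As a rough illustration of the per-word classification framing described in the abstract, the following is a minimal sketch of a naive Bayes word labeler that uses only a word-identity feature after stopword removal and stemming. It is not the report's implementation: the stopword list, the crude suffix-stripping `stem` function, the `WordNaiveBayes` class, the smoothing constant, the fallback label "other" and the toy training pairs are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "with"}


def stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(word):
    """Lowercase, drop stopwords (returns None), then stem."""
    w = word.lower()
    return None if w in STOPWORDS else stem(w)


class WordNaiveBayes:
    """Naive Bayes over a single word-identity feature, no context."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha                      # additive smoothing constant
        self.label_counts = Counter()           # training words per label
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, tagged_words):
        for word, label in tagged_words:
            feat = normalize(word)
            if feat is None:                    # stopwords are never trained on
                continue
            self.label_counts[label] += 1
            self.feature_counts[label][feat] += 1
            self.vocab.add(feat)

    def predict(self, word):
        feat = normalize(word)
        if feat is None:                        # stopwords are never key words
            return "other"
        total = sum(self.label_counts.values())
        v = len(self.vocab) + 1                 # +1 slot for unseen words
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            likelihood = (self.feature_counts[label][feat] + self.alpha) / (
                n + self.alpha * v
            )
            score = math.log(n / total) + math.log(likelihood)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training pairs (word, label) purely for illustration.
train = [
    ("Clinton", "key-player"), ("visited", "key-verb"),
    ("Paris", "key-location"), ("the", "other"),
    ("summit", "key-noun"), ("officials", "other"),
    ("visiting", "key-verb"), ("said", "other"), ("Tuesday", "other"),
]
model = WordNaiveBayes()
model.fit(train)
print(model.predict("visits"), model.predict("Paris"), model.predict("the"))
```

Because stemming maps "visited", "visiting" and "visits" to the same feature, the toy model can label an unseen inflection from its training examples. A maximum entropy model, which the abstract reports as the best performer, would replace the count-based estimates with learned feature weights.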

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2004
Accession Number: ADA477769

Entities

People

James Allan
Ramesh Nallapati
Sridhar Mahadevan

Organizations

University of Massachusetts Amherst
