Automated Story Capture From Internet Weblogs

Abstract

One of the most interesting ways that people share knowledge is through the telling of stories, i.e., first-person narratives about real-life experiences. Millions of these stories appear in Internet weblogs, offering a potentially valuable resource for future knowledge management and training applications. In this paper we describe efforts to automatically capture stories from Internet weblogs using statistical text classification techniques. We evaluate the precision and recall performance of competing approaches. We also describe the large-scale application of story extraction technology to Internet weblogs, producing a corpus of stories with over a billion words.

Document Details

Document Type: Technical Report
Publication Date: Jan 01, 2007
Accession Number: ADA470419

Entities

People

Andrew S. Gordon
Qun Cao
Reid Swanson

Organizations

University of Southern California
