Deep Versus Broad Methods for Automatic Extraction of Intelligence Information From Text
Abstract
Extraction of intelligence from text data is increasingly automated as software and network technologies increase in speed and scope. However, enormous amounts of text data are often available, and one must carefully design a data mining strategy to obtain the relevant nuggets of gold from the mountains of useless dross. Two strategies can be tried. A deep approach uses a few strong clues to find reasonable sentence candidates, then applies linguistic restrictions to find and extract key information (if any) surrounding the candidates. A broad approach focuses on large numbers of weaker clues, such as specific words, whose implications can be combined to rate sentences and present those with a high likelihood of relevance. In the work reported here, we tested the deep approach on military intelligence reports about enemy positions, which were relatively short text extracts, and we tested the broad approach on news stories from the World Wide Web involving terrorism, which presented a large volume of text information.
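The contrast between the two strategies can be illustrated with a minimal Python sketch. It is not the report's implementation. The clue words, their weights, the threshold, the "grid" trigger, and the sentence pattern are all hypothetical choices made only to show the shape of each approach: the broad scorer sums many weak word clues per sentence, while the deep extractor filters on one strong clue and then applies a restrictive pattern to pull out the key fields.

```python
import re

# Hypothetical weak clues with illustrative weights (not from the report).
WEAK_CLUES = {"bomb": 2.0, "attack": 1.5, "explosion": 1.8,
              "hostage": 1.7, "police": 0.4}

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def broad_score(sentence):
    """Broad approach: add up the weights of all distinct clue words present."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return sum(WEAK_CLUES.get(w, 0.0) for w in words)

def broad_rank(text, threshold=2.0):
    """Return (score, sentence) pairs at or above the threshold, best first."""
    scored = [(broad_score(s), s) for s in split_sentences(text)]
    return sorted([p for p in scored if p[0] >= threshold], reverse=True)

# Hypothetical restrictive pattern for the deep approach: a unit phrase,
# a sighting verb, and a six-digit grid reference.
POSITION = re.compile(
    r"\b(?P<unit>enemy \w+(?: \w+)?) (?:is|was|are|were) "
    r"(?:located|observed|sighted) at grid (?P<grid>\d{6})\b",
    re.IGNORECASE,
)

def deep_extract(text):
    """Deep approach: keep sentences with the strong clue, then extract fields."""
    results = []
    for sentence in split_sentences(text):
        if "grid" not in sentence.lower():  # strong-clue filter
            continue
        match = POSITION.search(sentence)
        if match:
            results.append((match.group("unit"), match.group("grid")))
    return results
```

In this sketch the broad scorer tolerates noisy, high-volume text because no single word decides the outcome, whereas the deep extractor returns structured fields but only for sentences that fit its narrow pattern.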
Document Details
- Document Type: Technical Report
- Publication Date: Jun 01, 2005
- Accession Number: ADA464116
Entities
People
- Jason Sparks
- Jonathan Vorrath
- Jonathan Wintrode
- Matthew Lear
- Neil C. Rowe
Organizations
- Naval Postgraduate School