Extracting Dynamic Evidence Networks
Abstract
BBN's primary goal was to dramatically increase the accuracy of evidence extraction. Using a hybrid of statistical learning algorithms and handcrafted patterns, BBN's SERIF system achieved 93% of human performance in extracting entities, events, and relations, and 96% of human performance in extracting relations given entities and events. A second performance objective was to extract named entities at 80% of human performance; this performance was further improved during the relation extraction work done in 2004. An additional objective was a prototype robust enough to extract evidence continually (24x7) from a daily English news feed. All objectives were achieved. SERIF also represents a significant advance for extraction systems in architecture and implementation. Combining general linguistic models trained on preexisting corpora with domain-specific components trained for the particular task allows powerful linguistic analysis tools to be brought to bear on extracting the relations and events of a new domain. The use of propositions as an intermediate step was an important part of this strategy: propositions encapsulate the literal meaning of the text, and the target relations are then derived from them.
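The idea of deriving target relations from intermediate propositions can be illustrated with a minimal Python sketch. The abstract does not describe SERIF's actual proposition format or rules, so everything here is an assumption made for illustration: the `Proposition` class, the role labels, the `RELATION_RULES` table, and the example data are all hypothetical, and handwritten rules stand in for whatever mix of trained components and handcrafted patterns the real system used.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Proposition:
    """Hypothetical predicate-argument structure for one clause."""
    predicate: str          # head word of the clause, e.g. a verb lemma
    roles: Dict[str, str]   # role label -> entity mention text


# Hypothetical domain-specific rules (not taken from SERIF):
# predicate -> (relation type, role filling slot 1, role filling slot 2)
RELATION_RULES: Dict[str, Tuple[str, str, str]] = {
    "employ": ("EMPLOYMENT", "object", "subject"),
    "acquire": ("OWNERSHIP", "subject", "object"),
}


def derive_relations(props: List[Proposition]) -> List[Tuple[str, str, str]]:
    """Map propositions to (relation type, arg1, arg2) triples."""
    relations = []
    for prop in props:
        rule = RELATION_RULES.get(prop.predicate)
        if rule is None:
            continue  # no domain rule covers this predicate
        rel_type, role1, role2 = rule
        # Emit a relation only when both required roles are filled.
        if role1 in prop.roles and role2 in prop.roles:
            relations.append((rel_type, prop.roles[role1], prop.roles[role2]))
    return relations


# Invented example input: "Acme Corp employs Jane Doe. Jane Doe said ..."
props = [
    Proposition("employ", {"subject": "Acme Corp", "object": "Jane Doe"}),
    Proposition("say", {"subject": "Jane Doe"}),
]
print(derive_relations(props))
```

In this sketch the propositions are domain-independent, while only the small rule table is specific to the target domain, which mirrors the split between general linguistic models and domain-specific components described in the abstract.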
Document Details
- Document Type: Technical Report
- Publication Date: Dec 01, 2004
- Accession Number: ADA429898
Entities
People
- Ralph Weischedel
Organizations
- BBN Technologies