Evaluating Stream Filtering for Entity Profile Updates in TREC 2012, 2013, and 2014 (KBA Track Overview, Notebook Paper)

Abstract

The Knowledge Base Acceleration (KBA) track ran in TREC 2012, 2013, and 2014 as an entity-centric filtering evaluation. The track evaluates systems that filter a time-ordered corpus for documents and slot fills that would change the profile of an entity drawn from a predefined list. Compared with the 2012 and 2013 evaluations, the 2014 evaluation introduced several refinements, including high-quality community metadata produced by running Raytheon/BBN's Serif named entity recognizer, sentence parser, and relation extractor on 579,838,246 English documents in the corpus. We also expanded the query entities to be primarily long-tail entities that lacked Wikipedia profiles, simplified the scoring for streaming slot filling (SSF), and added a third task component to highlight creative systems that used the KBA data. A successful KBA system must do more than resolve the meaning of entity mentions by linking documents to the KB: it must also distinguish novel, "vitally" relevant documents and slot fills that would change a target entity's profile. This combines thinking from natural language understanding (NLU) and information retrieval (IR). Filtering tracks in TREC have typically used queries based on topics described by a set of keyword queries or short descriptions, and annotators have generated relevance judgments based on their personal interpretation of the topic.

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2014
Accession Number: ADA618626

Entities

People

  • Daniel A. Roberts
  • Ellen Voorhees
  • Ian Soboroff
  • John R. Frank
  • Max Kleiman-Weiner

Organizations

  • Massachusetts Institute of Technology

Tags

Communities of Interest

  • Autonomy
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Communities
  • Computer Languages
  • Filtration
  • Geographic Regions
  • Information Retrieval
  • Information Science
  • Judgment
  • Language
  • Lessons Learned
  • Linguistics
  • Machine Learning
  • Natural Language Processing
  • Natural Language Understanding
  • Natural Languages
  • Ontologies
  • Pattern Recognition
  • Test And Evaluation

Readers

  • Computational Linguistics
  • Information Retrieval

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval