WHU at TREC KBA Vital Filtering Track 2014

Abstract

This paper describes the participation of the WHU IRLAB in the Vital Filtering task of the TREC 2014 Knowledge Base Acceleration (KBA) Track. For this task, we implemented a system that detects vital documents, which a human editor could use to update or create the profile of an entity. We treat the problem as a classification task and use the Stanford NLP toolkit to extract the necessary information. Various kinds of features are used to classify documents into three classes: vital, useful, and non-useful (garbage or neutral). We submitted four runs using different combinations of features. The results are presented and discussed.

Document Details

Document Type: Technical Report
Publication Date: Nov 01, 2014
Accession Number: ADA618666

Entities

People

  • Chuan Wu
  • Pengcheng Zhou
  • Wei Lu
  • Xiaohua Feng

Organizations

  • Wuhan University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Classification
  • Engineering
  • Filtration
  • Information Operations
  • Information Science
  • Machine Learning
  • Schools
  • Social Sciences
  • Standards
  • Statistics
  • Time Intervals
  • Training
  • Universities

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Information Retrieval