BIT and Purdue at TREC-KBA-CCR Track 2014

Abstract

This report summarizes our participation in the KBA-CCR track at TREC 2014. Our submissions are generated in two steps: (1) filtering a collection of candidate documents from the stream corpus for a set of target entities; and (2) estimating the relevance levels between candidate documents and target entities. Three kinds of approaches are employed in the second step: query expansion, classification, and learning to rank. Query expansion is an unsupervised baseline that combines an entity and its related entities into a query to retrieve relevant documents. Query expansion performs well in the vital + useful scenario, since filtering a relevant document set from the stream corpus is not difficult. In the vital-only scenario, however, supervised approaches are more powerful than query expansion at identifying vital documents for target entities. Our results reveal that learning-to-rank approaches are more suitable for CCR under the current evaluation methodology.
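The query expansion baseline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the retrieval model, so the term-overlap scoring below is an assumption chosen for simplicity, and the function names (`tokenize`, `expand_query`, `score`, `rank`) and the toy entity are hypothetical, not taken from the report.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(entity, related_entities):
    """Combine the target entity name and its related entity names
    into a single bag of query terms (term -> weight)."""
    terms = Counter(tokenize(entity))
    for name in related_entities:
        terms.update(tokenize(name))
    return terms


def score(document, query_terms):
    """Assumed scoring: sum of query weight times term frequency in the document."""
    doc_counts = Counter(tokenize(document))
    return sum(weight * doc_counts[term] for term, weight in query_terms.items())


def rank(documents, entity, related_entities):
    """Return (score, document) pairs sorted by descending score."""
    query = expand_query(entity, related_entities)
    scored = [(score(doc, query), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy data, invented for illustration only.
docs = [
    "Babbage designed the engine",
    "Ada Lovelace worked with Charles Babbage",
    "weather report for tomorrow",
]
ranked = rank(docs, "Ada Lovelace", ["Charles Babbage"])
```

In this sketch a document mentioning both the target entity and a related entity outranks one that mentions only the related entity, which is the intuition behind expanding the query. Being unsupervised, a score of this kind measures topical overlap only and has no training signal for separating vital from merely useful documents, which is where the abstract reports supervised approaches doing better.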


Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2014
Accession Number
ADA618570

Entities

People

  • Dandan Song
  • Jingang Wang
  • Lejian Liao
  • Luo Si
  • Ning Zhang
  • Zhiwei Zhang

Organizations

  • Purdue University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Classification
  • Computer Science
  • Computers
  • Data Science
  • Equations
  • Filters
  • Filtration
  • Information Operations
  • Information Retrieval
  • Information Science
  • Learning
  • New York
  • Standards
  • Statistics
  • Test And Evaluation
  • Training

Fields of Study

  • Computer science

Readers

  • Information Retrieval
  • Instructional Design and Training Evaluation
  • Speech Processing/Speech Recognition