Entity Came to Rescue - Leveraging Entities to Minimize Risks in Web Search

Abstract

We summarize our work in the TREC 2014 Web Track. We participated in both the ad hoc task and the risk-sensitive task, and explored two entity-based approaches to evaluate how well leveraging entities improves retrieval effectiveness and robustness. Both approaches integrate the entities related to a query, together with an entity model derived from a knowledge base, into the retrieval model. The first approach, entity-centric query expansion, integrates the related entities into the original query model to perform query expansion; documents are then retrieved based on the expanded query model. The second approach, Latent Entity Space (LES), models the relevance between query and document in a latent space; to estimate the entity model, it leverages the publicly available Freebase annotations of ClueWeb12 as well as the Freebase API. Each dimension of the latent space is represented by an entity, and query-document relevance is estimated based on the projections of the query and the document onto each dimension. The evaluation results on the ad hoc task show that entities can indeed bring further improvements in Web document retrieval performance when combined with the axiomatic retrieval model with semantic expansion, one of the state-of-the-art methods. Furthermore, results on the risk-sensitive task demonstrate that our proposed model also has an advantage in minimizing retrieval risk.
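The two approaches described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: query and entity models are treated as term-probability dictionaries, expansion is a linear interpolation with a hypothetical weight `lam`, and the LES score is taken as the sum over entities of the product of the query and document projections. The function names `expand_query` and `les_score` are hypothetical and do not come from the report.

```python
from typing import Dict


def expand_query(query_model: Dict[str, float],
                 entity_model: Dict[str, float],
                 lam: float = 0.5) -> Dict[str, float]:
    """Interpolate the original query model with an entity-derived model.

    Assumption: both models map terms to probabilities, and `lam` controls
    how much weight the related entities receive.
    """
    terms = set(query_model) | set(entity_model)
    return {
        t: (1.0 - lam) * query_model.get(t, 0.0) + lam * entity_model.get(t, 0.0)
        for t in terms
    }


def les_score(query_proj: Dict[str, float],
              doc_proj: Dict[str, float]) -> float:
    """Score a document in a latent entity space.

    Assumption: each key is an entity (one latent dimension), each value is
    the projection of the query or document onto that entity, and relevance
    is the sum of the per-entity products.
    """
    return sum(p_q * doc_proj.get(entity, 0.0)
               for entity, p_q in query_proj.items())


# Toy usage with made-up projections onto three entities.
expanded = expand_query({"delaware": 1.0}, {"university": 1.0}, lam=0.5)
score = les_score({"e1": 0.5, "e2": 0.5}, {"e1": 0.2, "e3": 0.8})
```

Only entities shared by the query and the document contribute to the score in this sketch, which is one way to read the statement that relevance is estimated from projections onto each entity dimension.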

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2014
Accession Number
ADA618586

Entities

People

  • Hui Fang
  • Peiling Yang
  • Xitong Liu

Organizations

  • University of Delaware

Tags

DTIC Thesaurus Topics

  • Abstracts
  • American Revolution
  • Delaware
  • Equations
  • Extraction
  • Information Operations
  • Language
  • Maryland
  • Maximum Likelihood Estimation
  • Named Entity Recognition
  • North America
  • Probability
  • Random Variables
  • Space Based
  • Standards
  • United States

Fields of Study

  • Computer science

Readers

  • Computational Linguistics
  • Information Retrieval
  • Systems Analysis and Design

Technology Areas

  • Space