CNIPA, FUB and University of Rome "Tor Vergata" at TREC 2008 Legal Track

Abstract

The TREC Legal track was introduced in TREC 2006 with the stated purpose of evaluating the efficacy of automated support for review and production of electronic records in the context of litigation, regulation and legislation. The TREC 2008 Legal track runs three tasks: (1) an automatic ad hoc task, (2) an automatic relevance feedback task, and (3) an interactive task. We took part only in the automatic ad hoc task of the TREC 2008 Legal track, and focused on the following issues:

1. Indexing. The CDIP test collection is characterized by a large number of unique terms due to OCR mistakes. We have defined a term selection strategy to reduce the number of terms, as described in Section 2.
2. Querying. The analysis of past TREC results for the Legal track showed that the best retrieval strategy essentially returned a ranked list of the documents retrieved by the boolean query. As a consequence, we have defined a strategy aimed at boosting the score of documents that satisfy the final negotiated boolean query. Furthermore, we have defined a method for the automatic construction of a weighted query from the request text, as reported in Section 3.
3. Estimation of the K value. We have used a query performance prediction approach to try to estimate K values. The query weighting model that we have adopted is described in Section 4.
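
The querying idea in item 2 can be illustrated with a minimal sketch. The abstract does not state how the boost is computed, so the multiplicative factor below is purely an assumption made for illustration, and the names `boost_boolean_matches`, `scores`, `boolean_hits` and `boost` are hypothetical rather than taken from the report.

```python
from typing import Dict, List, Set, Tuple


def boost_boolean_matches(
    scores: Dict[str, float],
    boolean_hits: Set[str],
    boost: float = 2.0,
) -> List[Tuple[str, float]]:
    """Re-rank documents, favouring those that satisfy the boolean query.

    scores       -- ranked-retrieval score per document id
    boolean_hits -- ids of documents matching the negotiated boolean query
    boost        -- assumed multiplicative factor (not from the report)
    """
    adjusted = {
        doc: score * boost if doc in boolean_hits else score
        for doc, score in scores.items()
    }
    # Sort by descending score; break ties by document id for determinism.
    return sorted(adjusted.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data, invented for the example.
ranked_scores = {"d1": 1.0, "d2": 0.8, "d3": 0.6}
boolean_hits = {"d2", "d3"}
ranking = boost_boolean_matches(ranked_scores, boolean_hits)
```

With these toy scores, the two documents that match the boolean query move ahead of `d1`, even though `d1` had the highest original score. An additive bonus or a two-tier ranking would be equally plausible readings of the abstract.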

Document Details

Document Type
Technical Report
Publication Date
Nov 01, 2008
Accession Number
ADA512716

Entities

People

  • Alessandro Celi
  • Giambattista Amati
  • Giorgio Gambosi
  • Giovanni Stilo
  • Marco Bianchi
  • Mauro Draoli

Organizations

  • Fondazione Ugo Bordoni

Tags

DTIC Thesaurus Topics

  • Abstracts
  • Automatic
  • Computer Science
  • Data Science
  • Discriminant Analysis
  • Information Operations
  • Information Processing
  • Information Retrieval
  • Information Science
  • Law
  • Public Administration
  • Standards
  • Statistics
  • Universities
  • Very Low Frequency

Readers

  • Computational Linguistics
  • Information Retrieval

Technology Areas

  • Microelectronics