TREC2001 Question-Answer, Web and Cross Language Experiments Using PIRCS
Abstract
We applied our PIRCS system to the Question-Answer (QA) track, the ad-hoc Web retrieval track using the 10-GB collection, and the English-Arabic cross-language track. We also attempted the adaptive filtering experiments with our upgraded programs but did not have sufficient time to complete them. The QA track requires 50-byte answer strings for 500 questions (later reduced to 492). The answers are to be retrieved from documents drawn from the TREC collections: AP1-3, WSJ1-2, SJMN-3, FT-4, LA-5 and FBIS-5. Our QA system is built on classical IR methods, enhanced with simple heuristics. It has no natural language understanding capabilities, but employs simple pattern matching and statistics. We view QA as a three-step process: (1) retrieving a set of documents that are highly related to the topic of the question; (2) weighting the sentences in this document set that are most likely to answer the question according to the query type and its description; and (3) selecting words from the top-scoring sentences to form the answer string. This approach was quite successful for the 250-byte answer task at TREC-9. This year we added more heuristics, better pattern recognition and entity recognition.
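The three-step process described in the abstract can be illustrated with a toy sketch. This is not the PIRCS implementation: the term-overlap scoring, the stopword list, the single year pattern for "when" questions, and every function name below are assumptions made for illustration only.

```python
import re

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "it", "to", "and",
             "who", "what", "when", "where", "which", "did"}

# Assumed answer pattern for "when" questions: a four-digit year.
YEAR = re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b")


def tokenize(text):
    """Lowercase content terms with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOPWORDS]


def answer_pattern(question):
    """Crude query typing: only 'when' questions get an answer pattern."""
    words = question.lower().split()
    return YEAR if words and words[0] == "when" else None


def retrieve(question, documents, k=2):
    """Step 1: keep the k documents sharing the most terms with the question."""
    q = set(tokenize(question))
    ranked = sorted(documents,
                    key=lambda d: sum(1 for t in tokenize(d) if t in q),
                    reverse=True)
    return ranked[:k]


def score_sentences(question, docs):
    """Step 2: weight sentences by term overlap, plus a bonus for a type match."""
    q = set(tokenize(question))
    pattern = answer_pattern(question)
    scored = []
    for doc in docs:
        for sent in re.split(r"(?<=[.!?])\s+", doc):
            overlap = len(q & set(tokenize(sent)))
            bonus = 1 if pattern is not None and pattern.search(sent) else 0
            scored.append((overlap + bonus, sent))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def select_answer(question, sentence, max_bytes=50):
    """Step 3: pick words from the sentence to form a short answer string."""
    pattern = answer_pattern(question)
    if pattern is not None:
        match = pattern.search(sentence)
        if match:
            return match.group(0)
    # Fallback: keep sentence words that do not already appear in the question.
    q = set(tokenize(question))
    words = [w for w in re.findall(r"[A-Za-z0-9]+", sentence)
             if w.lower() not in q and w.lower() not in STOPWORDS]
    raw = " ".join(words).encode("utf-8")[:max_bytes]
    return raw.decode("utf-8", "ignore")


def answer(question, documents):
    """Run the three steps and return an answer of at most 50 bytes."""
    scored = score_sentences(question, retrieve(question, documents))
    return select_answer(question, scored[0][1]) if scored else ""
```

Calling `answer(question, documents)` with a question string and a list of document strings returns a short answer string. The 50-byte limit mirrors the track requirement stated in the abstract; everything else is a simplification.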
Document Details
- Document Type: Technical Report
- Publication Date: Jan 01, 2006
- Accession Number: ADA456272
Entities
People
- K. L. Kwok
- L. Grunfeld
- M. Chan
- N. Dinstl
Organizations
- Queens College