University of Glasgow at TREC 2008: Experiments in Blog, Enterprise, and Relevance Feedback Tracks with Terrier
Abstract
In TREC 2008, we participate in the Blog, Enterprise, and Relevance Feedback tracks. In all tracks, we continue the research and development of the Terrier platform centred around extending state-of-the-art weighting models based on the Divergence From Randomness (DFR) framework. In particular, we investigate two main themes, namely, proximity-based models, and collection and profile enrichment techniques based on several resources. In the Blog track, we aim to improve our opinion detection techniques and to integrate various new blog-specific features into our Voting Model. For the baseline ad-hoc task, we aim to build strongly performing baselines by applying two different techniques. The first one boosts documents in which query terms co-occur in a given window size, and the second one applies query expansion using collection enrichment. Non-English documents are also removed from the retrieved results. In the opinion-finding task, we experiment with two main opinion detection approaches. The first one improves our TREC 2007 dictionary-based approach by automatically building an internal opinion dictionary from the collection itself. We measure the opinionated discriminability of each term using an information-theoretic divergence measure based on the relevance assessments of previous years. The second approach is based on the OpinionFinder tool, which identifies subjective sentences in text. In particular, we introduce a novel method to measure the informativeness of query terms occurring in close proximity to subjective sentences. In the blog distillation task, we have two research themes.
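The first baseline technique in the abstract, boosting documents in which query terms co-occur within a given window size, can be illustrated with a minimal sketch. This is not the DFR proximity model used in the report. It only counts sliding windows that contain both terms of a query-term pair, and the function names, the tokenised input, and the default window size are assumptions made for illustration.

```python
from itertools import combinations


def count_window_cooccurrences(doc_tokens, term_a, term_b, window_size):
    """Count sliding windows of `window_size` tokens that contain both terms."""
    n = len(doc_tokens)
    count = 0
    # A document shorter than the window is treated as a single window.
    for start in range(max(1, n - window_size + 1)):
        window = doc_tokens[start:start + window_size]
        if term_a in window and term_b in window:
            count += 1
    return count


def proximity_evidence(doc_tokens, query_terms, window_size=5):
    """Sum the window co-occurrence counts over all distinct query-term pairs.

    Hypothetical helper: a real system would feed these counts into a
    weighting model rather than use the raw sum as a score.
    """
    pairs = combinations(sorted(set(query_terms)), 2)
    return sum(
        count_window_cooccurrences(doc_tokens, a, b, window_size)
        for a, b in pairs
    )


doc = "terrier is a retrieval platform for blog retrieval".split()
print(proximity_evidence(doc, ["blog", "retrieval"], window_size=3))
```

A larger window admits looser co-occurrences, so the count for the same document and query can only stay the same or grow as `window_size` increases up to the document length.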
Document Details
- Document Type: Technical Report
- Publication Date: Nov 01, 2008
- Accession Number: ADA512687
Entities
People
- Ben He
- Craig Macdonald
- Iadh Ounis
- Jie Peng
- Rodrygo L. Santos
Organizations
- University of Glasgow