Interactive Information Organization: Techniques and Evaluation
Abstract
The explosive growth of digital information available on-line and the ubiquity of the Internet require the development of effective techniques for information search and access. Locating interesting information on the World Wide Web is the main task of on-line search engines. Such engines accept a query from a user and respond with a list of documents or web pages that are considered relevant to the query. The pages are ranked by their likelihood of being relevant to the user's request. The majority of today's Web search engines follow this scenario. The ordering of documents in the ranked list is simple and intuitive, and the user is expected to follow the list while examining the retrieved documents. In practice, browsing the ranked list is rather tedious and often unproductive. Existing evidence shows that users quite often stop and do not venture beyond the first screen of results or the top 10 retrieved documents. In this thesis, the author studies alternative document organization techniques that can help users find relevant information in the retrieved data much more quickly than with a ranked list. He introduces a novel evaluation approach that is based in part on modeling system-user interaction. It allows one to separate the user's effect on the overall performance from the system's qualities. He applies this evaluation method to two different document organization techniques. The first technique uses a clustering algorithm to partition the document set into well-defined groups. The second technique applies a multidimensional scaling algorithm called spring-embedding to represent documents as objects in space, positioned so that the distances between them reflect inter-document similarity. The results show that both systems can be used much more effectively than the ranked list approach. The author also uses a reinforcement learning algorithm to build a "wizard" tool that helps the user navigate the system. This wizard provides better support than traditional relevance feedback.
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 2001
- Accession Number: ADA441132
Entities
People
- Anton Leuski
Organizations
- University of Massachusetts Amherst