EXPERIMENTAL DESIGN FOR MEASURING THE INTRA- AND INTER-GROUP CONSISTENCY OF HUMAN JUDGMENT OF RELEVANCE.

Abstract

The suspected variability of humans in judging the relevance of documents is one of the current problems confronting the development and improvement of document information and retrieval systems. The purpose of this thesis was to design a method to investigate the variation of relevance judgments between two groups of analysts and among the analysts within each group. A pilot experiment was conducted using two groups of analysts (subject experts and non-experts) and two question-document collections (machine retrieved and randomly selected). Analysts were instructed to mark each document relevant or not-relevant to the given question and to record the time required to make such relevance assessments. The responses were analyzed statistically. The data permitted the following conclusions: (1) the analysts within the groups could consistently agree on the relevance of documents to questions; (2) the degree of consistency of the two groups did not differ significantly; (3) the two groups did agree on the relevance of a particular document to a question; and (4) the method of document selection had a serious effect only on the consistency of the group of non-experts.
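The abstract does not say which statistic was used to analyze the responses. As a rough illustration only, the sketch below shows one simple way binary relevance marks could be compared within a group (mean pairwise agreement) and between two groups (agreement of per-document majority marks). The function names, the tie rule, and the sample marks are all hypothetical and are not taken from the report.

```python
from itertools import combinations


def pairwise_agreement(judgments):
    """Mean fraction of documents on which each pair of analysts agrees.

    judgments: one list of 0/1 relevance marks per analyst,
    all lists covering the same documents in the same order.
    """
    pairs = list(combinations(judgments, 2))
    if not pairs:
        raise ValueError("need at least two analysts")
    total = 0.0
    for a, b in pairs:
        total += sum(x == y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


def majority_vote(judgments):
    """Per-document majority mark for a group.

    Assumption for this sketch: ties count as relevant.
    """
    n = len(judgments)
    return [1 if 2 * sum(col) >= n else 0 for col in zip(*judgments)]


def between_group_agreement(group_a, group_b):
    """Fraction of documents on which the two groups' majority marks match."""
    votes_a = majority_vote(group_a)
    votes_b = majority_vote(group_b)
    return sum(x == y for x, y in zip(votes_a, votes_b)) / len(votes_a)


# Made-up marks for illustration: 3 analysts per group, 4 documents.
experts = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 0, 0]]
non_experts = [[1, 1, 0, 0], [0, 1, 0, 0], [1, 1, 1, 0]]

within_experts = pairwise_agreement(experts)
within_non_experts = pairwise_agreement(non_experts)
between_groups = between_group_agreement(experts, non_experts)
```

Raw agreement of this kind does not correct for agreement expected by chance, and it says nothing about statistical significance; a real analysis of such data would need a proper test.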

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 1965
Accession Number
AD0620342

Entities

People

  • John Marion Hoffman

Organizations

  • Georgia Tech

Tags

DTIC Thesaurus Topics

  • Consistency
  • Data Science
  • Experimental Design
  • Information Science
  • Judgment

Readers

  • Information Retrieval
  • Regression Analysis
  • Team-Based Human-Centered Cognitive Task Decision Making and Information Performance