Text Summarization Evaluation: Correlating Human Performance on an Extrinsic Task with Automatic Intrinsic Metrics

Abstract

This research describes two types of summarization evaluation methods, intrinsic and extrinsic, and concentrates on determining the level of correlation between automatic intrinsic methods and human performance on a task-based extrinsic evaluation. It details suggested experiments and preliminary findings on these correlations and on the factors that affect them (method of summarization, quality of summary, type of intrinsic method used, and genre of source documents). A new measurement technique for task-based evaluations, Relevance Prediction, is introduced and contrasted with the gold-standard-based measurements currently used by the summarization evaluation community. Preliminary experimental findings suggest that the Relevance Prediction method yields better performance measurements with human summaries than the LDC-Agreement method does, and that small correlations exist between one of the automatic intrinsic evaluation metrics and human task-based performance results.
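
The correlation the abstract refers to can be pictured with a minimal sketch. It assumes a Pearson correlation computed over per-system scores; the abstract does not say which correlation statistic the report uses, so that choice is an assumption. The function name `pearson` and the sample numbers are hypothetical placeholders and are not taken from the report.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance, up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances, up to the same constant factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only; NOT results from the report.
# One entry per summarization system: an automatic intrinsic metric score
# and the human accuracy observed on the extrinsic task for that system.
intrinsic_metric_scores = [0.10, 0.20, 0.30, 0.40]
human_task_accuracy = [0.55, 0.50, 0.70, 0.65]

r = pearson(intrinsic_metric_scores, human_task_accuracy)
print(f"correlation between metric and task performance: {r:.3f}")
```

A value near 1 would mean the automatic metric ranks systems much as the human task results do; a value near 0 would mean it tells little about task performance.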

Document Details

Document Type
Technical Report
Publication Date
May 01, 2006
Accession Number
ADA455670

Entities

People

  • Bonnie J. Dorr
  • Stacy F. President

Organizations

  • University of Maryland

Tags

Communities of Interest

  • Energy and Power Technologies
  • Human Systems

DTIC Thesaurus Topics

  • Accuracy
  • Automated Text Summarization
  • Computational Linguistics
  • Computational Science
  • Information Processing
  • Information Retrieval
  • Information Science
  • Language
  • Law
  • Linguistics
  • Machine Translation
  • Motor Skills
  • Natural Language Processing
  • Psychology
  • Test And Evaluation
  • United States
  • Web Browsers

Readers

  • Computational Linguistics
  • Instructional Design and Training Evaluation
  • Materials Science and Engineering