SIMFINDER: A Flexible Clustering Tool for Summarization

Abstract

We present a statistical similarity measuring and clustering tool, SIMFINDER, that organizes small pieces of text from one or multiple documents into tight clusters. By placing highly related text units in the same cluster, SIMFINDER enables a subsequent content selection/generation component to reduce each cluster to a single sentence, either by extraction or by reformulation. We report on improvements in the similarity and clustering components of SIMFINDER, including a quantitative evaluation, and establish the generality of the approach by interfacing SIMFINDER to two very different summarization systems.
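
The pipeline the abstract describes (score similarity between text units, group related units, then reduce each cluster to one sentence by extraction) can be sketched roughly as follows. This is only an illustration, not SIMFINDER's actual method: Jaccard word overlap stands in for its statistical similarity measure, a threshold-based single-link grouping stands in for its clustering component, and the function names `similarity`, `cluster`, and `representative`, along with the `threshold` value, are assumptions made for this example.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased set of alphanumeric word tokens in a text unit."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard word overlap in [0, 1]; a stand-in, not SIMFINDER's measure."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cluster(units, threshold=0.5):
    """Group unit indices whose pairwise similarity meets the threshold.

    Uses union-find, so clusters are connected components (single link).
    The threshold of 0.5 is an arbitrary illustrative choice.
    """
    parent = list(range(len(units)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(units)), 2):
        if similarity(units[i], units[j]) >= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(units)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def representative(units, members):
    """Extraction step: pick the member most similar, on average, to the rest."""

    def avg_sim(i):
        others = [j for j in members if j != i]
        if not others:
            return 1.0
        return sum(similarity(units[i], units[j]) for j in others) / len(others)

    return max(members, key=avg_sim)


units = [
    "The storm hit the coast on Monday",
    "On Monday the storm hit the coast hard",
    "Stocks fell sharply in early trading",
]
clusters = cluster(units)
summary = [units[representative(units, members)] for members in clusters]
```

With these three units, the two storm sentences share most of their words and end up in one cluster, while the stock-market sentence forms its own; `summary` then holds one extracted sentence per cluster. A reformulation-based system would replace `representative` with a generation step.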

Document Details

Document Type: Technical Report
Publication Date: Jun 07, 2001
Accession Number: AD1027040

Entities

People

  • Judith L. Klavans
  • Kathleen R. McKeown
  • Melissa L. Holcombe
  • Min-Yen Kan
  • Regina Barzilay
  • Vasileios Hatzivassiloglou

Organizations

  • Columbia University

Tags

Communities of Interest

  • Autonomy
  • Biomedical
  • Engineered Resilient Systems

DTIC Thesaurus Topics

  • Artificial Intelligence
  • Artificial Intelligence Software
  • Automata Theory
  • Automated Text Summarization
  • Computational Linguistics
  • Computational Science
  • Computer Languages
  • Computer Science
  • Human Rights
  • Information Processing
  • Information Retrieval
  • Language
  • Linguistics
  • Machine Learning
  • Natural Language Processing
  • Natural Languages
  • Precision

Fields of Study

  • Computer science

Readers

  • Aerial Delivery - Logistics and Supply Chain Management
  • Computational Linguistics
  • Neural Network Machine Learning