Browsing, Discovery and Search in Large Distributed Databases of Complex and Scanned Documents.

Abstract

This project aims to integrate powerful new techniques for interactive browsing, discovery and retrieval in very large, distributed databases of complex and scanned documents. Emphasis is placed on going beyond the full-text retrieval techniques developed in the DARPA TIPSTER program to support different types of access and non-textual content. These techniques should be particularly relevant to the patent domain, where it is important to find relationships between documents and where the patent or trademark may be based on a visual design. The specific tasks identified involve studying representation techniques for long documents with complex structure, browsing and discovery techniques for large text databases, image retrieval and scanned document retrieval techniques, and architectures for large, distributed databases.

Document Details

Document Type
Technical Report
Publication Date
Oct 05, 1999
Accession Number
ADA368979

Entities

People

  • W. Bruce Croft

Organizations

  • University of Massachusetts Amherst

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Classification
  • Computer Science
  • Data Analysis
  • Databases
  • Demonstrations
  • Frequency Domain
  • Information Processing
  • Information Retrieval
  • Judgment
  • Language
  • Models
  • Sampling
  • Trademarks
  • United States
  • Visualizations

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence
  • Computational Linguistics
  • Computer Vision