A Machine Learning Approach to Inductive Query by Examples: An Experiment Using Relevance Feedback, ID3, Genetic Algorithms, and Simulated Annealing

Abstract

Information retrieval using probabilistic techniques has attracted significant attention from researchers in information and computer science over the past few decades. In the 1980s, knowledge-based techniques also made an impressive contribution to "intelligent" information retrieval and indexing. More recently, information science researchers have turned to other, newer inductive learning techniques, including symbolic learning, genetic algorithms, and simulated annealing. These newer techniques, which are grounded in diverse paradigms, have provided great opportunities for researchers to enhance the information processing and retrieval capabilities of current information systems. In this article, we first provide an overview of these newer techniques and their use in information retrieval research. To familiarize readers with the techniques, we present three promising methods: the symbolic ID3 algorithm, evolution-based genetic algorithms, and simulated annealing. We discuss their knowledge representations and algorithms in the unique context of information retrieval. An experiment using an 8000-record COMPEN database was performed to compare the performance of these inductive query-by-example techniques with that of the conventional relevance feedback method. The machine learning techniques were shown to help identify new documents that are similar to documents initially suggested by users, as well as documents that contain similar concepts to each other. Genetic algorithms, in particular, were found to outperform relevance feedback in both document recall and precision.

Document Details

Document Type
Technical Report
Publication Date
Jun 01, 1998
Accession Number
ADA573988

Entities

People

  • Anand Iyer
  • Ganesan Shankaranarayanan
  • Hsinchun Chen
  • Linlin She

Organizations

  • University of Arizona

Tags

Communities of Interest

  • Autonomy
  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Cognition
  • Computational Science
  • Computer Science
  • Computers
  • Data Science
  • Databases
  • Genetic Algorithms
  • Information Processing
  • Information Retrieval
  • Information Science
  • Information Systems
  • Machine Learning
  • Neural Networks
  • Probabilistic Models
  • Probability

Fields of Study

  • Computer science

Readers

  • Artificial Intelligence
  • Geospatial Intelligence and Artificial Intelligence Analytics
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Information Retrieval
  • AI & ML - Neural Networks
  • Biotechnology