Superimposed Coding Versus Sequential and Inverted Files.

Abstract

The relative efficiency of three computer search algorithms was compared for searching large bibliographic files with Boolean search strategies. The sequential and inverted files represent the two most common file structures used today for bibliographic searching. Superimposed coding is an alternative that is becoming more attractive as the speed of computers improves. The superimposed search associates a key with each record in the database; the key acts as a screen that eliminates the majority of records from further consideration. The keys are based on the bigrams and trigrams contained in the record, and are arranged in a linear file. The sequential search is a character-by-character scan of the entire file. This search is facilitated by constructing a finite state machine at the beginning of the search to match the search terms. The inverted file is fairly standard, except for the use of bit vectors to hold the postings of very common entries. A database of 100,000 INSPEC records, from nine months of 1974, was used for testing the algorithms with 339 real-life search questions.
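
The screening idea in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the key width (`KEY_BITS`), the use of CRC-32 as the hash, one bit per n-gram, and the function names are all assumptions made for illustration. Each record's bigrams and trigrams are hashed into a single bit-vector key; a record can contain a search term only if every bit of the term's key is also set in the record's key, and the survivors are then checked against the text itself because the screen may pass records that do not actually match.

```python
import zlib

KEY_BITS = 256  # assumed key width; the abstract does not state the real one


def ngrams(text, sizes=(2, 3)):
    """Yield the bigrams and trigrams of each whitespace-separated word."""
    for word in text.lower().split():
        for n in sizes:
            for i in range(len(word) - n + 1):
                yield word[i:i + n]


def make_key(text):
    """Superimpose (OR together) one hashed bit per n-gram into an integer key."""
    key = 0
    for gram in ngrams(text):
        key |= 1 << (zlib.crc32(gram.encode("utf-8")) % KEY_BITS)
    return key


def may_contain(record_key, term):
    """Screen: False means the term is certainly absent; True means 'maybe'."""
    term_key = make_key(term)
    return (record_key & term_key) == term_key


def search_all_terms(records, terms):
    """Boolean AND query: screen with the keys, then confirm on the text."""
    keys = [make_key(rec) for rec in records]  # the linear file of keys
    hits = []
    for index, (rec, key) in enumerate(zip(records, keys)):
        if all(may_contain(key, term) for term in terms):
            # The screen can pass non-matching records, so verify by scanning.
            if all(term.lower() in rec.lower() for term in terms):
                hits.append(index)
    return hits


records = [
    "Inverted file structures for bibliographic search",
    "Superimposed coding of bigrams and trigrams",
    "Sequential scan with a finite state machine",
]
print(search_all_terms(records, ["coding", "trigrams"]))
```

Only the keys are touched for records that fail the screen, which is why the approach is attractive when most records can be eliminated without reading their full text.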

Document Details

Document Type: Technical Report
Publication Date: Mar 01, 1977
Accession Number: ADA040685

Entities

People

Thomas Butler Hickey

Organizations

University of Illinois Urbana–Champaign
