Superimposed Coding Versus Sequential and Inverted Files
Abstract
The relative efficiency of three computer search algorithms was compared for searching large bibliographic files with Boolean search strategies. The sequential and inverted files represent the two most common file structures used today for bibliographic searching. Superimposed coding is an alternative that is becoming more attractive as the speed of computers improves. The superimposed search associates a key with each record in the data base; the key acts as a screen that eliminates the majority of records from further consideration. The keys are based on the bigrams and trigrams contained in the record, and are arranged in a linear file. The sequential search is a character-by-character scan of the entire file, facilitated by constructing a finite state machine at the beginning of the search to match the search terms. The inverted file is fairly standard, except for the use of bit vectors to hold the postings of very common entries. A data base of 100,000 INSPEC records, from nine months of 1974, was used for testing the algorithms with 339 real-life search questions.
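The screening idea in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the key width (`SIG_BITS`), the use of CRC-32 as the hash, and the function names `ngrams`, `signature`, and `search` are all assumptions chosen for illustration. Each record's key is formed by OR-ing together one bit per bigram and trigram; a record can contain a search term only if every bit of the term's key is also set in the record's key, so most records are rejected without being scanned.

```python
import zlib

SIG_BITS = 256  # assumed key width; the abstract does not state one


def ngrams(text, sizes=(2, 3)):
    """Return the set of bigrams and trigrams in the lower-cased text."""
    s = text.lower()
    return {s[i:i + n] for n in sizes for i in range(len(s) - n + 1)}


def signature(text):
    """Superimpose (OR together) one hashed bit per n-gram into a single key."""
    sig = 0
    for gram in ngrams(text):
        # CRC-32 is an assumed stand-in for whatever hash the report used.
        sig |= 1 << (zlib.crc32(gram.encode("utf-8")) % SIG_BITS)
    return sig


def search(records, keys, term):
    """Screen records by key, then confirm survivors with a direct scan."""
    query = signature(term)
    hits = []
    for record, key in zip(records, keys):
        if key & query != query:
            continue  # screened out: some n-gram of the term cannot be present
        if term.lower() in record.lower():  # removes false drops
            hits.append(record)
    return hits


records = [
    "Superimposed coding for text retrieval",
    "Inverted file organisation with bit vectors",
    "Finite state machines for string matching",
]
keys = [signature(r) for r in records]  # the linear file of keys
matches = search(records, keys, "bit vector")
```

Because several n-grams can hash to the same bit, the screen can pass a record that does not contain the term (a false drop), which is why the sketch confirms each surviving record with a direct scan. It can never reject a record that does contain the term.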
Document Details
- Document Type: Technical Report
- Publication Date: Mar 01, 1977
- Accession Number: ADA040685
Entities
People
- Thomas Butler Hickey
Organizations
- University of Illinois Urbana–Champaign