The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces (Preprint)

Abstract

Feature based similarity search is emerging as an important search paradigm in database systems. The technique used is to map the data items as points into a high dimensional feature space which is indexed using a multidimensional data structure. Similarity search then corresponds to a range search over the data structure. Although several data structures have been proposed for feature indexing, none of them is known to scale beyond 10-15 dimensional spaces. This paper introduces the hybrid tree a multidimensional data structure for indexing high dimensional feature spaces. Unlike other multidimensional data structures, the hybrid tree cannot be classified as either a pure data partitioning (DP) index structure (e.g., R-tree, SS-tree, SRtree) or a pure space partitioning (SP) one (e.g., KDB-tree, hBtree); rather, it combines positive aspects of the two types of index structures a single data structure to achieve search performance more scalable to high dimensionalities than either of the above techniques (hence, the name hybrid ). Furthermore, unlike many data structures (e.g., distance based index structures like SS-tree, SR-tree), the hybrid tree can support queries based on arbitrary distance functions. Our experiments on real high dimensional large size feature databases demonstrate that the hybrid tree scales well to high dimensionality and large database sizes. It significantly outperforms both purely DP-based and SP-based index mechanisms as well as linear scan at all dimensionalities for large sized databases.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1999
Accession Number
ADA466131

Entities

People

  • Kaushik Chakrabarti
  • Sharad Mehrotra

Organizations

  • University of Illinois Urbana–Champaign

Tags

Communities of Interest

  • Air Platforms

DTIC Thesaurus Topics

  • Algorithms
  • Boundaries
  • Classification
  • Coding
  • Computer Science
  • Databases
  • Decoding
  • Dimensionality Reduction
  • Fourier Transformation
  • Guarantees
  • Information Operations
  • Optimization
  • Probability
  • Probability Distributions
  • Scalability
  • Splitting
  • Trees (Data Structures)

Fields of Study

  • Computer science

Readers

  • Computer Vision.
  • Neural Network Machine Learning.

Technology Areas

  • Space