Sorting by Recursive Partitioning,

Abstract

This document presents a new O(n lg lg n) time sort algorithm that is more robust than O(n) distribution sorting algorithms. The algorithm uses a recursive partition-concatenate approach, partitioning each set into a variable number of subsets using information gathered dynamically during execution. Sequences are partitioned using statistical information computed during the sort for each sequence. Space complexity is O(n) and is independent from the order and distribution of the data. If the data is originally in a list, only O(square root of (n) extra space is necessary. The algorithm is insensitive to the initial ordering of the data, and it is much less sensitive to the distribution of the values of the sorting keys than distribution sorting algorithms. Its worst-case time is O(n lg lg n) across all distributions that satisfy a new fractalness criterion. This condition, which is sufficient but not necessary, is satisfied by any set with bounded length keys and bounded repetition of each key. If this condition is not satisfied, its worst case performance degrades gracefully to O(n lg n). In practice, this occurs when the density of the distribution over omega(n) of the keys is a fractal curve (for sets of numbers whose values are bounded), or when the distribution has very heavy tails with arbitrarily long keys (for sets of numbers whose precision is bounded). In some preliminary tests, it was faster than Quicksort for sets of more than 150 elements. The algorithm is practical, works basically in place, can be easily implemented and is particularly well suited both for parallel processing and for external sorting. (Author)

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 1983
Accession Number
ADA141583

Entities

People

  • D. M. Chapiro

Organizations

  • Stanford University

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Algorithms
  • Artificial Intelligence
  • Computer Programming
  • Computer Science
  • Computers
  • Distribution Curves
  • Distribution Functions
  • Equations
  • Gaussian Distributions
  • Information Processing
  • Information Science
  • Information Transfer
  • Normal Distribution
  • Parallel Computing
  • Parallel Processing
  • Probability
  • Statistical Analysis

Fields of Study

  • Computer science

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design.
  • Graph Algorithms and Convex Optimization.
  • Regression Analysis.

Technology Areas

  • Space