Fault-Tolerance and Efficiency in Massively Parallel Algorithms

Abstract

We present an overview of massively parallel deterministic algorithms which combine high fault-tolerance and efficiency. This desirable combination (called robustness here) is nontrivial, since increasing efficiency implies removing redundancy whereas increasing fault-tolerance requires adding redundancy to computations. We study a spectrum of algorithmic models for which significant robustness is achievable, from static fault, synchronous computation to dynamic fault, asynchronous computation. In addition to fail-stop processor models. we examine and deal with arbitrarily initialized memory and restricted memory access concurrency. We survey the deterministic upper bounds for the basic Write-All primitive, the lower bounds on its efficiency, and we identify some of the key open questions. We also generalize the robust computing of functions to relations; this new approach can model approximate computations. We show how to compute approximate Write-All optimally. Finally, we synthesize the state-of-the-art in a complexity classification, which extends with fault- tolerance the traditional classification of efficient parallel algorithms

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 1993
Accession Number
ADA283255

Entities

People

  • Alex A. Shvartsman
  • Paris O. Kanellakis

Organizations

  • Brown University

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Asynchronous Computation
  • Classification
  • Computations
  • Computer Science
  • Demographic Cohorts
  • Efficiency
  • Fault Tolerance
  • Instructions
  • Iterations
  • Measurement
  • Multithreading
  • Parallel Computing
  • Permutations
  • Simulations
  • Survivability
  • Trees (Data Structures)

Fields of Study

  • Computer science

Readers

  • Applied Combinatorial Optimization and Logic Circuit Design.
  • Neural Network Machine Learning.