Controlling Memory Access Concurrency in Efficient Fault-Tolerant Parallel Algorithms
Abstract
The CRCW PRAM under dynamic fail-stop (no restart) processor behavior is a fault-prone multiprocessor model for which it is possible both to guarantee reliability and to preserve efficiency. To handle dynamic faults, some redundancy is necessary, in the form of many processors concurrently performing a common read or write task. In this paper we show how to significantly decrease this concurrency by bounding it in terms of the number of actual processor faults. We describe a low-concurrency, efficient, and fault-tolerant algorithm for the Write-All primitive: "using at most N processors, write 1's into N locations". This primitive can serve as the basis for efficient fault-tolerant simulations, on fault-prone PRAMs, of algorithms written for fault-free PRAMs. For any dynamic failure pattern F, our algorithm has total write concurrency at most |F| and total read concurrency at most 7|F| log N, where |F| is the number of processor faults (for example, there is no concurrency in a run without failures). By contrast, previous algorithms used Ω(N log N) concurrency even in the absence of faults. We also describe a technique for limiting the per-step concurrency and present an optimal fault-tolerant EREW PRAM algorithm for Write-All when all processor faults are initial.
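As an illustration of the quantities the abstract refers to, the following is a minimal sketch of a naive Write-All simulation under fail-stop faults. It is not the algorithm of the report. The function name `write_all_sweep`, the `fail_at` map, and the `staggered` flag are hypothetical, and write concurrency is counted here, as an assumption, as the number of writers beyond the first at each cell in each step.

```python
def write_all_sweep(n, fail_at, staggered=False):
    """Naive Write-All with n processors and n cells (illustrative only).

    fail_at maps a processor id to the step at which it stops for good
    (fail-stop, no restart); processors not listed never fail.
    Returns (cells, work, write_concurrency).
    """
    cells = [0] * n
    work = 0               # total number of processor steps performed
    write_concurrency = 0  # assumed measure: writers beyond the first, per cell per step
    for step in range(n):
        writers = {}
        for pid in range(n):
            if fail_at.get(pid, n) <= step:
                continue  # this processor has already stopped
            # Every live processor sweeps all cells; staggering shifts the start.
            target = (step + (pid if staggered else 0)) % n
            writers[target] = writers.get(target, 0) + 1
            work += 1
        for target, count in writers.items():
            cells[target] = 1
            write_concurrency += count - 1
    return cells, work, write_concurrency


if __name__ == "__main__":
    print(write_all_sweep(4, {}))
    print(write_all_sweep(4, {}, staggered=True))
    print(write_all_sweep(4, {0: 0, 1: 2}))
```

In this sketch every cell is written as long as one processor survives, but the work is quadratic in N, and the unstaggered sweep incurs concurrency even in a failure-free run. The abstract's claim is that work efficiency and concurrency bounded by the number of faults can be achieved together.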
Document Details
- Document Type: Technical Report
- Publication Date: May 16, 1994
- Accession Number: ADA280909
Entities
People
- Alex A. Shvartsman
- Dimitrios Michailidis
- Paris C. Kanellakis
Organizations
- Brown University