Performance Implications of Synchronization Support for Parallel FORTRAN Programs

Abstract

This paper studies the performance implications of architectural synchronization support for automatically parallelized numerical programs. As the basis for this work, we analyze the synchronization needs of such programs, which arise from task management, loop scheduling, barriers, and data dependency handling. We present synchronization algorithms for efficient execution of programs with nested parallel loops. Next, we identify how various hardware synchronization primitives can be used to satisfy these software synchronization needs. The primitives studied are test-and-set, fetch-and-add, exchange-byte, and a synchronization-bus implementation of lock/unlock operations. Lastly, we ran experiments to quantify the impact of these forms of architectural support on the performance of a bus-based shared-memory multiprocessor running automatically parallelized numerical programs. We found that supporting an atomic fetch-and-add primitive in shared memory is as effective as supporting lock/unlock operations with a synchronization bus. Both achieve substantial performance improvement over the cases where atomic test-and-set and exchange-byte operations are supported in shared memory.
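
As an illustration of the loop-scheduling need named in the abstract, the sketch below shows one common way a fetch-and-add counter can hand out iterations of a parallel loop to workers. It is not taken from the report: the names `FetchAndAddCounter`, `self_scheduled_loop`, and `demo` are hypothetical, and because Python exposes no hardware fetch-and-add, the primitive is emulated here with a lock purely to show the calling pattern.

```python
import threading


class FetchAndAddCounter:
    """Software stand-in for a shared-memory fetch-and-add cell (assumption:
    atomicity is emulated with a lock, not provided by hardware)."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()

    def fetch_and_add(self, delta=1):
        # Return the old value and add delta as one indivisible step.
        with self._lock:
            old = self._value
            self._value += delta
            return old


def self_scheduled_loop(n_iterations, n_workers, body):
    """Run body(i) for i in range(n_iterations), with each worker claiming
    its next iteration index through a single fetch-and-add."""
    next_index = FetchAndAddCounter(0)

    def worker():
        while True:
            i = next_index.fetch_and_add(1)  # claim one iteration
            if i >= n_iterations:            # no iterations left
                break
            body(i)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # joining all workers plays the role of the end-of-loop barrier


def demo():
    n = 1000
    result = [0] * n

    def body(i):
        result[i] += i * i  # each index is claimed by exactly one worker

    self_scheduled_loop(n, 4, body)
    return result
```

Calling `demo()` fills `result[i]` with `i * i` for every index exactly once, regardless of how the iterations happen to be distributed across the four workers.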

Document Details

Document Type
Technical Report
Publication Date
Jun 17, 1991
Accession Number
ADA238493

Entities

People

  • Sadun Anik
  • Wen-mei Hwu

Organizations

  • University of Illinois Urbana–Champaign

Tags

Communities of Interest

  • Space

DTIC Thesaurus Topics

  • Application Software
  • Biological Sciences
  • Classification
  • Computer Programming
  • Computer Programs
  • Computers
  • Contracts
  • Engineering
  • Fluid Dynamics
  • High Performance Computing
  • Illinois
  • Parallel Computing
  • Parallel Processing
  • Security
  • Simulations
  • Simulators
  • Universities

Fields of Study

  • Computer science
  • Engineering

Readers

  • Parallel and Distributed Computing