High Productivity MPI - Grid, Multi-Cluster, and Embedded System Extensions

Abstract

High Productivity MPI is an approach to extending MPI to support multiple implementations (IMPI, IMPI-2), owner domains, architectures, networks, operating systems, faults, and interacting dynamic groups without relying on a two-level implementation (one MPI implementation calling another). MPI implementations must be able to connect, reconnect, and work well with dynamic, intermittent resources, under the expectation that user applications will also become somewhat fault-aware in order to retain scalability. This paper addresses the concerns that arise in offering composable sessions in which MPIs from multiple vendors can be supported (starting from, but not ending with, the IMPI protocol). It reports experiences with IMPI and offers a new proposal, IMPI-2. It also addresses specific issues in interoperating the full range of MPI-2 services, which to our knowledge have not been addressed elsewhere. The results of this work are open specifications, together with our own vendor implementation of these MPI capabilities. Other open and commercial MPIs could adopt IMPI plus these extensions in order to participate in hierarchical, heterogeneous grid computing settings, without mandating new MPI implementations in such settings. The authors' goal in offering these new protocols as proposals is to encourage the High Productivity Computing world to enter into significant discussions about their adoption, and to offer these capabilities without mandating grid-computing infrastructure.

Document Details

Document Type
Technical Report
Publication Date
Sep 30, 2004
Accession Number
ADA433252

Entities

People

  • Anthony Skjellum
  • Kumaran Rajaram
  • Pirabhu Raman
  • Puri Banglore
  • Rossen Dimitrov

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Abstracts
  • Bandwidth
  • Best Practices
  • Case Studies
  • Computing System Architectures
  • Electronic Mail
  • Embedded Systems
  • Fault Tolerance
  • Information Operations
  • Infrastructure
  • Network Protocols
  • Operating Systems
  • Parallel Computing
  • Productivity
  • Standards
  • Symposia
  • Teamwork

Fields of Study

  • Computer science

Readers

  • Distributed Systems and Data Platform Development
  • Economics
  • Parallel and Distributed Computing