A Software Architecture for Scalable Simulations on Heterogeneous Networks

Abstract

Our initial research since the inception of this project has established the viability and effectiveness of each of the novel aspects of our approach to massively concurrent computing, viz. a specialized communications substrate, threads-based programming, semi-automatic domain-specific parallelization, and failure-resilient computing. Experience with a few, but nonetheless representative, classes of applications has demonstrated the efficacy of our prototype systems and has established the justification for building robust, production-quality versions thereof. In the process, our work has also highlighted several interesting research issues from the computer systems and computer science points of view. We have focused on four layers of a software architecture: domain layers (ParaSol and EcliPSe) and support layers (Ariadne for threads, and Conch/Clam for communication). The architecture uses a two-level control hierarchy, exploiting locality and new (UDP-based) network protocols to reduce communication times, enhance functionality, and improve performance. The modification involves designating specific processes within the host pool as sub-domain servers (SDS). The first process to be initiated on each subnet automatically instantiates a thread that serves as the primary SDS for that subset of the host pool, with subsequent processes assuming secondary responsibility in the case of failures. Each SDS is responsible for process and thread creation/destruction on its machines, as well as for collecting load and resource availability statistics. By exchanging control messages among themselves, the SDS threads provide hints to the higher layers (EcliPSe, Ariadne, and ParaSol) to enable workload allocation based on balanced distribution of computation and on minimized communication across subnet boundaries.
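
The primary/secondary SDS designation described in the abstract can be illustrated with a minimal Python sketch. It is not the report's implementation: the class names (`Subnet`, `HostPool`), the method names, and the policy of promoting the oldest surviving process after a failure are all assumptions made for illustration. The abstract states only that the first process on a subnet becomes the primary SDS and that later processes take secondary responsibility.

```python
from dataclasses import dataclass, field


@dataclass
class Subnet:
    """Processes on one subnet, kept in the order they were started."""
    name: str
    processes: list = field(default_factory=list)

    def join(self, pid):
        """Register a process and return its SDS role on this subnet."""
        self.processes.append(pid)
        # The first process to start on the subnet hosts the primary SDS.
        return "primary" if len(self.processes) == 1 else "secondary"

    def primary(self):
        """Return the process currently hosting the primary SDS, if any."""
        return self.processes[0] if self.processes else None

    def fail(self, pid):
        """Drop a failed process and return the (possibly new) primary.

        Assumption: the oldest surviving process is promoted.
        """
        self.processes.remove(pid)
        return self.primary()


class HostPool:
    """Hypothetical top level of the two-level control hierarchy."""

    def __init__(self):
        self.subnets = {}

    def start_process(self, subnet_name, pid):
        """Start a process on a subnet, creating the subnet entry on first use."""
        subnet = self.subnets.setdefault(subnet_name, Subnet(subnet_name))
        return subnet.join(pid)


pool = HostPool()
roles = [
    pool.start_process("subnet-a", "p1"),
    pool.start_process("subnet-a", "p2"),
    pool.start_process("subnet-b", "p3"),
]
new_primary = pool.subnets["subnet-a"].fail("p1")
```

In this sketch `p1` and `p3` become primaries for their subnets, and `p2` takes over on `subnet-a` once `p1` fails. Load collection and the exchange of control messages between SDS threads are left out.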

Document Details

Document Type: Technical Report
Publication Date: Sep 01, 2001
Accession Number: ADA396043

Entities

People

  • V. Sunderam
  • Vernon Rego

Organizations

  • Purdue University

Tags

DTIC Thesaurus Topics

  • Classification
  • Communication Systems
  • Computations
  • Computer Programming
  • Computer Science
  • Computers
  • Computing System Architectures
  • High Performance Computing
  • Network Protocols
  • Parallel Computing
  • Parallel Processing
  • Prototypes
  • Simulations
  • Software Design
  • Software Development
  • Substrates
  • Workload

Fields of Study

  • Computer science

Readers

  • Astronomy/Astrophysics
  • Neural Network Machine Learning
  • Software Engineering