Adaptive Resource Management for Deployable HPC Systems

Abstract

The goal of this project was to develop techniques for continual reallocation of resources to maintain application performance despite statically unpredictable changes in resource demands. Research targeted multi-application systems executing on HPC (High Performance Computing) platforms. The project, Adaptive Resource Management (ARM), built on the results of a previous program called Adaptive Resource Allocation (ARA). In ARA, Honeywell developed techniques for dynamic reallocation of resources to single parallel applications, structured as multi-pipelines, executing on a high-performance parallel machine. ARM extended the ARA results to systems with multiple applications and multiple machines connected over a network. In October 1997, DARPA merged the technical effort on this project with the RTARM project funded under Quorum. This did not affect the core statement of work for ARM, but it led to an extension of its completion date. ARM focused on developing an approach based on adaptation models and addressed best-effort resource allocation in an environment with partitionable rather than shared resources. Parallel HPC platforms were de-emphasized in favor of general distributed computing platforms. Results from ARM are being integrated into RTARM. The layered architecture of ARM has given way to a hierarchical architecture characterized by uniformity across different levels. The MPI-based communication infrastructure in ARM has given way to a CORBA ORB infrastructure. While the ARM implementation was targeted to Unix machines connected over Ethernet, the target platform for RTARM consists of Windows NT machines networked over ATM.

Document Details

Document Type
Technical Report
Publication Date
Apr 01, 2000
Accession Number
ADA378149

Entities

People

  • Rakesh Jha

Organizations

  • Honeywell International, Inc.

Tags

Communities of Interest

  • C4I
  • Human Systems
  • Materials and Manufacturing Processes
  • Sensors
  • Weapons Technologies

DTIC Thesaurus Topics

  • Air Force Research Laboratories
  • Change Detection
  • Climate Change
  • Computer Programming
  • Computers
  • Control Systems
  • Detectors
  • Distributed Computing
  • Embedded Systems
  • High Performance Computing
  • Information Exchange
  • Information Systems
  • Network Protocols
  • Operating Systems
  • Probability
  • Resource Management
  • Target Recognition

Fields of Study

  • Computer science
  • Engineering

Readers

  • Parallel and Distributed Computing
  • Software Engineering