Resource Allocation in Massively Heterogeneous Computer Systems: A Distributed Approach
Abstract
Major Goals: The goal of this project is to develop distributed placement and scheduling algorithms for heterogeneous computing jobs that run across a network of heterogeneous computing devices. We characterize heterogeneity as follows. Computing jobs may arrive in the system at different times and are characterized by their resource requirements, which may span multiple resource types, e.g., both compute power and memory. Devices, in turn, are characterized by their heterogeneous resource availability, e.g., different amounts of CPU or GPU resources, memory, etc. These devices may even follow different computing paradigms, e.g., CPUs versus GPUs, and the amount of each resource they have available varies over time. Our algorithms for matching jobs to devices over time should account for the heterogeneity of both devices and jobs, and are designed to scale to the potentially massive number of jobs and devices present. While centralized matching algorithms make it easy to coordinate the assignment of jobs to devices, they may not scale well to massive numbers of jobs and devices. We therefore focus on distributed algorithms that empower users and devices to find a mutually satisfying matching that meets job needs within device resource constraints. In particular, our framework is based on distributed pricing algorithms, in which devices announce virtual prices for their resources and users attempt to allocate their jobs to resources so as to incur the lowest cost. These prices indicate the capacity limitations of each device relative to users' demands for its resources, and thus serve as a means for users to indirectly coordinate their job scheduling and placement. The approach therefore requires little exchange of knowledge between devices and users: devices set prices based on resource availability, and users react based on their job requirements.
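The pricing idea described above can be illustrated with a minimal sketch. It is not the project's algorithm: the abstract does not specify how prices are set, so the rule used here (each resource's price equals its current utilization fraction) is an assumption chosen for simplicity, and the names `Device`, `cost`, and `place` are hypothetical. Each arriving job sees only the posted prices, picks the cheapest device that can hold it, and the device's prices rise as its resources fill up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Resources = Dict[str, float]  # resource name -> amount, e.g. {"cpu": 4, "mem": 8}


@dataclass
class Device:
    """A device that posts one virtual price per resource type (hypothetical)."""
    name: str
    capacity: Resources
    used: Resources = field(default_factory=dict)

    def prices(self) -> Resources:
        # Assumed pricing rule: price = fraction of the resource already in use.
        return {r: self.used.get(r, 0.0) / cap
                for r, cap in self.capacity.items() if cap > 0}

    def fits(self, demand: Resources) -> bool:
        # A job fits only if every demanded resource has enough free capacity.
        return all(self.capacity.get(r, 0.0) - self.used.get(r, 0.0) >= amt
                   for r, amt in demand.items())

    def accept(self, demand: Resources) -> None:
        # Reserve the resources; this raises the prices seen by later jobs.
        for r, amt in demand.items():
            self.used[r] = self.used.get(r, 0.0) + amt


def cost(prices: Resources, demand: Resources) -> float:
    """Total virtual cost of a job at the posted prices."""
    return sum(prices.get(r, 0.0) * amt for r, amt in demand.items())


def place(demand: Resources, devices: List[Device]) -> Optional[Device]:
    """The user's side: choose the cheapest feasible device, or None."""
    candidates = [d for d in devices if d.fits(demand)]
    if not candidates:
        return None
    best = min(candidates, key=lambda d: cost(d.prices(), demand))
    best.accept(demand)
    return best


if __name__ == "__main__":
    devices = [
        Device("cpu-node", {"cpu": 8.0, "mem": 16.0}),
        Device("gpu-node", {"cpu": 4.0, "mem": 32.0, "gpu": 2.0}),
    ]
    # Jobs arrive one at a time with multi-resource requirements.
    for job in [{"cpu": 4.0, "mem": 8.0}, {"cpu": 2.0, "mem": 4.0},
                {"gpu": 1.0, "mem": 4.0}]:
        chosen = place(job, devices)
        print(job, "->", chosen.name if chosen else "unplaced")
```

In this sketch the only information that flows from devices to users is the price vector, and users never communicate with each other, which mirrors the low-coordination property the abstract emphasizes. A fuller treatment would also need to handle job departures and prices that account for differing resource units.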
Document Details
- Document Type: Technical Report
- Publication Date: Jun 06, 2022
- Accession Number: AD1210623
Entities
People
- Carlee Joe-Wong
Organizations
- Carnegie Mellon University