Scheduling for Locality in Shared-Memory Multiprocessors
Abstract
The last decade has produced enormous improvements in processor speed without a corresponding improvement in bus or interconnection network speeds. As a result, the relative costs of communication and computation in shared-memory multiprocessors have changed dramatically, and many parallel applications do not execute efficiently on today's multiprocessors. In this dissertation we quantify the effect of this architectural trend on parallel program performance, explain its implications for popular parallel programming models, and propose system software that efficiently maps parallel programs and programming models to modern shared-memory multiprocessors. We propose new decomposition and scheduling algorithms that significantly reduce communication overhead. Our experiments on a wide variety of shared-memory multiprocessors demonstrate that the benefits of our scheduling-for-locality algorithms are significant, improving performance by up to 60% for some applications. We conclude that communication overhead need not dominate performance, given an appropriate programming model, multiprogramming scheduling policy, and user-level decomposition and scheduling algorithms.
Keywords
Shared-memory multiprocessors, Architecture trends, Loop scheduling, Lightweight thread scheduling, Multiprogramming
Document Details
- Document Type: Technical Report
- Publication Date: May 01, 1993
- Accession Number: ADA272948
Entities
People
- Evangelos Markatos
Organizations
- University of Rochester