Cluster Scheduling for Explicitly-Speculative Tasks

Abstract

A process scheduler on a shared cluster, grid, or supercomputer that is informed which submitted tasks are possibly unneeded speculative tasks can use this knowledge to better support increasingly prevalent user work habits, lowering user-visible response time, lowering user costs, and increasing resource provider revenue. Large-scale computing often consists of many speculative tasks (tasks that may be canceled) to test hypotheses, search for insights, and review potentially finished products. For example, speculative tasks are issued by bioinformaticists comparing DNA sequences, computer graphics artists rendering scenes, and computer researchers studying caching. This behavior - exploratory searches and parameter studies, made more common by the cost-effectiveness of cluster computing - on existing schedulers without speculative task support results in a mismatch of goals and suboptimal scheduling. Users wish to reduce their time waiting for needed task output and the amount they will be charged for unneeded speculation, making it unclear to the user how many speculative tasks they should submit. This thesis introduces 'batchactive' scheduling (combining batch and interactive characteristics) to exploit the inherent speculation in common application scenarios. With a batchactive scheduler, users submit explicitly-labeled batches of speculative tasks exploring ambitious lines of inquiry, and users interactively request task outputs when these outputs are found to be needed. After receiving and considering an output for some time, a user decides whether to request more outputs, cancel tasks, or disclose new speculative tasks.

Document Details

Document Type
Technical Report
Publication Date
Dec 01, 2004
Accession Number
ADA490320

Entities

People

  • David Petrou

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Advanced Electronics
  • Autonomy
  • Cyber
  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Computer Architecture
  • Computer Graphics
  • Computer Languages
  • Computer Programming
  • Computer Programs
  • Computer Science
  • Computers
  • High Performance Computing
  • Information Science
  • Information Systems
  • Machine Learning
  • Operating Systems
  • Operations Research
  • Parallel Computing
  • Parallel Processing
  • Software Development
  • Systems Engineering

Fields of Study

  • Computer science

Readers

  • Database Systems and Applications
  • Government Contracting/Procurement
  • Parallel and Distributed Computing