Efficient coflow scheduling with Varys

Abstract

Communication in data-parallel applications often involves a collection of parallel flows. Traditional techniques to optimize flow-level metrics do not perform well in optimizing such collections, because the network is largely agnostic to application-level requirements. The recently proposed coflow abstraction bridges this gap and creates new opportunities for network scheduling. In this paper, we address inter-coflow scheduling for two different objectives: decreasing communication time of data-intensive jobs and guaranteeing predictable communication time. We introduce the concurrent open shop scheduling with coupled resources problem, analyze its complexity, and propose effective heuristics to optimize either objective. We present Varys, a system that enables data-intensive frameworks to use coflows and the proposed algorithms while maintaining high network utilization and guaranteeing starvation freedom. EC2 deployments and trace-driven simulations show that communication stages complete up to 3.16× faster on average and up to 2× more coflows meet their deadlines using Varys in comparison to per-flow mechanisms. Moreover, Varys outperforms non-preemptive coflow schedulers by more than 5×.

Document Details

Document Type: Pub Defense Publication
Publication Date: Aug 17, 2014
Source ID: 10.1145/2740070.2626315

Entities

People

  • Ion Stoica
  • Mosharaf Chowdhury
  • Yuan Zhong

Organizations

  • Amazon Web Services
  • Columbia University
  • Defense Advanced Research Projects Agency
  • Division of Computing and Communication Foundations
  • University of California, Berkeley

Tags

Fields of Study

  • Computer science

Readers

  • Computer Networking
  • Distributed Systems and Data Platform Development
  • Parallel and Distributed Computing