Work stealing for interactive services to meet target latency

Abstract

Interactive web services increasingly drive critical business workloads such as search, advertising, games, shopping, and finance. Whereas the optimization of parallel programs and distributed server systems has historically focused on average latency and throughput, the primary metric for interactive applications is consistent responsiveness, i.e., minimizing the number of requests that miss a target latency. This paper is the first to show how to generalize work stealing, which is traditionally used to minimize the makespan of a single parallel job, to optimize for a target latency in interactive services with multiple parallel requests.
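
The difference between optimizing for average latency and optimizing for a target latency can be illustrated with a small sketch. This shows only the metric named in the abstract, not the paper's scheduling algorithm. The function names, the 100 ms target, and the two latency lists are hypothetical values chosen for illustration.

```python
def mean_latency(latencies_ms):
    """Average latency over all requests, the traditional metric."""
    return sum(latencies_ms) / len(latencies_ms)


def missed_target(latencies_ms, target_ms):
    """Number of requests whose latency exceeds the target."""
    return sum(1 for t in latencies_ms if t > target_ms)


# Hypothetical per-request latencies (ms) under two imaginary schedules.
schedule_a = [10, 10, 10, 10, 160]  # lower average, one slow request
schedule_b = [60, 60, 60, 60, 60]   # higher average, uniform latency

TARGET_MS = 100  # assumed target latency, for illustration only

for name, latencies in (("A", schedule_a), ("B", schedule_b)):
    print(name, mean_latency(latencies), missed_target(latencies, TARGET_MS))
```

In this made-up data, schedule A has the better average latency but misses the target once, while schedule B has a worse average and misses it never. Under the target-latency metric, B is preferable.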

Document Details

Document Type
Pub Defense Publication
Publication Date
Feb 27, 2016
Source ID
10.1145/3016078.2851151

Entities

People

  • Chenyang Lu
  • I-Ting Angelina Lee
  • Kathryn S. McKinley
  • Kunal Agrawal
  • Li Jing
  • Sameh Elnikety
  • Yuxiong He

Organizations

  • Microsoft
  • National Science Foundation
  • Office of Naval Research
  • Washington University in St. Louis

Tags

Fields of Study

  • Computer science

Readers

  • Government Contracting/Procurement
  • Operations Research
  • Parallel and Distributed Computing