Work stealing for interactive services to meet target latency

Abstract

Interactive web services increasingly drive critical business workloads such as search, advertising, games, shopping, and finance. Whereas the optimization of parallel programs and distributed server systems has historically focused on average latency and throughput, the primary metric for interactive applications is consistent responsiveness, i.e., minimizing the number of requests that miss a target latency. This paper is the first to show how to generalize work stealing, which is traditionally used to minimize the makespan of a single parallel job, to optimize for a target latency in interactive services with multiple parallel requests.

Document Details

Document Type: Pub Defense Publication
Publication Date: Feb 27, 2016
Source ID: 10.1145/3016078.2851151

Entities

People

Chenyang Lu
I-Ting Angelina Lee
Kathryn S. McKinley
Kunal Agrawal
Li Jing
Sameh Elnikety
Yuxiong He

Organizations

Microsoft
National Science Foundation
Office of Naval Research
Washington University in St. Louis
