Understanding TCP Incast and Its Implications for Big Data Workloads

Abstract

TCP incast is a recently identified network transport pathology that affects many-to-one communication patterns in datacenters. It is caused by a complex interplay between datacenter applications, the underlying switches, network topology, and TCP, which was originally designed for wide-area networks. Incast increases the queuing delay of flows and decreases application-level throughput to far below the link bandwidth. The problem especially affects computing paradigms in which distributed processing cannot progress until all parallel threads in a stage complete. Examples of such paradigms include distributed file systems, web search, advertisement selection, and other applications with partition or aggregation semantics [5, 18, 25].

Document Details

Document Type
Technical Report
Publication Date
Apr 06, 2012
Accession Number
ADA561775

Entities

People

  • David Zats
  • Randy H. Katz
  • Rean Griffith
  • Yanpei Chen

Organizations

  • University of California, Berkeley

Tags

Communities of Interest

  • Materials and Manufacturing Processes

DTIC Thesaurus Topics

  • Big Data
  • California
  • Collapse
  • Commerce
  • Computations
  • Computer Science
  • Data Analysis
  • Electrical Engineering
  • Electronic Commerce
  • Fault Tolerance
  • Flow Rate
  • Measurement
  • Network Topology
  • Networks
  • Standards
  • Transport Protocols
  • Workload

Fields of Study

  • Computer science

Readers

  • Computer Networking
  • Distributed Systems and Data Platform Development
  • Theoretical Analysis