New Streaming Algorithms for Fast Detection of Superspreaders

Abstract

High-speed monitoring of Internet traffic is an important and challenging problem, with applications to real-time attack detection and mitigation, traffic engineering, etc. However, packet-level monitoring requires fast streaming algorithms that use very little memory space and little communication among collaborating network monitoring points. In this paper, we consider the problem of detecting superspreaders, which are sources that connect to a large number of distinct destinations. We propose several new streaming algorithms for detecting superspreaders, and prove guarantees on their accuracy and memory requirements. We also show experimental results on real network traces. Our algorithms are substantially more efficient (both theoretically and experimentally) than previous approaches. We also provide several extensions to our algorithms -- we show how to identify superspreaders in a distributed setting, with sliding windows, and when deletions are allowed in the stream. More generally, our algorithms are applicable to any problem that can be formulated as follows: given a stream of (x,y) pairs, find all the x's that are paired with a large number of distinct y's. We call this the heavy distinct-hitters problem. There are many network security applications of this general problem. This paper discusses these and other applications, and for concreteness, focuses on the superspreader problem.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
May 01, 2004
Accession Number
ADA461026

Entities

People

  • Avrim Blum
  • Dawn Song
  • Phillip B. Gibbons
  • Shobha Venkataraman

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Cyber

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Change Detection
  • Computer Network Security
  • Computer Science
  • Computers
  • Databases
  • Detection
  • Hash Tables
  • Information Operations
  • Intrusion Detection
  • Lists (Data Structures)
  • Measurement
  • Network Protocols
  • Port Scanners
  • Precision
  • Probability

Fields of Study

  • Computer science

Readers

  • Computer Networking
  • Distributed Systems and Data Platform Development
  • Operations Research

Technology Areas

  • Cyber
  • Space