Fast, Accurate, and Flexible Algorithms for Dense Subtensor Mining

Abstract

Given a large-scale and high-order tensor, how can we detect dense subtensors in it? Can we spot them in near-linear time but with quality guarantees? Extensive previous work has shown that dense subtensors, as well as dense subgraphs, indicate anomalous or fraudulent behavior (e.g., lockstep behavior in social networks). However, available algorithms for detecting dense subtensors are not satisfactory in terms of speed, accuracy, and flexibility. In this work, we propose two algorithms, called M-Z oom and M-B iz , for fast and accurate dense-subtensor detection with various density measures. M-Z oom gives a lower bound on the density of detected subtensors, while M-B iz guarantees the local optimality of detected subtensors. M-Z oom and M-B iz can be combined, giving the following advantages: (1) Scalable: scale near-linearly with all aspects of tensors and are up to 114× faster than state-of-the-art methods with similar accuracy, (2) Provably accurate : provide a guarantee on the lowest density and local optimality of the subtensors they find, (3) Flexible: support multi-subtensor detection and size bounds as well as diverse density measures, and (4) Effective: successfully detected edit wars and bot activities in Wikipedia, and spotted network attacks from a TCP dump with near-perfect accuracy (AUC = 0.98).

Document Details

Document Type
Pub Defense Publication
Publication Date
Jan 23, 2018
Source ID
10.1145/3154414

Entities

People

  • Bryan Hooi
  • Christos Faloutsos
  • Kijung Shin

Organizations

  • Carnegie Mellon University
  • National Science Foundation
  • United States Army Research Laboratory

Tags

Fields of Study

  • Computer science

Readers

  • Information Retrieval
  • Parallel and Distributed Computing.
  • Sensor Fusion and Tracking Systems.