Fast, Accurate, and Flexible Algorithms for Dense Subtensor Mining
Abstract
Given a large-scale and high-order tensor, how can we detect dense subtensors in it? Can we spot them in near-linear time but with quality guarantees? Extensive previous work has shown that dense subtensors, as well as dense subgraphs, indicate anomalous or fraudulent behavior (e.g., lockstep behavior in social networks). However, available algorithms for detecting dense subtensors are not satisfactory in terms of speed, accuracy, and flexibility. In this work, we propose two algorithms, called M-Z oom and M-B iz , for fast and accurate dense-subtensor detection with various density measures. M-Z oom gives a lower bound on the density of detected subtensors, while M-B iz guarantees the local optimality of detected subtensors. M-Z oom and M-B iz can be combined, giving the following advantages: (1) Scalable: scale near-linearly with all aspects of tensors and are up to 114× faster than state-of-the-art methods with similar accuracy, (2) Provably accurate : provide a guarantee on the lowest density and local optimality of the subtensors they find, (3) Flexible: support multi-subtensor detection and size bounds as well as diverse density measures, and (4) Effective: successfully detected edit wars and bot activities in Wikipedia, and spotted network attacks from a TCP dump with near-perfect accuracy (AUC = 0.98).
Document Details
- Document Type
- Pub Defense Publication
- Publication Date
- Jan 23, 2018
- Source ID
- 10.1145/3154414
Entities
People
- Bryan Hooi
- Christos Faloutsos
- Kijung Shin
Organizations
- Carnegie Mellon University
- National Science Foundation
- United States Army Research Laboratory