Efficiently Estimating Motif Statistics of Large Networks

Abstract

Exploring statistics of locally connected subgraph patterns (also known as network motifs) has helped researchers better understand the structure and function of biological and Online Social Networks (OSNs). Nowadays, the massive size of some critical networks—often stored in already overloaded relational databases—effectively limits the rate at which nodes and edges can be explored, making it a challenge to accurately discover subgraph statistics. In this work, we propose sampling methods to accurately estimate subgraph statistics from as few queried nodes as possible. We present sampling algorithms that efficiently and accurately estimate subgraph properties of massive networks. Our algorithms require no precomputation or complete network topology information. At the same time, we provide theoretical guarantees of convergence. We perform experiments using widely known datasets and show that, for the same accuracy, our algorithms require an order of magnitude less queries (samples) than the current state-of-the-art algorithms.

Document Details

Document Type
Pub Defense Publication
Publication Date
Sep 23, 2014
Source ID
10.1145/2629564

Entities

People

  • Bruno Ribeiro
  • Don Towsley
  • John C. S. Lui
  • Junzhou Zhao
  • Pinghui Wang
  • Xiaohong Guan

Organizations

  • Army Research Office
  • Carnegie Mellon University
  • Division of Computer and Network Systems
  • Huawei
  • Ministry of Education of the People's Republic of China
  • Ministry of Science and Technology of the People's Republic of China
  • National Natural Science Foundation of China
  • The Chinese University of Hong Kong
  • United States Army Research Laboratory
  • University of Massachusetts
  • Xi'an Jiaotong University

Tags

Fields of Study

  • Computer science

Readers

  • Graph Algorithms and Convex Optimization.
  • Parallel and Distributed Computing.
  • Regression Analysis.