Islands and Bridges: Making Sense of Marked Nodes in Large Graphs

Abstract

Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these marked nodes? Are they all close-by in the graph, or are they segregated into multiple groups? How can we automatically determine how many, if any groups they form as well as find simple paths that connect the nodes in each group? We formalize the problem in terms of the Minimum Description Length principle: a set of paths is simple when we need few bits to describe each path from one node to another. For example, we want to avoid high-degree nodes, unless we need to visit many of its spokes. As such, the best partitioning requires the least number of bits to describe the paths that visit all marked nodes. We show that our formulation for finding simple paths between groups of nodes has connections to well-known other problems in graph theory, and is NP-hard. We propose fast effective solutions, and introduce DOT2DOT, an efficient algorithm for partitioning marked nodes as well as finding simple paths between nodes within parts. Experimentation shows DOT2DOT correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Aug 01, 2012
Accession Number
ADA566565

Entities

People

  • Christos Faloutsos
  • Duen H. Chau
  • Hanghang Tong
  • Jilles Vreeken
  • Leman Akoglu

Organizations

  • Carnegie Mellon University

Tags

Communities of Interest

  • Autonomy
  • Biomedical

DTIC Thesaurus Topics

  • Algorithms
  • Anomaly Detection
  • Case Studies
  • Change Detection
  • Coding
  • Computer Networks
  • Computer Science
  • Computers
  • Connectors
  • Data Mining
  • Graph Theory
  • Heuristic Methods
  • Machine Learning
  • Network Science
  • Networks
  • Social Sciences
  • Visualizations

Fields of Study

  • Computer science

Readers

  • Educational Psychology
  • Graph Algorithms and Convex Optimization.