Large-scale Heterogeneous Network Data Analysis

Abstract

A large-scale network is a powerful data structure for depicting relationships between entities. An unsupervised tensor-based mechanism that considers higher-order relational information is proposed to model the complex semantics of nodes. Signature profiles are derived as a vector-based representation that enables further mining algorithms. Based on this model, solutions to three critical issues in heterogeneous networks are presented. First, different aspects of central individuals are identified through three proposed measures: contribution-based, diversity-based, and similarity-based centrality. Second, a role-based clustering method is proposed to identify nodes that play similar roles in the network. Third, to facilitate further exploration and visualization of complex network data, egocentric information abstraction is devised, and three abstraction criteria are proposed to distill representative and significant information with respect to any given node. The evaluations are conducted on a real-world movie dataset and an artificial crime dataset. The proposed centralities and role-based clustering yield meaningful results on these datasets. The effectiveness of the egocentric abstraction is shown by its support for more accurate, efficient, and confident crime detection by human subjects.
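
The idea of a vector-based signature profile, and of comparing nodes by the roles they play, can be illustrated with a minimal sketch. This is not the report's tensor formulation, which this record does not describe. It assumes a hypothetical toy network of typed edges, builds each node's profile by counting the relation-type sequences (up to length 2) that start at that node, and compares profiles with cosine similarity. All names here (EDGES, signature_profile, cosine) are illustrative assumptions.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy heterogeneous network: (source, relation, target) triples.
EDGES = [
    ("alice", "directed", "movie1"),
    ("bob", "directed", "movie2"),
    ("carol", "acted_in", "movie1"),
    ("carol", "acted_in", "movie2"),
    ("dave", "acted_in", "movie2"),
]

def neighbors(node):
    """Yield (relation, other_node) pairs; reversed edges get a '^-1' suffix."""
    for src, rel, dst in EDGES:
        if src == node:
            yield rel, dst
        elif dst == node:
            yield rel + "^-1", src

def signature_profile(node, max_len=2):
    """Count relation-type sequences of length <= max_len starting at node.

    Assumption for illustration: simple paths only (no node revisited).
    """
    counts = Counter()
    frontier = [((), node, (node,))]
    for _ in range(max_len):
        next_frontier = []
        for path, current, visited in frontier:
            for rel, other in neighbors(current):
                if other in visited:
                    continue
                new_path = path + (rel,)
                counts[new_path] += 1
                next_frontier.append((new_path, other, visited + (other,)))
        frontier = next_frontier
    return counts

def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

alice = signature_profile("alice")
bob = signature_profile("bob")
dave = signature_profile("dave")
```

In this toy data, the two directors (alice and bob) share relation-type sequences and so score a higher cosine similarity with each other than either does with the actor dave, which is the kind of signal a role-based clustering step could group on.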

Document Details

Document Type
Technical Report
Publication Date
Jul 31, 2012
Accession Number
ADA566334

Entities

People

  • Shou-De Lin

Organizations

  • National Taiwan University

Tags

Communities of Interest

  • Human Systems

DTIC Thesaurus Topics

  • Accuracy
  • Algorithms
  • Clustering
  • Data Analysis
  • Data Mining
  • Detection
  • Electronic Mail
  • Equations
  • Heterogeneous Networks
  • Networks
  • Pattern Recognition
  • Probability
  • Random Variables
  • Semantics
  • Social Networks
  • Test And Evaluation
  • Visualizations

Fields of Study

  • Computer science

Readers

  • Computer Vision
  • Distributed Systems and Data Platform Development
  • Neural Network Machine Learning