Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis

Abstract

The demand for analyzing patterns and structures of data is growing dramatically in recent years. The study of network structure is pervasive in sociology, biology computer science, and many other disciplines. My research focuses on network and high-dimensional data analysis, using graph based models and tools from sparse optimization. The specific question about networks we are studying is "clustering": partitioning a network into cohesive groups. Depending on the contexts, these tightly connected groups can be social units, functional modules or components of an image. My work consists of both theoretical analysis and numerical simulation. We first analyze some social network and image datasets using a quality function called "modularity", which is a popular model for clustering in network science. Then we further study the modularity function from a novel perspective: with my collaborators we reformulate modularity optimization as a minimization problem of an energy functional that consists of a total variation term and an L2 balance term. By employing numerical techniques from image processing and L1 compressive sensing, such as the Merriman-Bence-Osher (MBO) scheme, we develop a variational algorithm for the minimization problem. Along a similar line of research, we work on a multi-class segmentation problem using the piecewise constant Mumford-Shah model in a graph setting. We propose an efficient algorithm for the graph version of Mumford-Shah model using the MBO scheme. Theoretical analysis is developed and a Lyapunov functional is proven to decrease as the algorithm proceeds. Furthermore, to reduce the computational cost for large datasets, we incorporate the Nystr from extension method to efficiently approximates eigenvectors of the graph Laplacian based on a small portion of the weight matrix.

Open PDF

Document Details

Document Type
Technical Report
Publication Date
Jan 01, 2015
Accession Number
ADA626218

Entities

People

  • Huiyi Hu

Organizations

  • University of California, Los Angeles

Tags

Communities of Interest

  • Energy and Power Technologies

DTIC Thesaurus Topics

  • Algorithms
  • Compressed Sensing
  • Computer Science
  • Computer Vision
  • Computers
  • Data Mining
  • Detection
  • Differential Equations
  • Dimensionality Reduction
  • Eigenvectors
  • Image Processing
  • Image Segmentation
  • Mathematics
  • Network Science
  • Partial Differential Equations
  • Signal Processing
  • Social Networks

Fields of Study

  • Computer science

Readers

  • Graph Algorithms and Convex Optimization.
  • Neural Network Machine Learning.
  • Theoretical Analysis.