Information-geometric embeddings for feature extraction in networks
Abstract
Major Goals: The goals of this proposal are to establish the theoretical foundations and develop efficient algorithms to extract features in Feature Induced Networks (FIN). This represents a family of network models where edges are drawn independently conditionally on feature variables on the vertices. The class contains the stochastic block model, a widely used model for community detection, as well as more general models that allow for edge labels and continuous feature space. Our preliminary work shows that for the special case of the stochastic block model, a new information-geometric embedding relying on a f -divergence governs the fundamental limits of feature extraction. This result provides the first necessary and sufficient condition for feature extraction in SBMs. However, the current approach is limited to finite feature spaces and requires knowledge of the model parameters. The goal of this proposal is to establish the theoretical foundations for the extraction of large feature spaces, when the model parameters are unknown. Understanding the fundamental limits of feature extraction will allow us to define a notion of ÒoptimalityÓ for algorithms, and to identify regimes where feature extraction is impossible. The following projects provide a guideline to achieve these goals: ¥ Fundamental limits of feature extraction. What are the properties of the connectivity kernel and the nodesÕ prior which ensure that some of the features can or cannot be extracted from the network topology? Which are those features that can be extracted? Is there an information-geometric embedding of such networks which reveals such fundamental limits? In particular, how can we answer these questions when k scales with n? ¥ Universality and network learning. Do we need to know the model parameters in order to solve the feature extraction problem, or is there a universal approach to feature extraction? When can we learn the model parameters from the network topology? ¥ Efficiency of algorithms. Can we obtain algorithms that have optimal performance guarantees by achieving the limits of feature extraction while being efficient? How does the complexity scale with the feature space dimension; in particular, how can we improve the efficiency of algorithms when focusing on specific subsets of features and/or nodes?
Document Details
- Document Type
- DoD Grant Award
- Publication Date
- Sep 11, 2018
- Source ID
- W911NF1610051
Entities
People
- Emmanuel Abbe
Organizations
- Army Contracting Command
- Princeton University
- United States Army