Robust Feature Augmentation and Uncertainty Quantification in Data Science
Abstract
Naval missions depend critically on the ability to process, analyze, interpret, and reason about complex, high-dimensional big data in highly stochastic networked environments, and on the capability to reason reliably in the presence of delayed feedback and noisy environments. Complex and high-dimensional data are always accompanied by heavy tails and dependence among measurements. The associated statistical machine learning procedures are vulnerable to adversarial attacks. It is therefore critically important to develop robust statistical machine learning methods with great prediction power and uncertainty quantification. This proposal aims to develop robust feature augmentation and uncertainty quantification methods that take into account the stylized features of high-dimensional data. The proposal has three interrelated aims. The first aim is to extract robust features from complex high-dimensional big data for statistical learning and prediction. It plans to create features via latent factors, principal components, and marginal likelihood ratios, and to augment statistical learning and prediction with these features. The methodological power will be demonstrated on commonly used machine learning test datasets and analyzed via statistical machine learning theory, with emphasis on feature-based learning models and deep neural networks. The second aim is to develop robust methods for uncertainty quantification that take into account the dependence among high-dimensional measurements. The focus is on developing robust methods for learning latent factors and using them for downstream uncertainty quantification. In particular, the proposal plans to use factor adjustments to construct confidence intervals for linear models and generalized linear models, address model adequacy when only the original features are used, and control false discovery rates by creating factor-adjusted knockoff controls. The third aim is to develop robust high-dimensional inference for low-rank matrices. We plan to study robust noisy matrix completion, noisy phase retrieval, and uncertainty quantification based on heteroscedastic PCA. This project will develop critical and robust methods for big data analytics and address several fundamental problems in statistical machine learning. The proposed research will have significant implications for a wide range of Naval applications where it is desirable to reason about critical information content and extract actionable intelligence from limited data sources in complex, unstructured, and dynamically changing environments. These applications include autonomous systems, robust dynamic data analysis and prediction, robot perception, video surveillance, environment monitoring, real-time decision making, and uncertainty assessment.
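The first aim's idea of creating features from latent factors and principal components can be illustrated with a minimal sketch. It is not the project's method: it uses plain (non-robust) PCA on simulated heavy-tailed data, and the function name `augment_with_pca_factors`, the number of factors `k`, and the simulated data are all assumptions made for illustration. The abstract does not specify the robust estimators the project intends to develop.

```python
import numpy as np

def augment_with_pca_factors(X, k):
    """Estimate k latent factors by PCA and append them to the centered features.

    Hypothetical helper for illustration; plain PCA is not itself robust.
    Returns (F, resid, augmented).
    """
    Xc = X - X.mean(axis=0)                      # center each column
    n = Xc.shape[0]
    U_left, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(n) * U_left[:, :k]               # factor estimates, scaled so F.T @ F / n = I_k
    B = Xc.T @ F / n                             # estimated loadings (p x k)
    resid = Xc - F @ B.T                         # idiosyncratic part left after removing factors
    augmented = np.hstack([Xc, F])               # original features plus estimated factors
    return F, resid, augmented

# Simulated example (assumed data): 3 latent factors plus heavy-tailed t(3) noise.
rng = np.random.default_rng(0)
n, p, k = 200, 50, 3
factors = rng.standard_normal((n, k))
loadings = rng.standard_normal((p, k))
noise = rng.standard_t(df=3, size=(n, p))
X = factors @ loadings.T + noise

F, resid, augmented = augment_with_pca_factors(X, k)
```

The augmented matrix, or alternatively the pair of estimated factors and residuals, could then be passed to any downstream learner. A robust variant would replace the plain SVD step with a robust covariance or factor estimator.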
Document Details
- Document Type: DoD Grant Award
- Publication Date: Apr 01, 2022
- Source ID: N000142212340
Entities
People
- Jianqing Fan
Organizations
- Office of Naval Research
- Trustees of Princeton University
- United States Navy