Big Data Covariance estimation

Abstract

A wide range of statistical methods, commonly applied in the natural and social sciences and in engineering, require an estimate of the relationships existing between many variables, as described by the covariance matrix. These tasks include discovering patterns in unstructured data with principal components analysis, classifying observations with linear and quadratic discriminant analysis, and modeling a dependency network with probabilistic graphical models. They find application, for example, in gene arrays, fMRI, text retrieval, image classification, spectroscopy, climate studies, telemetry, finance and macro-economic analysis.

For small datasets, where the number of variables is much smaller than the sample size, the sample covariance matrix is a natural and efficient tool to estimate a covariance matrix. However, recent technological advances have brought an explosion of Big Data covariance estimation problems, where the matrix that needs to be estimated is quite large compared to the sample size. In this new setting the sample covariance matrix turns out to be inadequate, and alternative statistical methodologies have been and are still being developed.

The proposed method outperforms current algorithms according to different optimality and convergence criteria and under different assumptions on the eigenstructure of the population covariance matrix. Furthermore, the method overcomes the problem of selecting the rank of the low-rank approximation a posteriori (this issue is solved automatically within the minimization problem) and at the same time provides a realistic estimate of the sparsity structure in the residual matrix.
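The inadequacy of the sample covariance matrix when the number of variables is large relative to the sample size can be illustrated with a small simulation. The following is a minimal sketch and not part of the award record: the function name, the dimensions and the choice of an identity population covariance are all assumptions made for illustration, and it does not implement the proposed method.

```python
import numpy as np

def sample_covariance_diagnostics(n_samples, n_variables, seed=0):
    """Return (rank, smallest eigenvalue, largest eigenvalue) of the
    sample covariance matrix of simulated standard normal data.

    Illustrative assumption: the population covariance is the identity,
    so every true eigenvalue equals 1.
    """
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_samples, n_variables))
    # rowvar=False: rows are observations, columns are variables.
    cov = np.cov(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)  # ascending order, cov is symmetric
    rank = np.linalg.matrix_rank(cov)
    return rank, eigvals[0], eigvals[-1]

# Classical setting: few variables, many observations.
small = sample_covariance_diagnostics(n_samples=2000, n_variables=10)

# High-dimensional setting: more variables than observations.
large = sample_covariance_diagnostics(n_samples=50, n_variables=200)

print("n=2000, p=10 :", small)
print("n=50,   p=200:", large)
```

In the first setting the estimate has full rank and its eigenvalues stay close to the true value of 1. In the second, the rank cannot exceed the sample size minus one, so the estimate is singular (it cannot be inverted, as discriminant analysis and graphical models require), and the largest eigenvalue is strongly inflated.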

Document Details

Document Type: DoD Grant Award
Publication Date: Feb 06, 2017
Source ID: FA95501710103

Entities

People

Angela Montanari

Organizations

Air Force Office of Scientific Research
United States Air Force
