Randomized Numerical Linear Algebra for Large-Scale Learning and Inference
Abstract
Randomized Numerical Linear Algebra (RandNLA) has proven to be a marriage of linear algebra and probability that provides a foundation for next-generation matrix computation in large-scale machine learning, data analysis, scientificcomputing, signal processing, and related applications. The proposed research is motivated by two complementary long-term goals: first, extend the foundations of RandNLA by tailoring randomization directly towards downstreamend goals provided by the underlying (machine learning, data analysis, signal processing, etc.) problem, rather than intermediate matrix approximations goals; and second, use the statistical and optimization insights obtained from thesedownstream applications to transform and extend the foundations of RandNLA. For both of these related long-term goals, we will take into account additional available structures in the problem, which often includes implicit and/orexplicit statistical effects, real-time and streaming issues, and implicit and/or explicit sparsity considerations. Thus, to achieve these goals, we will focus on three related research challenges: first, exploring scalability-statistical tradeoffs in RandNLA; second, exploring real-time RandNLA theory and practice; and third, exploring RandNLA for improved large-scale optimization. This complements and extends most existing work on RandNLA, which focuseson approximating intractably large matrices with cheaper, usually smaller, randomly projected counterparts. Mathematical analysis of these existing methods has focused on providing theoretical guarantees for the input data(e.g., in theoretical computer science a priori worst-case bounds, or in numerical linear algebra a posteriori bounds), as opposed to making statistical predictions on hypothesized unseen data. Moreover, many practical approaches focus onhighquality solutions, e.g., to machine precision, which is overkill for most signal processing and machine learning tasks. Instead, in our proposed research, we will exploit the fact that the randomness within many RandNLA algorithmscan provide robustness to noise in the data, which then leads to implicit statistical/regularization benefits, and that RandNLA algorithms implicitly engineer sparsity into the algorithm. Indeed, incorporating sparsity into the algorithm can lead to more scalable computations; and cruder approximations may not only be acceptable-- -they may actually be preferred to their more expensive exact counterparts. Thus, this project will particularly focus on statistical aspects of random projection and random sampling algorithms, with a focus on how they are used in machine learning and signal processing applications. While our primary focus will be on establishing a broader mathematical foundation for next generation RandNLA, we will also validate our results on several applications of relevance to the Navy, including sonar,radar, and large-scale climate science. This is a fundamental research project that is not expected to produce any developmental items. Should any developmental items result from this work they will have both civilian and militaryapplications.
Document Details
- Document Type
- DoD Grant Award
- Publication Date
- Jul 26, 2018
- Source ID
- N000141812570
Entities
People
- Michael Mahoney
Organizations
- Office of Naval Research
- United States Navy
- University of California Regents