Combinatorial Statistical Inference with Mathematical Optimization
Abstract
The proliferation of data from complex, heterogeneous sources, in varying shapes, forms and sizes, has led to a burning need for principled tools in statistical inference to assimilate, understand and extract meaningful information from data. The current proposal aims to make new methodological contributions towards a better statistical and computational understanding of some key inferential tasks that are inherently combinatorial in nature. We address problems in: high-dimensional regression using new techniques in computational discrete optimization; sparse modeling in the face of data uncertainty; richly structured combinatorial models; nonparametric function estimation with non-convex shape constraints; and large-scale computational algorithms relying on techniques from statistical dimensionality reduction and mathematical optimization. Conducting rigorous combinatorial statistical inference on these problems with appropriate practical computational tools is widely acknowledged as challenging. Indeed, the set of disciplined tools available in a data scientist's toolkit for these tasks is seriously limited, and enriching them via new perspectives is a main goal of our proposal. We present a new paradigm to address these problems using foundational principles spanning the fields of statistics and mathematical optimization (convex, robust and mixed integer optimization), leading to a multidisciplinary approach that is more powerful than a single methodological perspective. An important thrust of our proposed research lies in leveraging the astonishing computational advances in mixed integer optimization over the past 10 years in order to create new principled methods in statistical computing. Indeed, the systematic use of modern integer optimization methods in computational statistics seems to be in its infancy, especially when compared to the significant impact made by convex optimization methods in the wider statistics community.
Our vision is to significantly advance the state of the art in performing rigorous statistical inference with discrete statistical objects and to create new methods essential for the modern-day data scientist. If successful, we will create new methods appropriate for off-line combinatorial inferential tasks involving datasets much larger than what is currently considered tractable. Our framework will provide quantifiable certificates of optimality for large-scale, structured nonconvex optimization methods associated with these learning tasks. PhD students will be involved in the research work. New techniques and methods developed as a part of this research will be included in graduate school curricula to train the next generation of scientists, engineers and statisticians. The Office of Naval Research is interested in creating new methods that involve processing large and small datasets of different types (image, text, documents, networks, etc.) in times relevant for the application. A mission of ONR is to advance basic scientific research at the interface of mathematics, statistics, machine learning and related disciplines so that the tools created can better inform complex decision-making processes, especially under uncertainty. The current proposal will study fundamental methods that play a crucial role in analyzing data from varied sources. Our research is poised to offer a fresh perspective on many problems that are usually approached using computationally friendlier, often ad hoc alternatives. This will address the Navy and Department of Defense needs in statistical computation and modeling that play a crucial role in enabling automatic, robust, accurate and rapid decision-making.
Document Details
- Document Type: DoD Grant Award
- Publication Date: Jul 10, 2018
- Source ID: N000141812298
Entities
People
- Rahul Mazumder
Organizations
- Massachusetts Institute of Technology
- Office of Naval Research
- United States Navy