The fear of missing out on information for reduced order modeling: A method for healthier data and model selection
Abstract
Misleading or unnecessary data can have an outsized impact on the health and accuracy of machine learning (ML) for reduced order models. At the same time, different data may carry different importance, especially for dynamical systems with extreme events, i.e., regimes of qualitatively different behavior. We propose a Bayesian sequential selection method, akin to Bayesian experimental design, that identifies critically important information within a dataset while ignoring data that is either misleading or adds unnecessary complexity to the surrogate model of choice. We propose the use of adaptive model selection criteria that probabilistically detect data related to non-trivial (i.e., rare) regimes and place greater emphasis on the relevant data points. Specifically, the proposed approach aims to (i) significantly accelerate the convergence of the reduced order model with respect to the number of data points employed, and (ii) eliminate the phenomenon of double descent, in which more data leads to worse performance.
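To make the idea of sequential selection with emphasis on rare regimes more concrete, the following is a minimal sketch and not the method of this award. It assumes a polynomial Bayesian linear regression as the surrogate, and it scores each candidate point by the surrogate's predictive variance multiplied by an inverse-density weight on the observed outputs, so that rare output values rank higher. All function names, the feature map, the hyperparameters `alpha` and `beta`, and the synthetic data are hypothetical choices made for illustration. The sketch does not attempt to detect or reject misleading data.

```python
import numpy as np

def features(x, degree=3):
    # Polynomial feature map (an assumed surrogate basis, chosen for illustration).
    return np.vander(x, degree + 1, increasing=True)

def posterior_cov(Phi, alpha=1.0, beta=25.0):
    # Posterior covariance of Bayesian linear regression weights:
    # prior N(0, I / alpha), Gaussian noise with precision beta.
    A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    return np.linalg.inv(A)

def rarity_weights(y, bins=20):
    # Inverse histogram density of the outputs: rare values get larger weights.
    hist, edges = np.histogram(y, bins=bins, density=True)
    idx = np.digitize(y, edges[1:-1])  # bin index in 0 .. bins-1
    return 1.0 / (hist[idx] + 1e-12)

def select(x, y, n_select=8, n_init=2, seed=0):
    # Greedy sequential selection from an existing dataset (x, y).
    rng = np.random.default_rng(seed)
    Phi = features(x)
    w = rarity_weights(y)
    chosen = [int(i) for i in rng.choice(len(x), size=n_init, replace=False)]
    while len(chosen) < n_select:
        cov = posterior_cov(Phi[chosen])
        # Predictive variance of the surrogate mean at every candidate point.
        var = np.einsum("ij,jk,ik->i", Phi, cov, Phi)
        score = var * w
        score[chosen] = -np.inf  # never pick the same point twice
        chosen.append(int(np.argmax(score)))
    return chosen

# Synthetic dataset with a narrow, rare spike near x = 0.8 (illustrative only).
x = np.linspace(-1.0, 1.0, 200)
y = 0.1 * x + np.exp(-((x - 0.8) / 0.05) ** 2)
chosen = select(x, y)
print(sorted(chosen))
```

In this sketch the variance term favors poorly constrained regions of the input space, while the inverse-density weight favors points whose outputs are uncommon in the dataset. Other surrogates or acquisition criteria could be substituted without changing the overall loop.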
Document Details
- Document Type: DoD Grant Award
- Publication Date: Mar 06, 2024
- Source ID: FA95502310517
Entities
People
- Themistoklis Sapsis
Organizations
- Air Force Office of Scientific Research
- Massachusetts Institute of Technology
- United States Air Force