The fear of missing out on information for reduced order modeling: A method for healthier data and model selection

Abstract

Misleading or unnecessary data can have an outsized impact on the health and accuracy of machine learning (ML) for reduced order models. At the same time, not all data are equally important, especially for dynamical systems with extreme events, i.e., regimes in which the system behaves differently. We propose a Bayesian sequential selection method, akin to Bayesian experimental design, that identifies critically important information within a dataset while ignoring data that is either misleading or adds unnecessary complexity to the surrogate model of choice. We propose the use of adaptive model selection criteria that probabilistically detect data related to non-trivial (i.e., rare) regimes and place greater emphasis on the relevant data points. Specifically, the proposed idea aims to i) significantly accelerate the convergence of the reduced order model with respect to the number of data points employed, and ii) eliminate the phenomenon of double descent, where more data leads to worse performance.
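
The general idea of sequential, information-driven data selection can be illustrated with a small sketch. The example below is not the method proposed under this award, whose selection criteria are not specified in the abstract. It assumes a Bayesian linear surrogate with polynomial features and a Gaussian prior, and it scores each candidate point by its expected information gain about the weights, 0.5 * log(1 + phi^T S phi / noise_var), which is exact for a linear-Gaussian model. The names `features`, `sequential_select`, `noise_var`, and `prior_var`, and the synthetic data, are all hypothetical choices made for illustration.

```python
import numpy as np

def features(x, degree=3):
    # Hypothetical surrogate: polynomial features [1, x, x^2, ..., x^degree].
    return np.vander(x, degree + 1, increasing=True)

def sequential_select(Phi_pool, y_pool, n_select, noise_var=0.01, prior_var=10.0):
    """Greedily pick the points with the largest expected information gain.

    Assumes a Bayesian linear model y = phi @ w + noise with prior
    w ~ N(0, prior_var * I) and Gaussian noise of variance noise_var.
    """
    n, d = Phi_pool.shape
    S = prior_var * np.eye(d)   # posterior covariance of the weights
    m = np.zeros(d)             # posterior mean of the weights
    remaining = list(range(n))
    chosen = []
    for _ in range(n_select):
        Phi = Phi_pool[remaining]
        # Predictive variance of the model at each remaining candidate.
        pred_var = np.einsum("ij,jk,ik->i", Phi, S, Phi)
        # Expected information gain about w from observing each candidate.
        gain = 0.5 * np.log1p(pred_var / noise_var)
        idx = remaining.pop(int(np.argmax(gain)))
        phi = Phi_pool[idx]
        # Rank-one Bayesian update of the posterior with the selected point.
        S_phi = S @ phi
        denom = noise_var + phi @ S_phi
        m = m + S_phi * (y_pool[idx] - phi @ m) / denom
        S = S - np.outer(S_phi, S_phi) / denom
        chosen.append(idx)
    return chosen, m, S

# Synthetic illustration only: noisy samples of a smooth function.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(200)

chosen, m, S = sequential_select(features(x), y, n_select=10)
print("selected indices:", chosen)
print("posterior mean of weights:", m)
```

In this sketch the information gain depends only on the input locations, so it cannot by itself flag misleading observations or rare regimes; an output-aware criterion, such as the adaptive model selection criteria mentioned in the abstract, would be needed for that.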

Document Details

Document Type
DoD Grant Award
Publication Date
Mar 06, 2024
Source ID
FA95502310517

Entities

People

  • Themistoklis Sapsis

Organizations

  • Air Force Office of Scientific Research
  • Massachusetts Institute of Technology
  • United States Air Force

Tags

Fields of Study

  • Computer science

Readers

  • Calculus or Mathematical Analysis
  • Regression Analysis
  • Systems Analysis and Design

Technology Areas

  • AI & ML
  • AI & ML - Bayesian Inference
  • AI & ML - Neural Networks