Understanding Transformers, State Space Models and Diffusion Models for Dynamical Systems

Abstract

Compared to dense networks and classical kernel machine methods, deep convolutional neural networks (CNNs), transformers, and more recent variants such as state-space models (SSMs) and diffusion models (DMs) have achieved superior performance across various application domains, including highly challenging problems such as protein folding (e.g., AlphaFold3) and time series forecasting of dynamical systems. More recently, transformers have also been used as neural operators to forecast the future state of fluid flows and other physical and biological dynamical systems. However, it remains unclear why these architectures work so well. Are there common principles at the core of these successful neural networks? A comparative study at both the fundamental level and the performance level will lead to a better understanding of the underlying learning principles. This, in turn, will enable the development of better and more robust architectures, which could benefit a wide range of practical applications of interest to the DoD. We propose to perform this comparison for transformers, state-space models, and diffusion models, focusing on three potentially fundamental principles: sparsity, auto-regressive learning, and multi-scale learning. Our main application area is multi-scale dynamical systems, both because of the importance of this application and because the domain seems ideal for gathering unique insights into foundational models for forecasting the states of complex multi-scale systems.

Document Details

Document Type
DoD Grant Award
Publication Date
Feb 06, 2025
Source ID
FA95502410231

Entities

People

  • Mengjia Xu

Organizations

  • Air Force Office of Scientific Research
  • New Jersey Institute of Technology
  • United States Air Force

Tags

Fields of Study

  • Computer science

Readers

  • Adaptive Control and Estimation with Uncertainty in Dynamic Systems.
  • Neural Network Machine Learning.
  • Theoretical Analysis.

Technology Areas

  • AI & ML
  • AI & ML - Machine Learning Algorithms
  • AI & ML - Neural Networks
  • Space