Understanding Transformers, State Space Models and Diffusion Models for Dynamical Systems

Abstract

Compared to dense networks and classical kernel machine methods, deep convolutional neural networks (CNNs), transformers, and more recent variants such as state-space models (SSMs) and diffusion models (DMs) have achieved superior performance across various application domains, including highly challenging problems such as protein folding (e.g., AlphaFold3) and time series forecasting of dynamical systems. More recently, transformers have also been used as neural operators to forecast the future state of fluid flows and other physical and biological dynamical systems. However, it remains unclear why these architectures work so well. Are there common principles at the core of these successful neural networks? A comparative study at both the fundamental level and the performance level will lead to a better understanding of the underlying learning principles. This, in turn, will enable the development of better and more robust architectures, which could benefit a wide range of practical applications of interest to the DoD. We propose to perform this comparison for transformers, state-space models, and diffusion models, focusing on three potentially fundamental principles: sparsity, auto-regressive learning, and multi-scale learning. Our main application area is multi-scale dynamical systems, both because of the importance of this application and because the domain seems ideal for gathering unique insights into foundational models for forecasting the states of complex multi-scale systems.

Document Details

Document Type: DoD Grant Award
Publication Date: Feb 06, 2025
Source ID: FA95502410231

Entities

People

Mengjia Xu

Organizations

Air Force Office of Scientific Research
New Jersey Institute of Technology
United States Air Force
