Highlights

  • Developed FLOWGEN, a novel framework for fast, online training of surrogate models for 3D CFD simulations up to exascale performance
  • Successfully coupled a fully differentiable CFD solver (JAX-Fluids) with deep learning models (FNO and U-Net) using ADIOS2 for high-performance data streaming
  • Demonstrated online training without data storage, reducing memory and I/O bottlenecks typical in large-scale CFD workflows.
  • Benchmarked performance of FNO and U-Net in both offline and online settings, identifying the limitations of auto-regressive predictions in long-term flow simulations
  • Enabled real-time surrogate modeling for applications in automotive design, aerospace optimization, and adaptive manufacturing systems
  • Established a collaborative, cross-border demonstrator involving CERFACS (France) and Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany), combining AI, HPC, and CFD expertise

Figure 1: Online learning framework

Challenge

In online learning, changing data statistics (commonly referred to as concept drift) can bias the model toward recently observed features, because each training sample is processed only once. This non-stationary data distribution impairs the model’s ability to generalize, often resulting in degraded performance compared to offline training.

Standard optimization techniques like stochastic gradient descent (SGD) assume stationary loss landscapes, making them less effective in such evolving scenarios and leading to erratic weight updates. Additional challenges arise from the need to estimate normalization statistics on the fly, which often results in unstable training dynamics. Deep models also suffer from vanishing gradients and feature diminishment when learning sequentially, as backpropagated signals weaken over time without reinforcement. The lack of validation phases further complicates model selection and architecture tuning, making adaptive depth estimation a necessity.
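As an illustration of estimating normalization statistics in a single pass, the following minimal sketch uses Welford's algorithm on a stream of scalar values. It is not taken from the FLOWGEN code; the class name `RunningStats` and the sample values are assumptions made for this example.

```python
class RunningStats:
    """Welford's algorithm: single-pass mean and variance over a stream."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new sample into the statistics without storing it."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance; 0.0 until at least one sample has been seen.
        return self.m2 / self.count if self.count > 0 else 0.0

    def normalize(self, x, eps=1e-8):
        """Standardize x with the statistics seen so far."""
        return (x - self.mean) / (self.variance + eps) ** 0.5


# Hypothetical stream of values (illustrative only).
stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)

print(stats.mean, stats.variance, stats.normalize(9.0))
```

Early in the stream the estimates are based on few samples, which is one plausible source of the instability described above.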

The U-Net and Fourier Neural Operator (FNO) architectures exhibit limitations in capturing long-term flow dynamics, regardless of the training strategy. Particularly under auto-regressive training schemes, where models are trained on their own predictions, error accumulation across time steps often leads to divergence from physically accurate flow fields.
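The compounding of errors in an auto-regressive rollout can be seen in a toy setting. The sketch below is a deliberately simplified stand-in, not a CFD model: `reference_step` and `surrogate_step` are hypothetical scalar maps that differ by about 1% per step, and the rollout feeds each prediction back in as the next input.

```python
def rollout(step, x0, n_steps):
    """Apply `step` repeatedly, feeding each output back in as the next input."""
    states = [x0]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


def reference_step(x):
    # Hypothetical "true" dynamics: slow decay.
    return 0.99 * x


def surrogate_step(x):
    # Hypothetical surrogate with a small one-step error (about 1%).
    return 1.00 * x


reference = rollout(reference_step, 1.0, 50)
predicted = rollout(surrogate_step, 1.0, 50)

# Relative error of the surrogate trajectory at every step.
errors = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]

print(f"relative error after 1 step:   {errors[1]:.4f}")
print(f"relative error after 50 steps: {errors[50]:.4f}")
```

Even though each individual step is nearly correct, the relative error grows with every step because the surrogate never sees the reference state again after the initial condition.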

Collectively, these challenges underscore the complexity of deploying surrogate models for CFD tasks in dynamic, real-time environments.


Research Topic

CFD simulations play a pivotal role in modern scientific developments; however, they are computationally expensive for complex flow scenarios. Machine-learning-based CFD approaches, including surrogate models, promise a faster alternative.

The amount of data produced by CFD simulations to train a surrogate model can easily become massive, creating challenges for data storage and transfer.

In the FLOWGEN study, we introduced an On-The-Fly Training framework, where a fully differentiable solver is coupled with an ML-based surrogate model, thereby removing the requirement for data storage.
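The idea of training without storing data can be sketched with a generator that hands each sample to the learner exactly once. This is a toy illustration under stated assumptions: `solver_stream` is a hypothetical stand-in for the solver and streaming layer (it does not use JAX-Fluids or ADIOS2), and the "surrogate" is a single scalar weight fitted by SGD.

```python
import random


def solver_stream(n_steps, seed=0):
    """Hypothetical stand-in for a running solver: yields one sample at a time."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        x = rng.uniform(-1.0, 1.0)
        yield x, 3.0 * x  # toy input/target pair; nothing is written to disk


def train_online(stream, learning_rate=0.1):
    """Single-pass SGD: each sample is used for one update, then discarded."""
    weight = 0.0
    seen = 0
    for x, target in stream:
        error = weight * x - target
        weight -= learning_rate * error * x  # gradient of 0.5 * error**2
        seen += 1
    return weight, seen


weight, seen = train_online(solver_stream(2000))
print(f"samples consumed: {seen}, learned weight: {weight:.4f}")
```

Only the current sample and the model parameters are held in memory at any time, which is the property the on-the-fly approach relies on.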


Solution

Several adaptive strategies and algorithmic innovations have been proposed to address the online-learning challenges identified in our study.

Elastic Weight Consolidation (EWC) preserves knowledge from previously seen data by penalizing updates to critical parameters, thus alleviating catastrophic forgetting.
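A minimal sketch of the EWC penalty, assuming a diagonal Fisher information estimate as the measure of parameter importance: the penalty is (strength / 2) * sum_i F_i * (theta_i - theta_star_i)^2, added to the loss on new data. The function names and the numbers below are illustrative assumptions, not values from the study.

```python
def ewc_penalty(params, anchor_params, fisher_diag, strength):
    """Quadratic penalty for moving away from parameters learned on earlier data."""
    return 0.5 * strength * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_penalty_grad(params, anchor_params, fisher_diag, strength):
    """Gradient of the penalty, to be added to the gradient of the task loss."""
    return [
        strength * f * (p - a) for p, a, f in zip(params, anchor_params, fisher_diag)
    ]


# Hypothetical two-parameter model: the first parameter is "important"
# (large Fisher value), the second is not.
params = [1.0, 2.0]
anchor_params = [0.0, 2.5]
fisher_diag = [4.0, 0.1]

penalty = ewc_penalty(params, anchor_params, fisher_diag, strength=2.0)
grad = ewc_penalty_grad(params, anchor_params, fisher_diag, strength=2.0)
print(penalty, grad)
```

Parameters with a large Fisher value are pulled back strongly toward their earlier values, while unimportant ones remain free to adapt to new data.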

Replay buffers and reservoir sampling maintain a representative subset of past data, improving stability in non-stationary environments.
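Reservoir sampling can be sketched as follows (Algorithm R). The buffer keeps a fixed number of items, and every item of the stream ends up in it with equal probability, without knowing the stream length in advance. The function name, capacity, and integer stream are assumptions for illustration.

```python
import random


def reservoir_sample(stream, capacity, rng):
    """Keep a uniform random subset of `capacity` items from a stream (Algorithm R)."""
    buffer = []
    for index, item in enumerate(stream):
        if index < capacity:
            # Fill the buffer with the first `capacity` items.
            buffer.append(item)
        else:
            # Replace a random slot with probability capacity / (index + 1).
            slot = rng.randint(0, index)  # both bounds inclusive
            if slot < capacity:
                buffer[slot] = item
    return buffer


rng = random.Random(0)
replay_buffer = reservoir_sample(range(1000), capacity=10, rng=rng)
print(replay_buffer)
```

In a replay setting, mini-batches drawn from such a buffer would be mixed with freshly streamed samples so that older flow states continue to contribute to the gradient.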

Hedge Backpropagation (HBP) is a promising direction to address the challenges of traditional gradient algorithms by adaptively adjusting the contributions of different network depths during training. HBP mitigates the issue of unknown optimal depth by allowing shallow layers to make early predictions while progressively incorporating deeper representations, thereby enhancing robustness against concept drift and reducing reliance on static architecture choices.
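The depth-weighting step of HBP can be sketched as a Hedge update: each depth has its own output head with weight alpha, the overall prediction is the alpha-weighted sum of the head predictions, and after each sample every alpha is multiplied by beta raised to that head's loss, floored, and renormalized. The sketch below covers only this weight update, not the network or its gradient step; the per-head losses, `beta`, and `smoothing` values are illustrative assumptions.

```python
def hedge_update(alphas, losses, beta=0.99, smoothing=0.1):
    """One Hedge step on the per-depth weights.

    Heads with a larger loss are discounted more strongly; the floor
    (smoothing / number of heads) keeps every head from vanishing entirely.
    """
    n_heads = len(alphas)
    updated = [a * beta ** loss for a, loss in zip(alphas, losses)]
    floor = smoothing / n_heads
    updated = [max(a, floor) for a in updated]
    total = sum(updated)
    return [a / total for a in updated]


def combined_prediction(alphas, head_predictions):
    """Overall output: weighted sum of the predictions made at each depth."""
    return sum(a * p for a, p in zip(alphas, head_predictions))


# Three hypothetical heads (shallow to deep), starting from uniform weights.
alphas = [1.0 / 3.0] * 3
for _ in range(200):
    # Illustrative fixed losses: here the shallow head happens to perform best.
    alphas = hedge_update(alphas, losses=[0.1, 1.0, 2.0])

print(alphas)
print(combined_prediction(alphas, head_predictions=[1.0, 2.0, 3.0]))
```

Because the weights follow the observed losses, the effective depth of the model can shift over time if deeper heads start to perform better as more data arrives.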


Figure 2: Fourier Neural Operator