Study Results

NeuralPinT delivered a software stack that couples the pySDC library for parallelization-in-time (PinT), the Dedalus spectral methods framework, and the PyTorch machine learning framework. A stand-alone GPU solver was also implemented so that the numerical solver component can run on GPUs. A physics-informed variant of the Fourier Neural Operator (FNO), called the Chebyshev Fourier Neural Operator (CFNO), was developed to deliver accurate solutions of the Rayleigh-Bénard convection (RBC) equations. Benchmarks for Rayleigh-Bénard convection on the JURECA system demonstrated that PinT through parallel spectral deferred corrections (SDC) reduced time-to-solution by a factor of two. Despite the accuracy of the FNO, issues arising from noise prevented it from accelerating the convergence of the parallel SDC method, so it delivered no additional speedup.
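
As a rough illustration of the building block behind a Fourier Neural Operator, the sketch below applies a single spectral convolution to a periodic 1D signal: transform to Fourier space, keep only the lowest modes, scale them by weights, and transform back. It is not the CFNO developed in the project. The function name `fourier_layer`, the use of NumPy instead of PyTorch, and the scalar per-mode weights are assumptions made to keep the example short; a real FNO layer also mixes channels, adds a pointwise linear path, and applies a nonlinearity.

```python
import numpy as np


def fourier_layer(u, weights, n_modes):
    """Apply one simplified spectral convolution to a real, periodic 1D signal.

    Hypothetical helper for illustration only: `weights` holds one (possibly
    complex) factor per retained Fourier mode and would be learned in training.
    """
    u_hat = np.fft.rfft(u)                      # forward real FFT
    out_hat = np.zeros_like(u_hat)              # all higher modes are dropped
    out_hat[:n_modes] = weights[:n_modes] * u_hat[:n_modes]
    return np.fft.irfft(out_hat, n=len(u))      # back to physical space


# Example input: a smooth mode plus a higher-frequency component.
x = 2.0 * np.pi * np.arange(64) / 64
signal = np.sin(x) + 0.5 * np.sin(10 * x)

# With unit weights and 4 retained modes, the layer acts as a low-pass filter.
filtered = fourier_layer(signal, np.ones(4, dtype=complex), n_modes=4)
```

Truncating to a few low modes is what keeps the number of trainable weights independent of the grid resolution.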

The numerical solvers used parallel SDC for time integration, spectral methods for spatial discretization, and Fourier Neural Operators for the machine learning components. In terms of software, pySDC provided the parallel SDC algorithm, Dedalus provided the spectral spatial discretization and parallelization, and PyTorch served as the framework for implementing the Fourier Neural Operator. Our stand-alone GPU implementation of the numerical part uses the same spectral method approach as Dedalus but is written with the CuPy Python library so that it runs on GPUs.
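
To make the idea of parallel SDC more concrete, here is a minimal sketch for the scalar test equation u' = λu over one time step. It does not use the pySDC API. The names `collocation_matrix`, `parallel_sdc_step`, and `RADAU_NODES` are hypothetical, and the diagonal preconditioner (the node values placed on the diagonal, similar to an implicit Euler step to each node) is just one simple choice assumed here. Because the preconditioner is diagonal, every collocation node is updated independently within a sweep, which is what allows the nodes to be processed in parallel.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Three Radau IIA collocation nodes on the unit interval (right endpoint included).
RADAU_NODES = np.array([(4 - 6 ** 0.5) / 10, (4 + 6 ** 0.5) / 10, 1.0])


def collocation_matrix(nodes):
    """Q[m, j] is the integral of the j-th Lagrange polynomial from 0 to nodes[m]."""
    n = len(nodes)
    Q = np.zeros((n, n))
    for j in range(n):
        others = np.delete(nodes, j)
        # Lagrange polynomial l_j as coefficients, lowest degree first.
        coeffs = P.polyfromroots(others) / np.prod(nodes[j] - others)
        antiderivative = P.polyint(coeffs)  # vanishes at 0
        Q[:, j] = P.polyval(nodes, antiderivative)
    return Q


def parallel_sdc_step(lam, dt, u0, nodes, sweeps):
    """Advance u' = lam * u by one step of size dt using diagonal SDC sweeps."""
    Q = collocation_matrix(nodes)
    d = nodes.copy()                      # diagonal of the preconditioner (assumed choice)
    u = np.full(len(nodes), float(u0))    # initial guess: copy u0 to all nodes
    for _ in range(sweeps):
        # The right-hand side only uses values from the previous sweep, so each
        # entry of the update below could be computed by a separate process.
        rhs = u0 + dt * lam * ((Q - np.diag(d)) @ u)
        u = rhs / (1.0 - dt * lam * d)
    return u[-1]                          # last node coincides with the end of the step


u_end = parallel_sdc_step(lam=-1.0, dt=0.1, u0=1.0, nodes=RADAU_NODES, sweeps=20)
print(abs(u_end - np.exp(-0.1)))          # error against the exact solution
```

The iteration converges to the solution of the collocation problem; a better initial guess at the nodes, for example from a neural operator, is the kind of ingredient that could in principle reduce the number of sweeps.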


Benefits

Rayleigh-Bénard convection is a highly relevant benchmark for various applications in geophysical fluid dynamics, for example climate simulations or simulations of Earth’s dynamo. Such simulations struggle with long time-to-solution because of the extremely challenging nature of the studied systems. Effective parallelization-in-time (PinT) could eventually be integrated into domain science codes to increase scaling on HPC systems and reduce runtimes. NeuralPinT provides software, algorithms, and results that domain scientists can use and build on to incorporate PinT into their operational models.


Partners

Hamburg University of Technology (TUHH). Founded in 1978, TUHH is one of Germany’s youngest technical universities. It combines internationally recognised, interdisciplinary research with strong collaborations with industry and other stakeholders, with a focus on “Engineering to face climate change”.
Forschungszentrum Jülich GmbH (FZJ) is one of Europe’s largest interdisciplinary research centres and focuses on advanced technology and research in various disciplines. As a member of the Helmholtz Association, FZJ is part of Germany’s largest scientific organisation. The Jülich Supercomputing Centre is part of FZJ and currently hosts JUWELS, Europe’s fourth-fastest supercomputer (Top500 list of May 2023), which includes almost 4,000 Nvidia GPUs. JUPITER, Europe’s first Exascale computer, is currently under development and is scheduled to be installed at JSC in 2024.

Team

  • Thibaut Lunet
  • Sebastian Götschel
  • Daniel Ruprecht
  • Chelsea John
  • Andreas Herten
  • Stefan Kesselheim
  • Thomas Baumann

Contact

Name: Prof Daniel Ruprecht

Institution: Hamburg University of Technology (TUHH)

Email Address: ruprecht@tuhh.de