The intersection of machine learning and computational fluid dynamics (CFD) has opened new paths in scientific research, particularly in lattice quantum chromodynamics (QCD). This study focuses on advancing multilevel domain decomposition techniques to increase the scalability and efficiency of lattice QCD computations on exascale supercomputers. Lattice QCD is the pre-eminent ab initio method for solving the fundamental theory of the strong interaction, quantum chromodynamics, in the low-energy regime. Its indispensable role in exploring the precision limit of the Standard Model (SM) makes it a key player in the ongoing search for new physics.

While lattice QCD is well integrated into high-performance computing (HPC) infrastructure, challenges arise as the lattice size increases. The scalability of lattice QCD algorithms lags behind the rapid advances in supercomputing technology. This disparity stems from the predominantly bandwidth-bound nature of lattice QCD applications: modern GPU-equipped HPC clusters offer a high ratio of peak floating-point performance to memory bandwidth, and only algorithms that perform more arithmetic per byte moved can exploit it.
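
A simple roofline estimate illustrates why a bandwidth-bound kernel leaves most of the floating-point capacity idle. The sketch below assumes the commonly quoted cost of one Wilson hopping-term application (1320 floating-point operations per lattice site, with 8 gauge links and 9 spinors moved per site in single precision and no data reuse). The hardware figures are hypothetical round numbers chosen for illustration, not measurements of any particular system.

```python
# Roofline sketch: attainable performance is capped either by peak compute
# or by (arithmetic intensity) x (memory bandwidth), whichever is lower.

def arithmetic_intensity(flops_per_site, bytes_per_site):
    """Floating-point operations performed per byte of memory traffic."""
    return flops_per_site / bytes_per_site

def attainable_flops(peak_flops, bandwidth_bytes, intensity):
    """Roofline bound on sustained performance in flop/s."""
    return min(peak_flops, intensity * bandwidth_bytes)

# Assumed kernel cost (commonly quoted for the Wilson hopping term).
WILSON_FLOPS_PER_SITE = 1320
# 8 gauge links of 18 reals each, plus 8 neighbour spinors read and
# 1 spinor written, 24 reals each; no cache reuse assumed.
REALS_PER_SITE = 8 * 18 + 9 * 24
BYTES_PER_SITE_SINGLE = 4 * REALS_PER_SITE

# Hypothetical accelerator: illustrative numbers only.
PEAK_FLOPS = 20e12        # 20 Tflop/s
BANDWIDTH_BYTES = 2e12    # 2 TB/s

intensity = arithmetic_intensity(WILSON_FLOPS_PER_SITE, BYTES_PER_SITE_SINGLE)
sustained = attainable_flops(PEAK_FLOPS, BANDWIDTH_BYTES, intensity)
fraction_of_peak = sustained / PEAK_FLOPS

if __name__ == "__main__":
    print(f"arithmetic intensity: {intensity:.2f} flop/byte")
    print(f"roofline bound: {sustained / 1e12:.2f} Tflop/s "
          f"({100 * fraction_of_peak:.1f}% of peak)")
```

Under these assumptions the kernel performs slightly less than one operation per byte, so the bound sits at roughly 9% of the hypothetical peak. Raising the arithmetic intensity, rather than adding raw compute, is what moves that bound.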

To address these challenges, this study introduces a novel multilevel setup that divides the lattice into frozen and active domains. The active domains, which are updated more frequently during execution, exploit locality to decouple correlations, thereby increasing efficiency. Because the frozen domains are held fixed between these updates, the information they carry is reused across many active-domain updates, which increases statistical accuracy at little additional cost. This approach relies on the development of efficient methods for computing fermionic propagators that exploit the locality properties of the Dirac operator. In addition, the high parallelizability of the setup promises a significant improvement in the scalability of lattice QCD applications.
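
The idea of frozen and active domains can be sketched in a toy setting. The example below is an illustration under simplifying assumptions, not the project's implementation: the lattice is decomposed along the time direction only, the helper names `classify_time_slices` and `two_level_estimate` are hypothetical, and the observable is assumed to factorize exactly into a contribution from each of two active domains.

```python
from itertools import product
from statistics import fmean

def classify_time_slices(n_t, active_width, frozen_width):
    """Label each time slice 'active' or 'frozen' in a repeating pattern.

    Each period consists of `active_width` active slices followed by
    `frozen_width` frozen slices that separate neighbouring active domains.
    """
    period = active_width + frozen_width
    if n_t % period != 0:
        raise ValueError("n_t must be a multiple of active_width + frozen_width")
    return ["active" if (t % period) < active_width else "frozen"
            for t in range(n_t)]

def two_level_estimate(samples_a, samples_b):
    """Average a factorized observable A*B over all pairs of updates.

    `samples_a` and `samples_b` hold measurements from independent updates
    of two active domains at fixed frozen-domain fields. With n updates per
    domain, the average runs over n * n combinations.
    """
    pairs = [a * b for a, b in product(samples_a, samples_b)]
    return fmean(pairs)

if __name__ == "__main__":
    print(classify_time_slices(8, 3, 1))
    print(two_level_estimate([1.0, 3.0], [2.0, 4.0]))
```

With n independent updates in each of the two active domains, the average runs over n squared combinations while only 2n domain updates are generated, which is the source of the statistical gain in this simplified picture.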

The primary goal of the study is to assess the production readiness of these multilevel techniques for exascale systems, with a particular focus on their effectiveness in noise reduction. Through a pilot implementation, the project aims to achieve highly efficient lattice QCD simulations. For the first time, a block decomposition of the Dirac operator will be developed and evaluated at physical quark masses, which would mark a significant advance in the field.
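
As a generic illustration of what a block decomposition provides, consider splitting the lattice sites into two sets A and B and writing the Dirac operator in the corresponding 2x2 block form. The labels A and B and the symbol S are notation introduced here for illustration; the decomposition actually used in the project may differ. The standard Schur-complement identity then gives the propagator restricted to A:

```latex
D = \begin{pmatrix} D_{AA} & D_{AB} \\ D_{BA} & D_{BB} \end{pmatrix},
\qquad
S = D_{BB} - D_{BA}\, D_{AA}^{-1}\, D_{AB},
\qquad
\left(D^{-1}\right)_{AA}
  = D_{AA}^{-1} + D_{AA}^{-1}\, D_{AB}\, S^{-1}\, D_{BA}\, D_{AA}^{-1}.
```

The first term involves only the operator restricted to A, so it can be computed independently of B. The second term carries all the coupling between the two sets. For an operator with nearest-neighbour couplings, the off-diagonal blocks act only on sites adjacent to the boundary between A and B.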

To maximise its impact, the study will take a two-pronged approach. First, an implementation of the required operators and kernels for GPU systems will be provided, building on QUDA, a widely used open-source library for lattice QCD. This will make the developed techniques broadly accessible and applicable within the scientific community. Second, the project will focus on the development of efficient multigrid preconditioning, a critical component for running simulations at physical quark masses.

This research represents an effort to bridge the gap between lattice QCD algorithms and the computational capabilities of exascale supercomputers. The successful implementation and evaluation of multilevel domain decomposition techniques have the potential to revolutionise the efficiency and scalability of lattice QCD simulations, opening new opportunities in our understanding of the Standard Model.