

Study Results
The project prototyped several algorithms for compressing and reconstructing Vlasiator datasets, of which two were selected for further assessment and implementation: a Multi-Layer Perceptron (MLP) with Fourier features and an Octree multiresolution scheme based on the Tucker decomposition. The MLP is a neural network approach to compression that exploits GPU resources efficiently, whereas Octree compression is more efficient on CPUs. Small-scale testing showed that both methods can store and reconstruct Vlasiator plasma simulation data at sufficient quality to recover physical simulations from lossy data, and that they achieve this with better compression than an industry-standard method. The MLP method and its TinyAI engine were optimized to run on both AMD and NVIDIA GPU hardware in supercomputing environments (LUMI-G), demonstrating simultaneous compression of multi-terabyte data to tens of gigabytes on thousands of GPUs in minutes, although further optimization remains.
Vlasiator [DOI|GitHub], the primary target application for the developed methods, is the state-of-the-art space plasma physics simulation software for Earth’s magnetosphere. The simulation is based on solving the Vlasov equation for plasma ions, providing an unprecedented, noise-free description of the geospace.
Asterix [DOI|GitHub], the main result of this project, is a custom Multi-Layer Perceptron (Tiny AI) for large-scale compression of multidimensional scalar fields. The Asterix library can be deployed on both CPUs and GPUs (with GPUs necessary for optimal performance), and on both AMD and NVIDIA hardware.
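As a rough illustration of the general idea behind compressing a scalar field with an MLP on Fourier features, the NumPy sketch below maps grid coordinates to sine/cosine features and evaluates a small, untrained network on them. This is not the Asterix API: the function names (`fourier_features`, `init_mlp`, `mlp_forward`), the layer sizes, the frequency scale, and the 20 x 20 x 20 grid are all hypothetical choices made for this example. Training is omitted. The point is only that, once fitted, the stored object would be the network weights rather than the per-voxel values.

```python
import numpy as np

def fourier_features(coords, freq_matrix):
    """Map coordinates (n, d) to features (n, 2*m): [sin(2*pi*x@B), cos(2*pi*x@B)]."""
    proj = 2.0 * np.pi * coords @ freq_matrix
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=1)

def init_mlp(rng, layer_sizes):
    """Create (weight, bias) pairs for a fully connected network (hypothetical sizes)."""
    return [
        (rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def mlp_forward(params, features):
    """Evaluate the network: ReLU on hidden layers, linear output layer."""
    h = features
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = params[-1]
    return h @ w + b

def parameter_count(params):
    """Number of stored floats in the network weights and biases."""
    return sum(w.size + b.size for w, b in params)

rng = np.random.default_rng(0)
n_freq = 16                                            # assumed number of frequencies
freq_matrix = rng.normal(0.0, 4.0, size=(3, n_freq))   # fixed, not trained
params = init_mlp(rng, [2 * n_freq, 32, 32, 1])

# Coordinates of an assumed 20^3 grid, normalized to [0, 1].
axis = np.linspace(0.0, 1.0, 20)
coords = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)

values = mlp_forward(params, fourier_features(coords, freq_matrix))
n_params = parameter_count(params)
n_voxels = coords.shape[0]
print(f"{n_voxels} voxels represented by {n_params} network parameters")
```

In this toy setup the network is smaller than the grid it represents. Whether a network of a given size reproduces a field accurately enough depends on training and on the data, which this sketch does not address.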
T-Octree [DOI|GitHub] is the fallback voxel compression method developed by this project for the same purpose. T-Octree is a CPU-only library that uses a multiresolution approach with Tucker decomposition.
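The Tucker decomposition mentioned above can be sketched with a truncated higher-order SVD on a single 3D block. This is a minimal NumPy illustration, not the T-Octree implementation: the helper names (`unfold`, `mode_product`, `tucker_hosvd`, `tucker_reconstruct`), the 16 x 16 x 16 test field, and the ranks are assumptions made for the example, and the octree multiresolution subdivision is not shown.

```python
import numpy as np

def unfold(tensor, mode):
    """Flatten a tensor into a matrix whose rows index the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply the tensor by a matrix along one mode."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def tucker_hosvd(tensor, ranks):
    """Truncated higher-order SVD: returns a small core and one factor per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def tucker_reconstruct(core, factors):
    """Expand the core back to the full grid using the factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Assumed test data: two Gaussian blobs on a 16^3 grid (smooth and low-rank).
axis = np.linspace(-3.0, 3.0, 16)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
field = np.exp(-(x**2 + y**2 + z**2)) + 0.5 * np.exp(-((x - 1) ** 2 + y**2 + (z + 1) ** 2))

core, factors = tucker_hosvd(field, ranks=(4, 4, 4))
recon = tucker_reconstruct(core, factors)

stored = core.size + sum(u.size for u in factors)
rel_error = np.linalg.norm(field - recon) / np.linalg.norm(field)
print(f"stored {stored} of {field.size} values, relative error {rel_error:.2e}")
```

The test field here is a sum of two separable terms, so a small core reproduces it almost exactly. Real distribution functions are not exactly low-rank, and the chosen ranks then trade accuracy against storage.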
Benefits
The Asterix library enables good-quality simulation recovery from lossily compressed data for state-of-the-art Vlasov plasma simulations, providing resilience against node failures and allowing frequent checkpoints from which simulations can continue after catastrophic failures. The capability to perform more, longer, and larger Vlasov simulations while wasting fewer resources on node failures will help in understanding the effects of space weather and solar storms on modern infrastructure.
The compressed data format increases the amount of exploitable data generated by the simulations by orders of magnitude, mitigating the storage space bottleneck that currently limits the data output that can be used for analysis.
Further, this project serves as a proof of concept for employing AI-based, compact representations instead of memory-intensive, discretized 6D distribution functions, paving the way for novel methods for simulating kinetic plasma physics and other high-dimensional partial differential equations, such as the related Boltzmann equation.
Partners
| University of Helsinki (UH) was the host institution of the ASTERIX project. The aim of ASTERIX was to provide a compression toolset for the Vlasiator space plasma simulation software, developed at UH. UH/Vlasiator is the end user of the project, as well as the domain expert. |
| CSC – IT Center for Science provided software and HPC development support, as well as the HPC resources (LUMI and Mahti) used during the project through national quotas. |
Team
- M.Sc. Konstantinos Papadakis
- Dr. Markku Alho
- Prof. Minna Palmroth
- Dr. Juhani Kataja
- Dr. Jussi Heikonen
Contact
Name: Markku Alho
Institution: University of Helsinki
Email Address: markku.alho@helsinki.fi
