Multi-GPU thermal lattice Boltzmann simulations using OpenACC and MPI

A Xu, BT Li - International Journal of Heat and Mass Transfer, 2023 - Elsevier
We assess the performance of the hybrid Open Accelerator (OpenACC) and Message
Passing Interface (MPI) approach for multi-graphics processing units (GPUs) accelerated …

Particle-resolved thermal lattice Boltzmann simulation using OpenACC on multi-GPUs

A Xu, BT Li - International Journal of Heat and Mass Transfer, 2024 - Elsevier
We utilize the Open Accelerator (OpenACC) approach for graphics processing unit
(GPU) accelerated particle-resolved thermal lattice Boltzmann (LB) simulation. We adopt the …

Massively parallel lattice–Boltzmann codes on large GPU clusters

E Calore, A Gabbana, J Kraus, E Pellegrini… - Parallel Computing, 2016 - Elsevier
This paper describes a massively parallel code for a state-of-the-art thermal
lattice–Boltzmann method. Our code has been carefully optimized for performance on one GPU and …

Beyond moments: relativistic lattice Boltzmann methods for radiative transport in computational astrophysics

LR Weih, A Gabbana, D Simeoni… - Monthly Notices of …, 2020 - academic.oup.com
We present a new method for the numerical solution of the radiative-transfer equation (RTE)
in multidimensional scenarios commonly encountered in computational astrophysics. The …

Evaluation of DVFS techniques on modern HPC processors and accelerators for energy‐aware applications

E Calore, A Gabbana, SF Schifano… - Concurrency and …, 2017 - Wiley Online Library
Energy efficiency is becoming increasingly important for computing systems, in particular for
large scale High Performance Computing (HPC) facilities. In this work, we evaluate, from a …

Performance and power analysis of HPC workloads on heterogeneous multi-node clusters

F Mantovani, E Calore - Journal of Low Power Electronics and …, 2018 - mdpi.com
Performance analysis tools allow application developers to identify and characterize the
inefficiencies that cause performance degradation in their codes, allowing for application …

Fast kinetic simulator for relativistic matter

VE Ambruş, L Bazzanini, A Gabbana… - Nature Computational …, 2022 - nature.com
Relativistic kinetic theory is ubiquitous to several fields of modern physics, finding
application at large scales in systems in astrophysical contexts, all of the way down to …

Optimization of lattice Boltzmann simulations on heterogeneous computers

E Calore, A Gabbana, SF Schifano… - … Journal of High …, 2019 - journals.sagepub.com
High-performance computing systems are more and more often based on accelerators.
Computing applications targeting those systems often follow a host-driven approach, in …

ThunderX2 performance and energy-efficiency for HPC workloads

E Calore, A Gabbana, SF Schifano, R Tripiccione - Computation, 2020 - mdpi.com
In the last years, the energy efficiency of HPC systems is increasingly becoming of
paramount importance for environmental, technical, and economical reasons. Several …

Performance portability study for massively parallel computational fluid dynamics application on scalable heterogeneous architectures

S Lee, J Gounley, A Randles, JS Vetter - Journal of Parallel and Distributed …, 2019 - Elsevier
Patient-specific hemodynamic simulations have the potential to greatly improve both the
diagnosis and treatment of a variety of vascular diseases. Portability will enable wider …