Optimization techniques for GPU programming

P Hijma, S Heldens, A Sclocco… - ACM Computing …, 2023 - dl.acm.org
In the past decade, Graphics Processing Units have played an important role in the field of
high-performance computing and they still advance new fields such as IoT, autonomous …

Adaptive cache management for energy-efficient GPU computing

X Chen, LW Chang, CI Rodrigues, J Lv… - 2014 47th Annual …, 2014 - ieeexplore.ieee.org
With the SIMT execution model, GPUs can hide memory latency through massive
multithreading for many applications that have regular memory access patterns. To support …

MPCGPU: Real-time nonlinear model predictive control through preconditioned conjugate gradient on the GPU

E Adabag, M Atal, W Gerard… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Nonlinear Model Predictive Control (NMPC) is a state-of-the-art approach for locomotion
and manipulation which leverages trajectory optimization at each control step. While the …

A comparison of binarization methods for historical archive documents

J He, QDM Do, AC Downton… - … Conference on Document …, 2005 - ieeexplore.ieee.org
This paper compares several alternative binarization algorithms for historical archive
documents, by evaluating their effect on end-to-end word recognition performance in a …

Enabling and scaling a global shallow-water atmospheric model on Tianhe-2

W Xue, C Yang, H Fu, X Wang, Y Xu… - 2014 IEEE 28th …, 2014 - ieeexplore.ieee.org
This paper presents a hybrid algorithm for the petascale global simulation of atmospheric
dynamics on Tianhe-2, the world's current top-ranked supercomputer developed by China's …

A cell-centered implicit finite difference scheme to study wave propagation in acoustic media: A numerical modeling

S Kumawat, A Malkoti, SK Vishwakarma - Journal of Sound and Vibration, 2024 - Elsevier
In the present paper, we present a Cell-Centered Implicit Finite Difference (CCIFD) operator-
based numerical scheme for the propagation of acoustic waves that is very effective …

Architecture-based design and optimization of genetic algorithms on multi- and many-core systems

L Zheng, Y Lu, M Guo, S Guo, CZ Xu - Future Generation Computer …, 2014 - Elsevier
A Genetic Algorithm (GA) is a heuristic to find exact or approximate solutions to
optimization and search problems within an acceptable time. We discuss GAs from an …

A GPU-accelerated semi-implicit fractional-step method for numerical solutions of incompressible Navier–Stokes equations

S Ha, J Park, D You - Journal of Computational Physics, 2018 - Elsevier
Utility of the computational power of Graphics Processing Units (GPUs) is elaborated for
solutions of incompressible Navier–Stokes equations which are integrated using a semi …

Manycore algorithms for batch scalar and block tridiagonal solvers

E Laszlo, M Giles, J Appleyard - ACM Transactions on Mathematical …, 2016 - dl.acm.org
Engineering, scientific, and financial applications often require the simultaneous solution of
a large number of independent tridiagonal systems of equations with varying coefficients …

Solving large problem sizes of index-digit algorithms on GPU: FFT and tridiagonal system solvers

AP Diéguez, M Amor, J Lobeiras… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
Current Graphics Processing Units (GPUs) are capable of obtaining high computational
performance in scientific applications. Nevertheless, programmers have to use suitable …