Survey on grid resource allocation mechanisms

MB Qureshi, MM Dehnavi, N Min-Allah… - Journal of Grid …, 2014 - Springer
Grid is a distributed high performance computing paradigm that offers various types of
resources (like computing, storage, communication) to resource-intensive user tasks. These …

Optimizing CUDA code by kernel fusion: application on BLAS

J Filipovič, M Madzin, J Fousek, L Matyska - The Journal of …, 2015 - Springer
Contemporary GPUs have significantly higher arithmetic throughput than memory
throughput. Hence, many GPU kernels are memory bound and cannot exploit arithmetic …

Runtime composition of iterations for fusing loop-carried sparse dependence

K Cheshmi, M Strout, M Mehri Dehnavi - Proceedings of the International …, 2023 - dl.acm.org
Dependence between iterations in sparse computations causes inefficient use of memory
and computation resources. This paper proposes sparse fusion, a technique that generates …

Fine-grained GPU implementation of assembly-free iterative solver for finite element problems

J Martínez-Frutos, PJ Martínez-Castejón… - Computers & …, 2015 - Elsevier
This paper proposes a fine-grained implementation of matrix-free Conjugate Gradient (CG)
solver for Finite Element Analysis (FEA) using Graphics Processing Unit (GPU) …

Parallel L-BFGS-B algorithm on GPU

Y Fei, G Rong, B Wang, W Wang - Computers & graphics, 2014 - Elsevier
Due to the rapid advance of general-purpose graphics processing unit (GPU), it is an active
research topic to study performance improvement of non-linear optimization with parallel …

Efficient matrix-free GPU implementation of fixed grid finite element analysis

J Martínez-Frutos, D Herrero-Pérez - Finite Elements in Analysis and …, 2015 - Elsevier
This paper proposes a strategy for the efficient implementation of Fixed Grid Finite Element
Analysis (FGFEA) method on Graphics Processing Units (GPUs). Such a strategy makes use …

Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations

M Kronbichler, D Sashko… - The International Journal …, 2023 - journals.sagepub.com
This work investigates a variant of the conjugate gradient (CG) method and embeds it into
the context of high-order finite-element schemes with fast matrix-free operator evaluation …

Parallel sparse approximate inverse preconditioning on graphic processing units

MM Dehnavi, DM Fernandez, JL Gaudiot… - IEEE transactions on …, 2012 - ieeexplore.ieee.org
Accelerating numerical algorithms for solving sparse linear systems on parallel architectures
has attracted the attention of many researchers due to their applicability to many …

Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs

E Müller, X Guo, R Scheichl, S Shi - Computing and Visualization in …, 2013 - Springer
Many problems in geophysical and atmospheric modelling require the fast solution of elliptic
partial differential equations (PDEs) in “flat” three dimensional geometries. In particular, an …

Alternate parallel processing approach for FEM

DM Fernandez, MM Dehnavi, WJ Gross… - IEEE Transactions …, 2012 - ieeexplore.ieee.org
In this work we present a new alternate way to formulate the finite element method (FEM) for
parallel processing based on the solution of single mesh elements called FEM-SES. The key …