Survey on grid resource allocation mechanisms
Grid is a distributed high performance computing paradigm that offers various types of
resources (like computing, storage, communication) to resource-intensive user tasks. These …
Optimizing CUDA code by kernel fusion: application on BLAS
Contemporary GPUs have significantly higher arithmetic throughput than memory
throughput. Hence, many GPU kernels are memory bound and cannot exploit arithmetic …
Runtime composition of iterations for fusing loop-carried sparse dependence
Dependence between iterations in sparse computations causes inefficient use of memory
and computation resources. This paper proposes sparse fusion, a technique that generates …
Fine-grained GPU implementation of assembly-free iterative solver for finite element problems
J Martínez-Frutos, PJ Martínez-Castejón… - Computers & …, 2015 - Elsevier
This paper proposes a fine-grained implementation of matrix-free Conjugate Gradient (CG)
solver for Finite Element Analysis (FEA) using Graphics Processing Unit (GPU) …
Parallel L-BFGS-B algorithm on GPU
Due to the rapid advance of general-purpose graphics processing unit (GPU), it is an active
research topic to study performance improvement of non-linear optimization with parallel …
Efficient matrix-free GPU implementation of fixed grid finite element analysis
This paper proposes a strategy for the efficient implementation of Fixed Grid Finite Element
Analysis (FGFEA) method on Graphics Processing Units (GPUs). Such a strategy makes use …
Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations
This work investigates a variant of the conjugate gradient (CG) method and embeds it into
the context of high-order finite-element schemes with fast matrix-free operator evaluation …
Parallel sparse approximate inverse preconditioning on graphic processing units
Accelerating numerical algorithms for solving sparse linear systems on parallel architectures
has attracted the attention of many researchers due to their applicability to many …
Matrix-free GPU implementation of a preconditioned conjugate gradient solver for anisotropic elliptic PDEs
Many problems in geophysical and atmospheric modelling require the fast solution of elliptic
partial differential equations (PDEs) in “flat” three dimensional geometries. In particular, an …
Alternate parallel processing approach for FEM
In this work we present a new alternate way to formulate the finite element method (FEM) for
parallel processing based on the solution of single mesh elements called FEM-SES. The key …