Optimization techniques for GPU programming

P Hijma, S Heldens, A Sclocco… - ACM Computing …, 2023 - dl.acm.org
In the past decade, Graphics Processing Units have played an important role in the field of
high-performance computing and they still advance new fields such as IoT, autonomous …

The Chinese Wall Security Policy.

DFC Brewer, MJ Nash - S&P, 1989 - facweb.iitkgp.ac.in
Everyone who has seen the movie Wall Street will have seen a commercial security policy in
action. The recent work of Clark and Wilson and the WIPCIS initiative (the Workshop on …

Auto-tuning a high-level language targeted to GPU codes

S Grauer-Gray, L Xu, R Searles… - 2012 innovative …, 2012 - ieeexplore.ieee.org
Determining the best set of optimizations to apply to a kernel to be executed on the graphics
processing unit (GPU) is a challenging problem. There are large sets of possible …

CSR5: An efficient storage format for cross-platform sparse matrix-vector multiplication

W Liu, B Vinter - Proceedings of the 29th ACM on International …, 2015 - dl.acm.org
Sparse matrix-vector multiplication (SpMV) is a fundamental building block for numerous
applications. In this paper, we propose CSR5 (Compressed Sparse Row 5), a new storage …

GPU-accelerated preconditioned iterative linear solvers

R Li, Y Saad - The Journal of Supercomputing, 2013 - Springer
This work is an overview of our preliminary experience in developing a high-performance
iterative linear solver accelerated by GPU coprocessors. Our goal is to illustrate the …

Efficient sparse matrix-vector multiplication on x86-based many-core processors

X Liu, M Smelyanskiy, E Chow, P Dubey - Proceedings of the 27th …, 2013 - dl.acm.org
Sparse matrix-vector multiplication (SpMV) is an important kernel in many scientific
applications and is known to be memory bandwidth limited. On modern processors with …

A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units

M Kreutzer, G Hager, G Wellein, H Fehske… - SIAM Journal on …, 2014 - SIAM
Sparse matrix-vector multiplication (spMVM) is the most time-consuming kernel in many
numerical algorithms and has been studied extensively on all modern processor and …

Efficient sparse matrix-vector multiplication on GPUs using the CSR storage format

JL Greathouse, M Daga - SC'14: Proceedings of the …, 2014 - ieeexplore.ieee.org
The performance of sparse matrix vector multiplication (SpMV) is important to computational
scientists. Compressed sparse row (CSR) is the most frequently used format to store sparse …

A quantitative performance analysis model for GPU architectures

Y Zhang, JD Owens - 2011 IEEE 17th international symposium …, 2011 - ieeexplore.ieee.org
We develop a microbenchmark-based performance model for NVIDIA GeForce 200-series
GPUs. Our model identifies GPU program bottlenecks and quantitatively analyzes …

SparseP: Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures

C Giannoula, I Fernandez, JG Luna, N Koziris… - Proceedings of the …, 2022 - dl.acm.org
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …