QRAM: A survey and critique
Quantum random-access memory (QRAM) is a mechanism to access data (quantum or
classical) based on addresses which are themselves a quantum state. QRAM has a long …
A survey of accelerating parallel sparse linear algebra
Sparse linear algebra includes the fundamental and important operations in various large-
scale scientific computing and real-world applications. There exists performance bottleneck …
SparseP: Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …
A systematic literature survey of sparse matrix-vector multiplication
Sparse matrix-vector multiplication (SpMV) is a crucial computing kernel with widespread
applications in iterative algorithms. Over the past decades, research on SpMV optimization …
Towards efficient sparse matrix vector multiplication on real processing-in-memory architectures
Several manufacturers have already started to commercialize near-bank Processing-In-
Memory (PIM) architectures, after decades of research efforts. Near-bank PIM architectures …
Performance analysis of sparse matrix-vector multiplication (SpMV) on graphics processing units (GPUs)
Graphics processing units (GPUs) have delivered a remarkable performance for a variety of
high performance computing (HPC) applications through massive parallelism. One such …
SpTFS: Sparse tensor format selection for MTTKRP via deep learning
Canonical polyadic decomposition (CPD) is one of the most common tensor computations
adopted in many scientific applications. The major bottleneck of CPD is matricized tensor …
HeteroPP: A directive‐based heterogeneous cooperative parallel programming framework
L Wan, X Cui, Y Li, W Zheng… - … and Computation: Practice …, 2024 - Wiley Online Library
Heterogeneous platforms composed of multiple different types of computing devices (such
as CPUs, GPUs, and Intel MICs) have been widely used recently. However, most of parallel …
Implementation and optimization of SpMV algorithm based on SW26010P many-core processor and stored in BCSR format
M Ma, X Huang, J Xu, D Jia - Scientific Reports, 2024 - nature.com
The irregular distribution of non-zero elements of large-scale sparse matrix leads to low data
access efficiency caused by the unique architecture of the Sunway many-core processor …
ApSpGEMM: Accelerating Large-scale SpGEMM with Heterogeneous Collaboration and Adaptive Panel
The Sparse General Matrix-Matrix multiplication (SpGEMM) is a fundamental component for
many applications, such as algebraic multigrid methods (AMG), graphic processing, and …