The sparse polyhedral framework: Composing compiler-generated inspector-executor code
Irregular applications such as big graph analysis, material simulations, molecular dynamics
simulations, and finite element analysis have performance problems due to their use of …
A recursive algebraic coloring technique for hardware-efficient symmetric sparse matrix-vector multiplication
The symmetric sparse matrix-vector multiplication (SymmSpMV) is an important building
block for many numerical linear algebra kernel operations or graph traversal applications …
A synchronization-free algorithm for parallel sparse triangular solves
The sparse triangular solve kernel, SpTRSV, is an important building block for a number of
numerical linear algebra routines. Parallelizing SpTRSV on today's manycore platforms …
Iterative sparse triangular solves for preconditioning
Sparse triangular solvers are typically parallelized using level-scheduling techniques, but
parallel efficiency is poor on high-throughput architectures like GPUs. We propose using an …
Enabling and scaling the HPCG benchmark on the newest generation Sunway supercomputer with 42 million heterogeneous cores
Q Zhu, H Luo, C Yang, M Ding, W Yin… - Proceedings of the …, 2021 - dl.acm.org
We study and evaluate performance optimization techniques for the HPCG benchmark on
the newest generation Sunway supercomputer. Specifically, a two-level blocking scheme is …
Incomplete sparse approximate inverses for parallel preconditioning
In this paper, we propose a new preconditioning method that can be seen as a
generalization of block-Jacobi methods, or as a simplification of the sparse approximate …
swSpTRSV: A fast sparse triangular solve with sparse level tile layout on Sunway architectures
Sparse triangular solve (SpTRSV) is one of the most important kernels in many real-world
applications. Currently, much research on parallel SpTRSV focuses on level-set construction …
Automating wavefront parallelization for sparse matrix computations
This paper presents a compiler and runtime framework for parallelizing sparse matrix
computations that have loop-carried dependences. Our approach automatically generates a …
Performance optimizations for scalable implicit RANS calculations with SU2
In this paper, we present single- and multi-node optimizations of SU2, a widely used,
open-source Computational Fluid Dynamics application, aimed at improving performance and …
Fast synchronization‐free algorithms for parallel sparse triangular solves with multiple right‐hand sides
The sparse triangular solve kernels, SpTRSV and SpTRSM, are important building blocks for
a number of numerical linear algebra routines. Parallelizing SpTRSV and SpTRSM on …