Preparing sparse solvers for exascale computing
Sparse solvers provide essential functionality for a wide variety of scientific applications.
Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi …
Pangulu: A scalable regular two-dimensional block-cyclic sparse direct solver on distributed heterogeneous systems
Sparse direct solvers play a vital role in large-scale high performance computing in science
and engineering. Existing distributed sparse direct methods employ multifrontal/supernodal …
Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters
This paper presents a unified communication optimization framework for sparse triangular
solve (SpTRSV) algorithms on CPU and GPU clusters. The framework builds upon a 3D …
Harnessing the crowd for autotuning high-performance computing applications
This paper presents GPTuneCrowd, a crowd-based autotuning framework for tuning high-
performance computing applications. GPTuneCrowd collects performance data from various …
A supernodal all-pairs shortest path algorithm
We show how to exploit graph sparsity in the Floyd-Warshall algorithm for the all-pairs
shortest path (APSP) problem. Floyd-Warshall is an attractive choice for APSP on high …
GraphFly: Efficient asynchronous streaming graphs processing via dependency-flow
D Chen, C Gui, Y Zhang, H Jin, L Zheng… - … Conference for High …, 2022 - ieeexplore.ieee.org
Existing streaming graph processing systems typically adopt two phases of refinement and
recomputation to ensure the correctness of the incremental computation. However, severe …
Accelerating Large-Scale Sparse LU Factorization for RF Circuit Simulation
Sparse LU factorization is the indispensable building block of the circuit simulation, and
dominates the simulation time, especially when dealing with large-scale circuits. Radio …
A distributed-memory algorithm for computing a heavy-weight perfect matching on bipartite graphs
We design and implement an efficient parallel algorithm for finding a perfect matching in a
weighted bipartite graph such that weights on the edges of the matching are large. This …
swSuperLU: A highly scalable sparse direct solver on Sunway manycore architecture
M Tian, J Wang, Z Zhang, W Du, J Pan, T Liu - The Journal of …, 2022 - Springer
Sparse LU factorization is essential for scientific and engineering simulations. In this work,
we present swSuperLU, a highly scalable sparse direct solver on Sunway manycore …
A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems
We propose a new algorithm to improve the strong scalability of right-looking sparse LU
factorization on distributed memory systems. Our 3D algorithm for sparse LU uses a three …