A systematic survey of general sparse matrix-matrix multiplication

J Gao, W Ji, F Chang, S Han, B Wei, Z Liu… - ACM Computing …, 2023 - dl.acm.org
General Sparse Matrix-Matrix Multiplication (SpGEMM) has attracted much attention from
researchers in graph analyzing, scientific computing, and deep learning. Many optimization …

TileSpGEMM: A tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs

Y Niu, Z Lu, H Ji, S Song, Z Jin, W Liu - Proceedings of the 27th ACM …, 2022 - dl.acm.org
Sparse general matrix-matrix multiplication (SpGEMM) is one of the most fundamental
building blocks in sparse linear solvers, graph processing frameworks and machine learning …

Combinatorial BLAS 2.0: Scaling combinatorial algorithms on distributed-memory systems

A Azad, O Selvitopi, MT Hussain… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Combinatorial algorithms such as those that arise in graph analysis, modeling of discrete
systems, bioinformatics, and chemistry, are often hard to parallelize. The Combinatorial …

Space efficient sequence alignment for SRAM-based computing: X-drop on the Graphcore IPU

L Burchard, MX Zhao, J Langguth, A Buluç… - Proceedings of the …, 2023 - dl.acm.org
Dedicated accelerator hardware has become essential for processing AI-based workloads,
leading to the rise of novel accelerator architectures. Furthermore, fundamental differences …

A tensor marshaling unit for sparse tensor algebra on general-purpose processors

M Siracusa, V Soria-Pardos, F Sgherzi… - Proceedings of the 56th …, 2023 - dl.acm.org
This paper proposes the Tensor Marshaling Unit (TMU), a near-core programmable dataflow
engine for multicore architectures that accelerates tensor traversals and merging, the most …

Distributed-memory parallel contig generation for de novo long-read genome assembly

G Guidi, G Raulet, D Rokhsar, L Oliker… - Proceedings of the 51st …, 2022 - dl.acm.org
De novo genome assembly, i.e., rebuilding the sequence of an unknown genome from
redundant and erroneous short sequences, is a key but computationally intensive step in …

A novel method for temporal graph classification based on transitive reduction

C Jerônimo, ZKG Patrocínio… - 2023 IEEE 10th …, 2023 - ieeexplore.ieee.org
Domains such as bio-informatics, social network analysis, and computer vision, describe
relations between entities and cannot be interpreted as vectors or fixed grids, instead, they …

Generating Data Locality to Accelerate Sparse Matrix-Matrix Multiplication on CPUs

J Wolfson-Pou, J Laukemann, F Petrini - arXiv preprint arXiv:2501.07056, 2025 - arxiv.org
Sparse GEneral Matrix-matrix Multiplication (SpGEMM) is a critical operation in many
applications. Current multithreaded implementations are based on Gustavson's algorithm …

High-Performance Sorting-Based K-mer Counting in Distributed Memory with Flexible Hybrid Parallelism

Y Li, G Guidi - Proceedings of the 53rd International Conference on …, 2024 - dl.acm.org
In generating large quantities of DNA data, high-throughput sequencing technologies
require advanced bioinformatics infrastructures for efficient data analysis. k-mer counting …