A framework for general sparse matrix–matrix multiplication on GPUs and heterogeneous processors
W Liu, B Vinter - Journal of Parallel and Distributed Computing, 2015 - Elsevier
General sparse matrix–matrix multiplication (SpGEMM) is a fundamental building block for
numerous applications such as the algebraic multigrid method (AMG), breadth-first search and …
Hypergraph partitioning for sparse matrix-matrix multiplication
We propose a fine-grained hypergraph model for sparse matrix-matrix multiplication
(SpGEMM), a key computational kernel in scientific computing and data analysis whose …
The input/output complexity of triangle enumeration
We consider the well-known problem of enumerating all triangles of an undirected graph.
Our focus is on determining the input/output (I/O) complexity of this problem. Let E be the …
Fast Matrix Multiplication for Query Processing
X Hu - Proceedings of the ACM on Management of Data, 2024 - dl.acm.org
This paper studies how to use fast matrix multiplication to speed up query processing. As
observed, computing a two-table join and then projecting away the join attribute is …
Hypergraph Partitioning for Parallel Sparse Matrix-Matrix Multiplication
The performance of parallel algorithms for sparse matrix-matrix multiplication is typically
determined by the amount of interprocessor communication performed, which in turn …
[PDF] Parallel and scalable sparse basic linear algebra subprograms
W Liu - 2015 - nbi.ku.dk
Sparse basic linear algebra subprograms (BLAS) are fundamental building blocks for
numerous scientific computations and graph applications. Compared with Dense BLAS …
Optimization of Sparse Matrix Computation for Algebraic Multigrid on GPUs
AMG is one of the most efficient and widely used methods for solving sparse linear systems.
The computational process of AMG mainly consists of a series of iterative calculations of …
The I/O complexity of Strassen's matrix multiplication with recomputation
A tight Ω((n/M)^(log₂ 7) · M) lower bound is derived on the I/O complexity
of Strassen's algorithm to multiply two n × n matrices, in a two-level storage hierarchy with M …
Fine-grained I/O complexity via reductions: New lower bounds, faster algorithms, and a time hierarchy
This paper initiates the study of I/O algorithms (minimizing cache misses) from the
perspective of fine-grained complexity (conditional polynomial lower bounds). Specifically …
perspective of fine-grained complexity (conditional polynomial lower bounds). Specifically …
The I/O Complexity of Attention, or How Optimal is Flash Attention?
Self-attention is at the heart of the popular Transformer architecture, yet suffers from
quadratic time and memory complexity. The breakthrough FlashAttention algorithm revealed …