Communication lower bounds and optimal algorithms for numerical linear algebra
The traditional metric for the efficiency of a numerical algorithm has been the number of
arithmetic operations it performs. Technological trends have long been reducing the time to …
Faster all-pairs shortest paths via circuit complexity
R Williams - Proceedings of the forty-sixth annual ACM symposium …, 2014 - dl.acm.org
We present a new randomized method for computing the min-plus product (a.k.a. tropical
product) of two n × n matrices, yielding a faster algorithm for solving the all-pairs shortest …
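The min-plus (tropical) product mentioned in the snippet is a concrete operation worth pinning down: it replaces the usual (+, ×) of matrix multiplication with (min, +), so one "squaring" combines two-hop shortest-path distances. A minimal sketch (plain NumPy, not the paper's randomized method):

```python
import numpy as np

def min_plus_product(A, B):
    """Min-plus (tropical) product: C[i,j] = min_k (A[i,k] + B[k,j]).

    Repeatedly min-plus-squaring a graph's weighted adjacency matrix
    converges to its all-pairs shortest-path distances, which is why a
    faster min-plus product yields a faster APSP algorithm.
    """
    n, m = A.shape
    m2, p = B.shape
    assert m == m2, "inner dimensions must agree"
    C = np.empty((n, p))
    for i in range(n):
        # For row i: add A[i,k] to column entries B[k,j], minimize over k.
        C[i] = np.min(A[i, :, None] + B, axis=0)
    return C
```

For example, with A = [[0, 1], [2, 0]] and B = [[0, 3], [1, 0]], the entry C[0, 1] is min(0 + 3, 1 + 0) = 1.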
AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs
Basic Linear Algebra Subprograms (BLAS) is a fundamental library in scientific computing. In
this paper, we present a template-based optimization framework, AUGEM, which can …
Algebraic methods in the congested clique
In this work, we use algebraic methods for studying distance computation and subgraph
detection tasks in the congested clique model. Specifically, we adapt parallel matrix …
Communication-optimal parallel recursive rectangular matrix multiplication
Communication-optimal algorithms are known for square matrix multiplication. Here, we
obtain the first communication-optimal algorithm for all dimensions of rectangular matrices …
A framework for practical parallel fast matrix multiplication
Matrix multiplication is a fundamental computation in many scientific disciplines. In this
paper, we show that novel fast matrix multiplication algorithms can significantly outperform …
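The "fast matrix multiplication algorithms" this entry refers to are Strassen-like schemes that trade 8 block products for 7. A minimal one-level-recursive Strassen sketch (assuming square matrices with power-of-two dimension; the cutoff fallback to ordinary multiplication mirrors what practical implementations do, but this is an illustration, not the paper's framework):

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Strassen multiplication of n x n matrices, n a power of two.

    Each level forms 7 recursive products (M1..M7) instead of 8,
    giving the subcubic O(n^2.807) arithmetic bound; below `cutoff`
    it falls back to classical multiplication.
    """
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C
```

The cutoff matters in practice: the extra additions and reduced locality make Strassen pay off only above a machine-dependent block size.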
Graph expansion and communication costs of fast matrix multiplication
The communication cost of algorithms (also known as I/O-complexity) is shown to be closely
related to the expansion properties of the corresponding computation graphs. We …
Communication optimal parallel multiplication of sparse random matrices
Parallel algorithms for sparse matrix-matrix multiplication typically spend most of their time
on inter-processor communication rather than on computation, and hardware trends predict …
Pebbling Game and Alternative Basis for High Performance Matrix Multiplication
Matrix multiplication is one of the most extensively used kernels in scientific computing.
Although subcubic algorithms exist, most high performance implementations are based on …
Scalable graph convolutional network training on distributed-memory systems
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
The large data sizes of graphs and their vertex features make scalable training algorithms …