Communication-optimal parallel algorithm for Strassen's matrix multiplication
Parallel matrix multiplication is one of the most studied fundamental problems in distributed
and high performance computing. We obtain a new parallel algorithm that is based on …
Graph expansion and communication costs of fast matrix multiplication
The communication cost of algorithms (also known as I/O-complexity) is shown to be closely
related to the expansion properties of the corresponding computation graphs. We …
Matrix multiplication, a little faster
E Karstadt, O Schwartz - Journal of the ACM (JACM), 2020 - dl.acm.org
Strassen's algorithm (1969) was the first sub-cubic matrix multiplication algorithm. Winograd
(1971) improved the leading coefficient of its complexity from 7 to 6. There have been many …
Pebbling game and alternative basis for high performance matrix multiplication
Matrix multiplication is one of the most extensively used kernels in scientific computing.
Although subcubic algorithms exist, most high performance implementations are based on …
A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD
In this paper, a new methodology for speeding up Matrix–Matrix Multiplication using Single
Instruction Multiple Data unit, at one and more cores having a shared cache, is presented …
Multifrontal methods: parallelism, memory usage and numerical aspects
JY L'Excellent - 2012 - theses.hal.science
Direct methods for the solution of sparse systems of linear equations are used in a wide
range of numerical simulation applications. Such methods are based on the decomposition …
A high-performance matrix–matrix multiplication methodology for CPU and GPU architectures
Current compilers cannot generate code that can compete with hand-tuned code in
efficiency, even for a simple kernel like matrix–matrix multiplication (MMM). A key step in …
Stark: Fast and scalable Strassen's matrix multiplication using Apache Spark
This article presents a new fast, highly scalable distributed matrix multiplication algorithm on
Apache Spark, called Stark, based on Strassen's matrix multiplication algorithm. Stark …
Alternative Basis Matrix Multiplication is fast and stable
Alternative basis matrix multiplication algorithms are the fastest matrix multiplication
algorithms in practice to date. However, are they numerically stable? We obtain the first …
[BOOK][B] Avoiding communication in dense linear algebra
GM Ballard - 2013 - search.proquest.com
Dense linear algebra computations are essential to nearly every problem in scientific
computing and to countless other fields. Most matrix computations enjoy a high …