- Academic Search

A survey of direct methods for sparse linear systems

TA Davis, S Rajamanickam, WM Sid-Lakhdar - Acta Numerica, 2016 - cambridge.org

Wilkinson defined a sparse matrix as one with enough zeros that it pays to take advantage of
them. This informal yet practical definition captures the essence of the goal of direct …

Cited by 321

StarPU-MPI: Task programming over clusters of machines enhanced with accelerators

C Augonnet, O Aumage, N Furmento, R Namyst… - Recent Advances in the …, 2012 - Springer

GPUs clusters are becoming widespread HPC platforms. Exploiting them is however
challenging, as this requires two separate paradigms (MPI and CUDA or OpenCL) and …

Cited by 102

Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes

X Lacoste, M Faverge, G Bosilca… - … Parallel & Distributed …, 2014 - ieeexplore.ieee.org

The ongoing hardware evolution exhibits an escalation in the number, as well as in the
heterogeneity, of computing resources. The pressure to maintain reasonable levels of …

Cited by 74

Data partitioning on heterogeneous multicore and multi-GPU systems using functional performance models of data-parallel applications

Z Zhong, V Rychkov… - 2012 IEEE international …, 2012 - ieeexplore.ieee.org

Transition to hybrid CPU/GPU platforms in high performance computing is challenging in the
aspect of efficient utilisation of the heterogeneous hardware and existing optimised software …

Cited by 76

Exploiting symmetry in tensors for high performance: Multiplication with symmetric tensors

MD Schatz, TM Low, RA van de Geijn, TG Kolda - SIAM Journal on Scientific …, 2014 - SIAM

Symmetric tensor operations arise in a wide variety of computations. However, the benefits
of exploiting symmetry in order to reduce storage and computation is in conflict with a desire …

Cited by 59

Unleashing the high-performance and low-power of multi-core DSPs for general-purpose HPC

FD Igual, M Ali, A Friedmann, E Stotzer… - SC'12: Proceedings …, 2012 - ieeexplore.ieee.org

Take a multicore Digital Signal Processor (DSP) chip designed for cellular base stations and
radio network controllers, add floating-point capabilities to support 4G networks, and out of …

Cited by 47

Optimizing tensor contractions in CCSD(T) for efficient execution on GPUs

J Kim, A Sukumaran-Rajam, C Hong… - Proceedings of the …, 2018 - dl.acm.org

Tensor contractions are higher dimensional analogs of matrix multiplications, used in many
computational contexts such as high order models in quantum chemistry, deep learning …

Cited by 25

Scheduling and memory optimizations for sparse direct solver on multi-core/multi-GPU cluster systems

X Lacoste - 2015 - theses.hal.science

The ongoing hardware evolution exhibits an escalation in the number, as well as in the
heterogeneity, of computing resources. The pressure to maintain reasonable levels of …

Cited by 35

Bridging the gap between performance and bounds of Cholesky factorization on heterogeneous platforms

E Agullo, O Beaumont, L Eyraud-Dubois… - 2015 IEEE …, 2015 - ieeexplore.ieee.org

We consider the problem of allocating and scheduling dense linear application on fully
heterogeneous platforms made of CPUs and GPUs. More specifically, we focus on the …

Cited by 25

Improving the user experience of the rCUDA remote GPU virtualization framework

C Reano, F Silla, A Castelló, AJ Pena… - Concurrency and …, 2015 - Wiley Online Library

Graphics processing units (GPUs) are being increasingly embraced by the high-performance
computing community as an effective way to reduce execution time by …

Cited by 24

The FLAME approach: From dense linear algebra algorithms to high-performance multi-accelerator...

A survey of direct methods for sparse linear systems