- Academic Search

Recursive blocked algorithms and hybrid data structures for dense matrix library software

E Elmroth, F Gustavson, I Jonsson, B Kågström - SIAM Review, 2004 - SIAM

Matrix computations are both fundamental and ubiquitous in computational science and its
vast application areas. Along with the development of more advanced computer systems …

Cited by 262

[BOOK] Automatic performance tuning of sparse matrix kernels

RW Vuduc - 2003 - search.proquest.com

This dissertation presents an automated system to generate highly efficient, platform-
adapted implementations of sparse matrix kernels. We show that conventional …

Cited by 365

Tiling optimizations for 3D scientific computations

G Rivera, CW Tseng - SC'00: Proceedings of the 2000 ACM …, 2000 - ieeexplore.ieee.org

Compiler transformations can significantly improve data locality for many scientific programs.
In this paper, we show iterative solvers for partial differential equations (PDEs) in three …

Cited by 315

Program locality analysis using reuse distance

Y Zhong, X Shen, C Ding - ACM Transactions on Programming …, 2009 - dl.acm.org

On modern computer systems, the memory performance of an application depends on its
locality. For a single execution, locality-correlated measures like average miss rate or …

Cited by 207

Single Assignment C: efficient support for high-level array operations in a functional setting

SB Scholz - Journal of functional programming, 2003 - cambridge.org

This paper presents a novel approach for integrating arrays with access time O(1) into
functional languages. It introduces n-dimensional arrays combined with a type system that …

Cited by 219

Heap data allocation to scratch-pad memory in embedded systems

A Dominguez, S Udayakumaran… - Journal of Embedded …, 2005 - content.iospress.com

This paper presents the first-ever compile-time method for allocating a portion of the heap
data to scratch-pad memory. A scratch-pad is a fast directly addressed compiler-managed …

Cited by 199

Tiling, block data layout, and memory hierarchy performance

N Park, B Hong, VK Prasanna - IEEE Transactions on Parallel …, 2003 - ieeexplore.ieee.org

Recently, several experimental studies have been conducted on block data layout in
conjunction with tiling as a data transformation technique to improve cache performance. In …

Cited by 157

Statistical models for empirical search-based performance tuning

R Vuduc, JW Demmel… - The International Journal …, 2004 - journals.sagepub.com

Achieving peak performance from the computational kernels that dominate application
performance often requires extensive machine-dependent tuning by hand. Automatic tuning …

Cited by 155

Improving effective bandwidth through compiler enhancement of global cache reuse

C Ding, K Kennedy - Journal of Parallel and Distributed Computing, 2004 - Elsevier

The performance of modern machines is increasingly limited by insufficient memory
bandwidth. One way to alleviate this bandwidth limitation for a given program is to minimize …

Cited by 151

Synthesizing transformations for locality enhancement of imperfectly-nested loop nests

N Ahmed, N Mateev, K Pingali - … of the 14th international conference on …, 2000 - dl.acm.org

We present an approach for synthesizing transformations to enhance locality in imperfectly-
nested loops. The key idea is to embed the iteration space of every statement in a loop nest …

Cited by 156
