Cache-oblivious algorithms

M Frigo, CE Leiserson, H Prokop… - … on Foundations of …, 1999 - ieeexplore.ieee.org
This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT,
and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms …

[BOOK] Space-filling curves: an introduction with applications in scientific computing

M Bader - 2012 - books.google.com
The present book provides an introduction to using space-filling curves (SFC) as tools in
scientific computing. Special focus is laid on the representation of SFC and on resulting …

Tiling optimizations for 3D scientific computations

G Rivera, CW Tseng - SC'00: Proceedings of the 2000 ACM …, 2000 - ieeexplore.ieee.org
Compiler transformations can significantly improve data locality for many scientific programs.
In this paper, we show iterative solvers for partial differential equations (PDEs) in three …

Cache-oblivious algorithms

M Frigo, CE Leiserson, H Prokop… - ACM Transactions on …, 2012 - dl.acm.org
This article presents asymptotically optimal algorithms for rectangular matrix transpose, fast
Fourier transform (FFT), and sorting on computers with multiple levels of caching. Unlike …

Recursive array layouts and fast parallel matrix multiplication

S Chatterjee, AR Lebeck, PK Patnala… - Proceedings of the …, 1999 - dl.acm.org
Matrix multiplication is an important kernel in linear algebra algorithms, and the performance
of both serial and parallel implementations is highly dependent on the memory system …

Optimizing graph algorithms for improved cache performance

JS Park, M Penner, VK Prasanna - IEEE Transactions on …, 2004 - ieeexplore.ieee.org
We develop algorithmic optimizations to improve the cache performance of four fundamental
graph algorithms. We present a cache-oblivious implementation of the Floyd-Warshall …

Exact analysis of the cache behavior of nested loops

S Chatterjee, E Parker, PJ Hanlon, AR Lebeck - ACM SIGPLAN Notices, 2001 - dl.acm.org
We develop from first principles an exact model of the behavior of loop nests executing in a
memory hierarchy, by using a nontraditional classification of misses that has the key property …

[PDF] Cache oblivious search trees via binary trees of small height

GS Brodal, R Fagerberg, R Jacob - BRICS Report Series, 2001 - brics.dk
We propose a version of cache oblivious search trees which is simpler than the previous
proposal of Bender, Demaine and Farach-Colton and has the same complexity bounds. In …

Data cache locking for higher program predictability

X Vera, B Lisper, J Xue - ACM SIGMETRICS Performance Evaluation …, 2003 - dl.acm.org
Caches have become increasingly important with the widening gap between main memory
and processor speeds. However, they are a source of unpredictability due to their …

Memory coloring: A compiler approach for scratchpad memory management

L Li, L Gao, J Xue - 14th International Conference on Parallel …, 2005 - ieeexplore.ieee.org
Scratchpad memory (SPM), a fast software-managed on-chip SRAM, is now widely used in
modern embedded processors. Compared to hardware-managed cache, it is more efficient …