[BOOK][B] Memory systems: cache, DRAM, disk

B Jacob, D Wang, S Ng - 2010 - books.google.com
Is your memory hierarchy stopping your microprocessor from performing at the high level it
should be? Memory Systems: Cache, DRAM, Disk shows you how to resolve this problem …

Predicting whole-program locality through reuse distance analysis

C Ding, Y Zhong - Proceedings of the ACM SIGPLAN 2003 conference …, 2003 - dl.acm.org
Profiling can accurately analyze program behavior for select data inputs. We show that
profiling can also predict program locality for inputs other than profiled ones. Here locality is …

Locality phase prediction

X Shen, Y Zhong, C Ding - ACM SIGPLAN Notices, 2004 - dl.acm.org
As computer memory hierarchy becomes adaptive, its performance increasingly depends on
forecasting the dynamic program locality. This paper presents a method that predicts the …

Improving cache performance in dynamic applications through data and computation reorganization at run time

C Ding, K Kennedy - ACM SIGPLAN Notices, 1999 - dl.acm.org
With the rapid improvement of processor speed, performance of the memory hierarchy has
become the principal bottleneck for most applications. A number of compiler transformations …

Program locality analysis using reuse distance

Y Zhong, X Shen, C Ding - ACM Transactions on Programming …, 2009 - dl.acm.org
On modern computer systems, the memory performance of an application depends on its
locality. For a single execution, locality-correlated measures like average miss rate or …

[BOOK][B] The compiler design handbook: optimizations and machine code generation

YN Srikant, P Shankar - 2002 - taylorfrancis.com
The widespread use of object-oriented languages and Internet security concerns are just the
beginning. Add embedded systems, multiple memory banks, highly pipelined units …

Loci: A rule-based framework for parallel multi-disciplinary simulation synthesis

EA Luke, T George - Journal of Functional Programming, 2005 - cambridge.org
We present a rule-based framework for the development of scalable parallel high
performance simulations for a broad class of scientific applications (with particular emphasis …

Array regrouping and structure splitting using whole-program reference affinity

Y Zhong, M Orlovich, X Shen, C Ding - ACM SIGPLAN Notices, 2004 - dl.acm.org
While the memory of most machines is organized as a hierarchy, program data are laid out
in a uniform address space. This paper defines a model of reference affinity, which …

Compile-time composition of run-time data and iteration reorderings

MM Strout, L Carter, J Ferrante - Proceedings of the ACM SIGPLAN 2003 …, 2003 - dl.acm.org
Many important applications, such as those using sparse data structures, have memory
reference patterns that are unknown at compile-time. Prior work has developed run-time …

swSpTRSV: A fast sparse triangular solve with sparse level tile layout on sunway architectures

X Wang, W Liu, W Xue, L Wu - Proceedings of the 23rd ACM SIGPLAN …, 2018 - dl.acm.org
Sparse triangular solve (SpTRSV) is one of the most important kernels in many real-world
applications. Currently, much research on parallel SpTRSV focuses on level-set construction …