A survey of recent prefetching techniques for processor caches
S Mittal - ACM Computing Surveys (CSUR), 2016 - dl.acm.org
As the trends of process scaling make memory systems an even more crucial bottleneck, the
importance of latency hiding techniques such as prefetching grows further. However, naively …
Performance cloning: A technique for disseminating proprietary applications as benchmarks
Many embedded real world applications are intellectual property, and vendors hesitate to
share these proprietary applications with computer architects and designers. This poses a …
Fast thread migration via cache working set prediction
The most significant source of lost performance when a thread migrates between cores is
the loss of cache state. A significant boost in post-migration performance is possible if the …
Prefetch injection based on hardware monitoring and object metadata
AR Adl-Tabatabai, RL Hudson, MJ Serrano… - ACM SIGPLAN …, 2004 - dl.acm.org
Cache miss stalls hurt performance because of the large gap between memory and
processor speeds-for example, the popular server benchmark SPEC JBB2000 spends 45 …
Distilling the essence of proprietary workloads into miniature benchmarks
Benchmarks set standards for innovation in computer architecture research and industry
product development. Consequently, it is of paramount importance that these workloads are …
On the limits of leakage power reduction in caches
If current technology scaling trends hold, leakage power dissipation soon becomes the
dominant source of power consumption. Caches, due to the fact that they account for the …
A decoupled predictor-directed stream prefetching architecture
An effective method for reducing the effect of load latency in modern processors is data
prefetching. One form of hardware-based data prefetching, stream buffers, has been shown …
Scheduling for energy efficiency and fault tolerance in hard real-time systems
Y Liu, H Liang, K Wu - 2010 Design, Automation & Test in …, 2010 - ieeexplore.ieee.org
This paper studies the dilemma between fault tolerance and energy efficiency in frame-
based real-time systems. Given a set of K tasks to be executed on a system that supports L …
[BOOK] Constructing adaptable and scalable synthetic benchmarks for microprocessor performance evaluation
AM Joshi - 2007 - search.proquest.com
Benchmarks set standards for innovation in computer architecture research and industry
product development. Consequently, it is of paramount importance that the benchmarks …
Data prefetching in a cache hierarchy with high bandwidth and capacity
In this paper we evaluate four hardware data prefetchers in the context of a high-
performance three-level on chip cache hierarchy with high bandwidth and capacity. We …