Data locality in high performance computing, big data, and converged systems: An analysis of the cutting edge and a future system architecture
Big data has revolutionized science and technology leading to the transformation of our
societies. High-performance computing (HPC) provides the necessary computational power …
Adaptive insertion policies for high performance caching
The commonly used LRU replacement policy is susceptible to thrashing for memory-
intensive workloads that have a working set greater than the available cache size. For such …
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
Growing wire delays will force substantive changes in the designs of large caches.
Traditional cache architectures assume that each level in the cache hierarchy has a single …
Back to the future: Leveraging Belady's algorithm for improved cache replacement
Belady's algorithm is optimal but infeasible because it requires knowledge of the future. This
paper explains how a cache replacement algorithm can nonetheless learn from Belady's …
[BOOK][B] Understanding the Linux virtual memory manager
M Gorman - 2004 - eecg.utoronto.ca
Linux is a relatively new operating system that has begun to enjoy a lot of attention from the
business and academic worlds. As the operating system matures, its feature set, capabilities …
Spatial memory streaming
Prior research indicates that there is much spatial variation in applications' memory access
patterns. Modern memory systems, however, use small fixed-size cache blocks and as such …
[BOOK][B] Multiprocessor systems-on-chips
Modern system-on-chip (SoC) design shows a clear trend toward integration of multiple
processor cores on a single chip. Designing a multiprocessor system-on-chip (MPSOC) …
Row buffer locality aware caching policies for hybrid memories
Phase change memory (PCM) is a promising technology that can offer higher capacity than
DRAM. Unfortunately, PCM's access latency and energy are higher than DRAM's and its …
Data and memory optimization techniques for embedded systems
We present a survey of the state-of-the-art techniques used in performing data and memory-
related optimizations in embedded systems. The optimizations are targeted directly or …
Counter-based cache replacement and bypassing algorithms
Recent studies have shown that, in highly associative caches, the performance gap between
the least recently used (LRU) and the theoretical optimal replacement algorithms is large …