A survey of techniques for cache partitioning in multicore processors

S Mittal - ACM Computing Surveys (CSUR), 2017 - dl.acm.org
As the number of on-chip cores and memory demands of applications increase, judicious
management of cache resources has become not merely attractive but imperative. Cache …

Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches

MK Qureshi, YN Patt - 2006 39th Annual IEEE/ACM …, 2006 - ieeexplore.ieee.org
This paper investigates the problem of partitioning a shared cache between multiple
concurrently executing applications. The commonly used LRU policy implicitly partitions a …

Optimizing NUCA organizations and wiring alternatives for large caches with CACTI 6.0

N Muralimanohar, R Balasubramonian… - 40th Annual IEEE …, 2007 - ieeexplore.ieee.org
A significant part of future microprocessor real estate will be dedicated to L2 or L3 caches.
These on-chip caches will heavily impact processor performance, power dissipation, and …

Reactive NUCA: near-optimal block placement and replication in distributed caches

N Hardavellas, M Ferdman, B Falsafi… - Proceedings of the 36th …, 2009 - dl.acm.org
Increases in on-chip communication delay and the large working sets of server and scientific
workloads complicate the design of the on-chip last-level cache for multicore processors …

Scale-out processors

P Lotfi-Kamran, B Grot, M Ferdman, S Volos… - ACM SIGARCH …, 2012 - dl.acm.org
Scale-out datacenters mandate high per-server throughput to get the maximum benefit from
the large TCO investment. Emerging applications (e.g., data serving and web search) that run …

Managing distributed, shared L2 caches through OS-level page allocation

S Cho, L Jin - 2006 39th Annual IEEE/ACM International …, 2006 - ieeexplore.ieee.org
This paper presents and studies a distributed L2 cache management approach through
OS-level page allocation for future many-core processors. L2 cache management is a crucial …

ATAC: A 1000-core cache-coherent processor with on-chip optical network

G Kurian, JE Miller, J Psota, J Eastep, J Liu… - Proceedings of the 19th …, 2010 - dl.acm.org
Based on current trends, multicore processors will have 1000 cores or more within the next
decade. However, their promise of increased performance will only be realized if their …

A NUCA substrate for flexible CMP cache sharing

J Huh, C Kim, H Shafi, L Zhang, D Burger… - ACM International …, 2005 - dl.acm.org
We propose an organization for the on-chip memory system of a chip multiprocessor, in
which 16 processors share a 16MB pool of 256 L2 cache banks. The L2 cache is organized …

Cooperative cache partitioning for chip multiprocessors

J Chang, GS Sohi - ACM International Conference on Supercomputing …, 2007 - dl.acm.org
This paper presents Cooperative Cache Partitioning (CCP) to allocate cache resources
among threads concurrently running on CMPs. Unlike cache partitioning schemes that use a …

ASR: Adaptive selective replication for CMP caches

BM Beckmann, MR Marty… - 2006 39th Annual IEEE …, 2006 - ieeexplore.ieee.org
The large working sets of commercial and scientific workloads stress the L2 caches of chip
multiprocessors (CMPs). Some CMPs use a shared L2 cache to maximize the on-chip cache …