DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against
improving performance, scalability, and energy efficiency in modern systems. Computer systems …

A survey of cache bypassing techniques

S Mittal - Journal of Low Power Electronics and Applications, 2016 - mdpi.com
With increasing core-count, the cache demand of modern processors has also increased.
However, due to strict area/power budgets and presence of poor data-locality workloads …

Hardware/software cooperative caching for hybrid DRAM/NVM memory architectures

H Liu, Y Chen, X Liao, H Jin, B He, L Zheng… - Proceedings of the …, 2017 - dl.acm.org
Non-Volatile Memory (NVM) has recently emerged for its nonvolatility, high density and
energy efficiency. Hybrid memory systems composed of DRAM and NVM have the best of …

Exploiting inter-warp heterogeneity to improve GPGPU performance

R Ausavarungnirun, S Ghose, O Kayiran… - 2015 International …, 2015 - ieeexplore.ieee.org
In a GPU, all threads within a warp execute the same instruction in lockstep. For a memory
instruction, this can lead to memory divergence: the memory requests for some threads are …

A survey of memory-centric energy efficient computer architecture

C Zhang, H Sun, S Li, Y Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Energy efficient architecture is essential to improve both the performance and power
consumption of a computer system. However, modern computers suffer from the severe …

Dead page and dead block predictors: Cleaning TLBs and caches together

C Mazumdar, P Mitra, A Basu - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
The last level TLB (LLT) and the last level cache (LLC) play a critical role in the overall
performance of memory-intensive applications. While management of LLC content has …

ACIC: Admission-controlled instruction cache

Y Wang, CH Chang… - … Symposium on High …, 2023 - ieeexplore.ieee.org
The front end bottleneck in datacenter workloads has come under increased scrutiny, with
the growing code footprint, involvement of numerous libraries and OS services, and the …

SLIP: reducing wire energy in the memory hierarchy

S Das, TM Aamodt, WJ Dally - Proceedings of the 42nd Annual …, 2015 - dl.acm.org
Wire energy has become the major contributor to energy in large lower level caches. While
wire energy is related to wire latency its costs are exposed differently in the memory …

Zero inclusion victim: Isolating core caches from inclusive last-level cache evictions

M Chaudhuri - 2021 ACM/IEEE 48th Annual International …, 2021 - ieeexplore.ieee.org
The most widely used last-level cache (LLC) architecture in the microprocessors has been
the inclusive LLC design. The popularity of the inclusive design stems from the bandwidth …

HBPB, applying reuse distance to improve cache efficiency proactively

AM Krause, PC Santos, AF Lorenzon… - Journal of Parallel and …, 2024 - Elsevier
Cache memories play a significant role in the performance, area, and energy consumption
of modern processors, and this impact is expected to grow as on-die memories become …