DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …
A survey of cache bypassing techniques
S Mittal - Journal of Low Power Electronics and Applications, 2016 - mdpi.com
With increasing core-count, the cache demand of modern processors has also increased.
However, due to strict area/power budgets and presence of poor data-locality workloads …
Hardware/software cooperative caching for hybrid DRAM/NVM memory architectures
Non-Volatile Memory (NVM) has recently emerged for its nonvolatility, high density and
energy efficiency. Hybrid memory systems composed of DRAM and NVM have the best of …
Exploiting inter-warp heterogeneity to improve GPGPU performance
In a GPU, all threads within a warp execute the same instruction in lockstep. For a memory
instruction, this can lead to memory divergence: the memory requests for some threads are …
A survey of memory-centric energy efficient computer architecture
C Zhang, H Sun, S Li, Y Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Energy efficient architecture is essential to improve both the performance and power
consumption of a computer system. However, modern computers suffer from the severe …
Dead page and dead block predictors: Cleaning TLBs and caches together
The last level TLB (LLT) and the last level cache (LLC) play a critical role in the overall
performance of memory-intensive applications. While management of LLC content has …
ACIC: Admission-controlled instruction cache
Y Wang, CH Chang… - … Symposium on High …, 2023 - ieeexplore.ieee.org
The front end bottleneck in datacenter workloads has come under increased scrutiny, with
the growing code footprint, involvement of numerous libraries and OS services, and the …
SLIP: reducing wire energy in the memory hierarchy
Wire energy has become the major contributor to energy in large lower level caches. While
wire energy is related to wire latency its costs are exposed differently in the memory …
Zero inclusion victim: Isolating core caches from inclusive last-level cache evictions
M Chaudhuri - 2021 ACM/IEEE 48th Annual International …, 2021 - ieeexplore.ieee.org
The most widely used last-level cache (LLC) architecture in the microprocessors has been
the inclusive LLC design. The popularity of the inclusive design stems from the bandwidth …
HBPB, applying reuse distance to improve cache efficiency proactively
Cache memories play a significant role in the performance, area, and energy consumption
of modern processors, and this impact is expected to grow as on-die memories become …