Blenda: Dynamically-Reconfigurable Stacked DRAM

M Bakhshalipour, H Zare, F Samandi… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
This paper proposes Blenda, a dynamically-partitioned memory-cache blend architecture for
giga-scale die-stacked DRAMs. Blenda architects the stacked DRAM partly as memory and …

Multi-level Memory-Centric Profiling on ARM Processors with ARM SPE

S Miksits, R Shi, M Gokhale, J Wahlgren… - SC24-W: Workshops …, 2024 - ieeexplore.ieee.org
High-end ARM processors are emerging in data centers and HPC systems, posing as a
strong contender to x86 machines. Memory-centric profiling is an important approach for …

A Comprehensive Simulation Framework for CXL Disaggregated Memory

Y Wang, L Wu, W Hong, Y Ou, Z Wang, S Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Compute eXpress Link (CXL) is a pivotal technology for memory disaggregation in future
heterogeneous computing systems, enabling on-demand memory expansion and improved …

Measuring Data Access Latency in Large CPU Caches

S Sun, Y Zhu, X Ye, C Ding - … of the International Symposium on Memory …, 2024 - dl.acm.org
This paper describes a new, multi-locality benchmark program for testing memory access
latency and using it to study recent AMD machines equipped with 3D vertical cache (V …

MICPAT: Micro-architecture Independent Characteristics Profiling Analysis Tool for GPU Programs

W Peng, Q Kaiyuan, Y Zhibin, S Guangfeng, L Peng - 2024 - researchsquare.com
With the rapid evolution of GPU architectures, analyzing and optimizing the performance of
GPU rendering programs has become increasingly complex and crucial. To tackle the …