A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

Processing data where it makes sense: Enabling in-memory computation

O Mutlu, S Ghose, J Gómez-Luna… - Microprocessors and …, 2019 - Elsevier
Today's systems are overwhelmingly designed to move data to computation. This design
choice goes directly against at least three key trends in systems that cause performance …

Mosaic: a GPU memory manager with application-transparent support for multiple page sizes

R Ausavarungnirun, J Landgraf, V Miller… - Proceedings of the 50th …, 2017 - dl.acm.org
Contemporary discrete GPUs support rich memory management features such as virtual
memory and demand paging. These features simplify GPU programming by providing a …

Demystifying complex workload-DRAM interactions: An experimental study

S Ghose, T Li, N Hajinazar, DS Cali… - Proceedings of the ACM on …, 2019 - dl.acm.org
It has become increasingly difficult to understand the complex interactions between modern
applications and main memory, composed of Dynamic Random Access Memory (DRAM) …

MASK: Redesigning the GPU memory hierarchy to support multi-application concurrency

R Ausavarungnirun, V Miller, J Landgraf… - ACM SIGPLAN …, 2018 - dl.acm.org
Graphics Processing Units (GPUs) exploit large amounts of thread-level parallelism to
provide high instruction throughput and to efficiently hide long-latency stalls. The resulting …

Utility-based hybrid memory management

Y Li, S Ghose, J Choi, J Sun, H Wang… - … Conference on Cluster …, 2017 - ieeexplore.ieee.org
While the memory footprints of cloud and HPC applications continue to increase,
fundamental issues with DRAM scaling are likely to prevent traditional main memory …

Understanding and improving the latency of DRAM-based memory systems

KK Chang - 2017 - search.proquest.com
Over the past two decades, the storage capacity and access bandwidth of main memory
have improved tremendously, by 128x and 20x, respectively. These improvements are …

Read disturbance in high bandwidth memory: A detailed experimental study on HBM2 DRAM chips

A Olgun, M Osseiran, AG Yağlıkçı… - 2024 54th Annual …, 2024 - ieeexplore.ieee.org
We experimentally demonstrate the effects of read disturbance (RowHammer and
RowPress) and uncover the inner workings of undocumented read disturbance defense …

Opportunistic computing in GPU architectures

A Pattnaik, X Tang, O Kayiran, A Jog, A Mishra… - Proceedings of the 46th …, 2019 - dl.acm.org
Data transfer overhead between computing cores and memory hierarchy has been a
persistent issue for von Neumann architectures and the problem has only become more …

Quality of service support for fine-grained sharing on GPUs

Z Wang, J Yang, R Melhem, B Childers… - Proceedings of the 44th …, 2017 - dl.acm.org
GPUs have been widely adopted in data centers to provide acceleration services to many
applications. Sharing a GPU is increasingly important for better processing throughput and …