A review of near-memory computing architectures: Opportunities and challenges

G Singh, L Chelini, S Corda, AJ Awan… - 2018 21st Euromicro …, 2018 - ieeexplore.ieee.org
The conventional approach of moving stored data to the CPU for computation has become a
major performance bottleneck for emerging scale-out data-intensive applications due to their …

Benchmarking a new paradigm: Experimental analysis and characterization of a real processing-in-memory system

J Gómez-Luna, I El Hajj, I Fernandez… - IEEE …, 2022 - ieeexplore.ieee.org
Many modern workloads, such as neural networks, databases, and graph processing, are
fundamentally memory-bound. For such workloads, the data movement between main …

A modern primer on processing in memory

O Mutlu, S Ghose, J Gómez-Luna… - … computing: from devices …, 2022 - Springer
Modern computing systems are overwhelmingly designed to move data to computation. This
design choice goes directly against at least three key trends in computing that cause …

Ambit: In-memory accelerator for bulk bitwise operations using commodity DRAM technology

V Seshadri, D Lee, T Mullins, H Hassan… - Proceedings of the 50th …, 2017 - dl.acm.org
Many important applications trigger bulk bitwise operations, i.e., bitwise operations on large
bit vectors. In fact, recent works design techniques that exploit fast bulk bitwise operations to …

Google workloads for consumer devices: Mitigating data movement bottlenecks

A Boroumand, S Ghose, Y Kim… - Proceedings of the …, 2018 - dl.acm.org
We are experiencing an explosive growth in the number of consumer devices, including
smartphones, tablets, web-based computers such as Chromebooks, and wearable devices …

DRISA: A DRAM-based reconfigurable in-situ accelerator

S Li, D Niu, KT Malladi, H Zheng, B Brennan… - Proceedings of the 50th …, 2017 - dl.acm.org
Data movement between the processing units and the memory in traditional von Neumann
architecture is creating the "memory wall" problem. To bridge the gap, two approaches, the …

RowHammer: A retrospective

O Mutlu, JS Kim - … Transactions on Computer-Aided Design of …, 2019 - ieeexplore.ieee.org
This retrospective paper describes the RowHammer problem in dynamic random access
memory (DRAM), which was initially introduced by Kim et al. at the ISCA 2014 Conference …

Processing data where it makes sense: Enabling in-memory computation

O Mutlu, S Ghose, J Gómez-Luna… - Microprocessors and …, 2019 - Elsevier
Today's systems are overwhelmingly designed to move data to computation. This design
choice goes directly against at least three key trends in systems that cause performance …

RecNMP: Accelerating personalized recommendation with near-memory processing

L Ke, U Gupta, BY Cho, D Brooks… - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Personalized recommendation systems leverage deep learning models and account for the
majority of data center AI cycles. Their performance is dominated by memory-bound sparse …

Scalpel: Customizing DNN pruning to the underlying hardware parallelism

J Yu, A Lukefahr, D Palframan, G Dasika… - ACM SIGARCH …, 2017 - dl.acm.org
As the size of Deep Neural Networks (DNNs) continues to grow to increase accuracy and
solve more complex problems, their energy footprint also scales. Weight pruning reduces …