MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data …

GF Oliveira, A Olgun, AG Yağlıkçı… - … Symposium on High …, 2024 - ieeexplore.ieee.org
Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a
DRAM array's massive internal parallelism to execute very-wide (e.g., 16,384-262,144-bit …

SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory

J Chen, J Gómez-Luna, I El Hajj… - 2023 32nd …, 2023 - ieeexplore.ieee.org
Data movement between memory and processors is a major bottleneck in modern
computing systems. The processing-in-memory (PIM) paradigm aims to alleviate this …

PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System

S Rhyner, H Luo, J Gómez-Luna… - Proceedings of the …, 2024 - dl.acm.org
Modern Machine Learning (ML) training on large-scale datasets is a very time-consuming
workload. It relies on the optimization algorithm Stochastic Gradient Descent (SGD) due to …

Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis

İE Yüksel, YC Tuğrul, FN Bostancı… - 2024 54th Annual …, 2024 - ieeexplore.ieee.org
We experimentally analyze the computational capability of commercial off-the-shelf (COTS)
DRAM chips and the robustness of these capabilities under various timing delays between …

SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems

K Gogineni, SS Dayapule, J Gómez-Luna… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning (RL) trains agents to learn optimal behavior by maximizing reward
signals from experience datasets. However, RL training often faces memory limitations …

Memory-Centric Computing: Recent Advances in Processing-in-DRAM

O Mutlu, A Olgun, GF Oliveira, IE Yuksel - arXiv preprint arXiv:2412.19275, 2024 - arxiv.org
Memory-centric computing aims to enable computation capability in and near all places
where data is generated and stored. As such, it can greatly reduce the large negative …

Analysis of Distributed Optimization Algorithms on a Real Processing-In-Memory System

S Rhyner, H Luo, J Gómez-Luna, M Sadrosadati… - arXiv preprint arXiv …, 2024 - arxiv.org
Machine Learning (ML) training on large-scale datasets is a very expensive and time-consuming
workload. Processor-centric architectures (e.g., CPU, GPU) commonly used for …

UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space

Y Zhao, M Gao, F Liu, Y Hu, Z Wang… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
DRAM-based Processing in Memory (PIM) addresses the “memory wall” problem by
incorporating computing units (PIM units) into main memory devices for faster and wider …

SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs

S Lee, C Lim, J Choi, H Choi, C Lee, Y Park… - Proceedings of the …, 2024 - dl.acm.org
Recent advances in Dual In-line Memory Modules (DIMMs) allow DIMMs to support
Processing-In-DIMM (PID) by placing In-DIMM Processors (IDPs) near their memory banks …

PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems

D Lee, B Hyun, T Kim, M Rhu - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Processing-in-memory (PIM) has emerged as a promising solution for accelerating memory-intensive
workloads as they provide high memory bandwidth to the processing units. This …