MIMDRAM: An End-to-End Processing-Using-DRAM System for High-Throughput, Energy-Efficient and Programmer-Transparent Multiple-Instruction Multiple-Data …
Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a
DRAM array's massive internal parallelism to execute very-wide (e.g., 16,384-262,144-bit …
SimplePIM: A Software Framework for Productive and Efficient Processing-in-Memory
Data movement between memory and processors is a major bottleneck in modern
computing systems. The processing-in-memory (PIM) paradigm aims to alleviate this …
PIM-Opt: Demystifying Distributed Optimization Algorithms on a Real-World Processing-In-Memory System
S Rhyner, H Luo, J Gómez-Luna… - Proceedings of the …, 2024 - dl.acm.org
Modern Machine Learning (ML) training on large-scale datasets is a very time-consuming
workload. It relies on the optimization algorithm Stochastic Gradient Descent (SGD) due to …
Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis
We experimentally analyze the computational capability of commercial off-the-shelf (COTS)
DRAM chips and the robustness of these capabilities under various timing delays between …
SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems
Reinforcement Learning (RL) trains agents to learn optimal behavior by maximizing reward
signals from experience datasets. However, RL training often faces memory limitations …
Memory-Centric Computing: Recent Advances in Processing-in-DRAM
Memory-centric computing aims to enable computation capability in and near all places
where data is generated and stored. As such, it can greatly reduce the large negative …
Analysis of Distributed Optimization Algorithms on a Real Processing-In-Memory System
Machine Learning (ML) training on large-scale datasets is a very expensive and time-consuming
workload. Processor-centric architectures (e.g., CPU, GPU) commonly used for …
UM-PIM: DRAM-based PIM with Uniform & Shared Memory Space
DRAM-based Processing in Memory (PIM) addresses the “memory wall” problem by
incorporating computing units (PIM units) into main memory devices for faster and wider …
SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs
Recent advances in Dual In-line Memory Modules (DIMMs) allow DIMMs to support
Processing-In-DIMM (PID) by placing In-DIMM Processors (IDPs) near their memory banks …
PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems
Processing-in-memory (PIM) has emerged as a promising solution for accelerating memory-intensive
workloads as they provide high memory bandwidth to the processing units. This …