SIMDRAM: A framework for bit-serial SIMD processing using DRAM

N Hajinazar, GF Oliveira, S Gregorio… - Proceedings of the 26th …, 2021 - dl.acm.org
Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic
operations, addition). However, in order to enable full adoption of processing-using-DRAM …

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …

MIMDRAM: An end-to-end processing-using-DRAM system for high-throughput, energy-efficient and programmer-transparent multiple-instruction multiple-data …

GF Oliveira, A Olgun, AG Yağlıkçı… - … Symposium on High …, 2024 - ieeexplore.ieee.org
Processing-using-DRAM (PUD) is a processing-in-memory (PIM) approach that uses a
DRAM array's massive internal parallelism to execute very-wide (e.g., 16,384-262,144-bit …

NDPBridge: Enabling Cross-Bank Coordination in Near-DRAM-Bank Processing Architectures

B Tian, Y Li, L Jiang, S Cai… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Various near-data processing (NDP) designs have been proposed to alleviate the memory
wall challenge for data-intensive applications. Among them, near-DRAM-bank NDP …

Sieve: Scalable in-situ DRAM-based accelerator designs for massively parallel k-mer matching

L Wu, R Sharifi, M Lenjani, K Skadron… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
The rapid influx of biosequence data, coupled with the stagnation of the processing power of
modern computing systems, highlights the critical need for exploring high-performance …

Chopper: A compiler infrastructure for programmable bit-serial SIMD processing using memory in DRAM

X Peng, Y Wang, MC Yang - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Increasing interests in Bit-serial SIMD Processing-Using-DRAM (PUD) architectures amplify
the needs for a compiler to automate code generation, credited to their ultra-wide SIMD …

Gearbox: A case for supporting accumulation dispatching and hybrid partitioning in PIM-based accelerators

M Lenjani, A Ahmed, M Stan, K Skadron - Proceedings of the 49th …, 2022 - dl.acm.org
Processing-in-memory (PIM) minimizes data movement overheads by placing processing
units near each memory segment. Recent PIMs employ processing units with a SIMD …

Accelerating database analytic query workloads using an associative processor

H Caminal, Y Chronis, T Wu, JM Patel… - Proceedings of the 49th …, 2022 - dl.acm.org
Database analytic query workloads are heavy consumers of data-center cycles, and there is
constant demand to improve their performance. Associative processors (AP) have re …

Impala: Algorithm/architecture co-design for in-memory multi-stride pattern matching

E Sadredini, R Rahimi, M Lenjani… - … symposium on high …, 2020 - ieeexplore.ieee.org
High-throughput and concurrent processing of thousands of patterns on each byte of an
input stream is critical for many applications with real-time processing needs, such as …

SAL-PIM: A subarray-level processing-in-memory architecture with LUT-based linear interpolation for transformer-based text generation

W Han, H Cho, D Kim, JY Kim - arXiv preprint arXiv:2401.17005, 2024 - arxiv.org
Text generation is a compelling sub-field of natural language processing, aiming to generate
human-readable text from input words. In particular, the decoder-only generative models …