An introduction to the Compute Express Link (CXL) interconnect

D Das Sharma, R Blankenship, D Berger - ACM Computing Surveys, 2024 - dl.acm.org
The Compute Express Link (CXL) is an open industry-standard interconnect between
processors and devices such as accelerators, memory buffers, smart network interfaces …

TinyML: Current progress, research challenges, and future roadmap

M Shafique, T Theocharides, VJ Reddy… - 2021 58th ACM/IEEE …, 2021 - ieeexplore.ieee.org

Neural inference at the frontier of energy, space, and time

DS Modha, F Akopyan, A Andreopoulos… - Science, 2023 - science.org
Computing, since its inception, has been processor-centric, with memory separated from
compute. Inspired by the organic brain and optimized for inorganic silicon, NorthPole is a …

Benchmarking a new paradigm: Experimental analysis and characterization of a real processing-in-memory system

J Gómez-Luna, I El Hajj, I Fernandez… - IEEE …, 2022 - ieeexplore.ieee.org
Many modern workloads, such as neural networks, databases, and graph processing, are
fundamentally memory-bound. For such workloads, the data movement between main …

SIMDRAM: A framework for bit-serial SIMD processing using DRAM

N Hajinazar, GF Oliveira, S Gregorio… - Proceedings of the 26th …, 2021 - dl.acm.org
Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic
operations, addition). However, in order to enable full adoption of processing-using-DRAM …

Design principles for lifelong learning AI accelerators

D Kudithipudi, A Daram, AM Zyarah, FT Zohora… - Nature …, 2023 - nature.com
Lifelong learning—an agent's ability to learn throughout its lifetime—is a hallmark of
biological learning systems and a central challenge for artificial intelligence (AI). The …

CHARM: Composing Heterogeneous AcceleRators for Matrix Multiply on Versal ACAP Architecture

J Zhuang, J Lau, H Ye, Z Yang, Y Du, J Lo… - Proceedings of the …, 2023 - dl.acm.org
Dense matrix multiply (MM) serves as one of the most heavily used kernels in deep learning
applications. To cope with the high computation demands of these applications …

A Review on the emerging technology of TinyML

V Tsoukas, A Gkogkidis, E Boumpa… - ACM Computing …, 2024 - dl.acm.org
Tiny Machine Learning (TinyML) is an emerging technology proposed by the scientific
community for developing autonomous and secure devices that can gather, process, and …

BlockHammer: Preventing RowHammer at low cost by blacklisting rapidly-accessed DRAM rows

AG Yağlikçi, M Patel, JS Kim, R Azizi… - … Symposium on High …, 2021 - ieeexplore.ieee.org
Aggressive memory density scaling causes modern DRAM devices to suffer from
RowHammer, a phenomenon where rapidly activating (i.e., hammering) a DRAM row can …

DAMOV: A new methodology and benchmark suite for evaluating data movement bottlenecks

GF Oliveira, J Gómez-Luna, L Orosa, S Ghose… - IEEE …, 2021 - ieeexplore.ieee.org
Data movement between the CPU and main memory is a first-order obstacle against improving
performance, scalability, and energy efficiency in modern systems. Computer systems …