PQA: Exploring the potential of product quantization in DNN hardware acceleration
A Abouelhamayed, A Cui… - ACM Transactions on …, 2024 - dl.acm.org
Conventional multiply-accumulate (MAC) operations have long dominated computation time
for deep neural networks (DNNs), especially convolutional neural networks (CNNs) …
Power-aware training for energy-efficient printed neuromorphic circuits
There is an increasing demand for next-generation flexible electronics in emerging low-cost
applications such as smart packaging and smart bandages, where conventional silicon …
Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication
From classical HPC to deep learning, MatMul is at the heart of today's computing. The recent
Maddness method approximates MatMul without the need for multiplication by using a hash …
Multiplication-Free Lookup-Based CNN Accelerator using Residual Vector Quantization and Its FPGA Implementation
In this paper, a table lookup-based computing technique is proposed to perform
convolutional neural network (CNN) inference without multiplication, and its FPGA …
Full-stack optimization for CAM-only DNN inference
The accuracy of neural networks has greatly improved across various domains over the past
years. Their ever-increasing complexity, however, leads to prohibitively high energy …
LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator
The emergence of neural network capabilities invariably leads to a significant surge in
computational demands due to expanding model sizes and increased computational …
LUTIN: Efficient Neural Network Inference with Table Lookup
DNN models are becoming increasingly large and complex, but they are also being
deployed on commodity devices that require low power and latency but lack specialized …