MAERI: Enabling flexible dataflow mapping over DNN accelerators via reconfigurable interconnects

H Kwon, A Samajdar, T Krishna - ACM SIGPLAN Notices, 2018 - dl.acm.org
Deep neural networks (DNN) have demonstrated highly promising results across computer
vision and speech recognition, and are becoming foundational for ubiquitous AI. The …

ReRAM-based processing-in-memory architecture for recurrent neural network acceleration

Y Long, T Na, S Mukhopadhyay - IEEE Transactions on Very …, 2018 - ieeexplore.ieee.org
We present a recurrent neural network (RNN) accelerator design with resistive random-
access memory (ReRAM)-based processing-in-memory (PIM) architecture. Distinguished …

FlexBlock: A flexible DNN training accelerator with multi-mode block floating point support

SH Noh, J Koo, S Lee, J Park… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
When training deep neural networks (DNNs), expensive floating point arithmetic units are
used in GPUs or custom neural processing units (NPUs). To reduce the burden of floating …
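The FlexBlock entry centers on block floating point (BFP), where a block of values shares a single exponent so each element only needs a small mantissa. Below is a minimal NumPy sketch of that generic format, assuming an illustrative block size and mantissa width; the function name and parameters are assumptions, and it does not reproduce FlexBlock's multi-mode hardware scheme.

```python
import numpy as np

def to_block_floating_point(x, block_size=16, mantissa_bits=8):
    """Generic block-floating-point (BFP) quantization sketch.

    Each block of `block_size` values shares one exponent derived from the
    block's largest magnitude; individual values keep only a small signed
    mantissa. This illustrates the format, not FlexBlock's design.
    """
    x = np.asarray(x, dtype=np.float64).ravel()
    n = len(x)
    pad = (-n) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    # Shared exponent per block, chosen so the largest value fits.
    max_mag = np.max(np.abs(blocks), axis=1, keepdims=True)
    exp = np.where(max_mag > 0, np.ceil(np.log2(max_mag + 1e-300)), 0.0)

    # Quantize each value to a signed mantissa of `mantissa_bits` bits.
    scale = 2.0 ** (exp - (mantissa_bits - 1))
    mant = np.clip(np.round(blocks / scale),
                   -(2 ** (mantissa_bits - 1)),
                   2 ** (mantissa_bits - 1) - 1)

    # Dequantize to inspect the quantization error.
    return (mant * scale).ravel()[:n]

w = np.random.randn(1024)
w_bfp = to_block_floating_point(w, block_size=16, mantissa_bits=8)
print("max abs error:", np.max(np.abs(w - w_bfp)))
```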

Approximate computing for long short term memory (LSTM) neural networks

S Sen, A Raghunathan - IEEE Transactions on Computer-Aided …, 2018 - ieeexplore.ieee.org
Long Short Term Memory (LSTM) networks are a class of recurrent neural networks that are
widely used for machine learning tasks involving sequences, including machine translation …
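This entry and the next both approximate the standard LSTM cell. For reference, the widely used baseline formulation (the starting point for such approximation work, not the papers' approximations) is:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```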

Approximate LSTM computing for energy-efficient speech recognition

J Jo, J Kung, Y Lee - Electronics, 2020 - mdpi.com
This paper presents an approximate computing method of long short-term memory (LSTM)
operations for energy-efficient end-to-end speech recognition. We newly introduce the …

A power-aware digital multilayer perceptron accelerator with on-chip training based on approximate computing

D Kim, J Kung, S Mukhopadhyay - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
This paper proposes that approximating computation by reducing bit precision and using an inexact multiplier can reduce the power consumption of a digital multilayer perceptron accelerator during …
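The abstract names two approximation levers: reduced bit precision and an inexact multiplier. The NumPy sketch below illustrates only the first, quantizing weights and activations to a fixed-point grid before a layer computation; the function, bit widths, and layer shapes are hypothetical and do not reflect the paper's approximate multiplier or on-chip training scheme.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Generic reduced-precision (fixed-point) quantization sketch.

    Values are rounded to a signed fixed-point grid with `frac_bits`
    fractional bits, then clipped to the representable range.
    """
    step = 2.0 ** (-frac_bits)
    max_q = (2 ** (total_bits - 1) - 1) * step
    min_q = -(2 ** (total_bits - 1)) * step
    return np.clip(np.round(x / step) * step, min_q, max_q)

# Hypothetical MLP layer evaluated with reduced-precision operands.
rng = np.random.default_rng(0)
w = quantize_fixed_point(rng.standard_normal((256, 128)), total_bits=8, frac_bits=5)
a = quantize_fixed_point(rng.random(256), total_bits=8, frac_bits=7)
y = np.maximum(w.T @ a, 0.0)   # ReLU on the quantized layer output
```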

Adaptive weight compression for memory-efficient neural networks

JH Ko, D Kim, T Na, J Kung… - Design, Automation & …, 2017 - ieeexplore.ieee.org
Neural networks generally require significant memory capacity/bandwidth to store/access a
large number of synaptic weights. This paper presents an application of JPEG image …
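Since this entry applies JPEG-style compression to synaptic weights, the sketch below shows the core JPEG ingredient, an 8x8 two-dimensional DCT with small coefficients discarded, applied to a weight block. It is a generic illustration under that assumption, not the paper's adaptive compression pipeline, which the truncated abstract does not detail.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis, as used in JPEG's 8x8 transform."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0] *= 1.0 / np.sqrt(2.0)
    return m * np.sqrt(2.0 / n)

def compress_block(block, keep=16):
    """Keep only the `keep` largest-magnitude DCT coefficients of a square
    weight block and reconstruct; a generic sketch, not the paper's codec."""
    d = dct_matrix(block.shape[0])
    coeff = d @ block @ d.T                      # forward 2-D DCT
    thresh = np.sort(np.abs(coeff).ravel())[-keep]
    coeff[np.abs(coeff) < thresh] = 0.0          # drop small coefficients
    return d.T @ coeff @ d                       # inverse 2-D DCT

w_block = np.random.randn(8, 8)
w_approx = compress_block(w_block, keep=16)
print("reconstruction RMSE:", np.sqrt(np.mean((w_block - w_approx) ** 2)))
```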

Memory-reduced network stacking for edge-level CNN architecture with structured weight pruning

S Moon, Y Byun, J Park, S Lee… - IEEE Journal on Emerging …, 2019 - ieeexplore.ieee.org
This paper presents a novel stacking and multi-level indexing scheme for convolutional
neural networks (CNNs) used in energy-limited edge-level systems. Basically, the proposed …
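This entry combines network stacking with structured weight pruning. The sketch below illustrates only the generic structured-pruning step, removing whole convolution filters ranked by L1 norm; the weight layout, function name, and pruning ratio are assumptions, and the paper's stacking and multi-level indexing scheme is not reproduced.

```python
import numpy as np

def prune_filters_by_l1(weights, prune_ratio=0.5):
    """Generic structured-pruning sketch: drop whole convolution filters
    with the smallest L1 norms.

    `weights` is assumed to have shape (out_channels, in_channels, kh, kw).
    """
    norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(weights.shape[0] * (1.0 - prune_ratio))))
    keep = np.argsort(norms)[-n_keep:]           # surviving filter indices
    mask = np.zeros(weights.shape[0], dtype=bool)
    mask[keep] = True
    return weights[mask], np.flatnonzero(mask)   # pruned tensor + kept indices

conv_w = np.random.randn(64, 32, 3, 3)
pruned_w, kept = prune_filters_by_l1(conv_w, prune_ratio=0.5)
print(pruned_w.shape, len(kept))                 # (32, 32, 3, 3) 32
```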

Genesys: Enabling continuous learning through neural network evolution in hardware

A Samajdar, P Mannan, K Garg… - 2018 51st Annual IEEE …, 2018 - ieeexplore.ieee.org
Modern deep learning systems rely on (a) a hand-tuned neural network topology, (b)
massive amounts of labeled training data, and (c) extensive training over large-scale …

Design and analysis of a neural network inference engine based on adaptive weight compression

JH Ko, D Kim, T Na… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Neural networks generally require significant memory capacity/bandwidth to store/access a
large number of synaptic weights. This paper presents design of an energy-efficient neural …