Colonnade: A reconfigurable SRAM-based digital bit-serial compute-in-memory macro for processing neural networks

H Kim, T Yoo, TTH Kim, B Kim - IEEE Journal of Solid-State …, 2021 - ieeexplore.ieee.org
This article (Colonnade) presents a fully digital bit-serial compute-in-memory (CIM) macro.
The digital CIM macro is designed for processing neural networks with reconfigurable 1-16 …

HASCO: Towards agile hardware and software co-design for tensor computation

Q Xiao, S Zheng, B Wu, P Xu, X Qian… - 2021 ACM/IEEE 48th …, 2021 - ieeexplore.ieee.org
Tensor computations overwhelm traditional general-purpose computing devices due to the
large amounts of data and operations of the computations. They call for a holistic solution …

Reconfigurability, why it matters in AI tasks processing: A survey of reconfigurable AI chips

S Wei, X Lin, F Tu, Y Wang, L Liu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Nowadays, artificial intelligence (AI) technologies, especially deep neural networks (DNNs),
play a vital role in solving many problems in both academia and industry. In order to …

FedQNN: A computation–communication-efficient federated learning framework for IoT with low-bitwidth neural network quantization

Y Ji, L Chen - IEEE Internet of Things Journal, 2022 - ieeexplore.ieee.org
Federated learning (FL) allows participants to train deep learning models collaboratively
without disclosing their data to the server or any other participants, providing excellent value …

SRAM-based in-memory computing macro featuring voltage-mode accumulator and row-by-row ADC for processing neural networks

J Mu, H Kim, B Kim - … Transactions on Circuits and Systems I …, 2022 - ieeexplore.ieee.org
This paper presents a mixed-signal SRAM-based in-memory computing (IMC) macro for
processing binarized neural networks. The IMC macro consists of (16K) SRAM-based …

Always-on 674 μW @ 4 GOP/s error resilient binary neural networks with aggressive SRAM voltage scaling on a 22-nm IoT end-node

A Di Mauro, F Conti, PD Schiavone… - … on Circuits and …, 2020 - ieeexplore.ieee.org
Binary Neural Networks (BNNs) have been shown to be robust to random bit-level noise,
making aggressive voltage scaling attractive as a power-saving technique for both logic and …

High-performance spintronic nonvolatile ternary flip-flop and universal shift register

A Amirany, K Jafari, MH Moaiyeri - IEEE Transactions on Very …, 2021 - ieeexplore.ieee.org
Multiple-valued logic (MVL) shows considerable advantages over binary logic in certain
applications because of the increased informational content of its signals, and hence …

WRA: A 2.2-to-6.3 TOPS highly unified dynamically reconfigurable accelerator using a novel Winograd decomposition algorithm for convolutional neural networks

C Yang, Y Wang, X Wang… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
As convolutional neural networks (CNNs) become more and more diverse and complicated,
acceleration of CNNs increasingly encounters a bottleneck of balancing performance …

A 12.1 TOPS/W quantized network acceleration processor with effective-weight-based convolution and error-compensation-based prediction

H Mo, W Zhu, W Hu, Q Li, A Li, S Yin… - IEEE Journal of Solid …, 2021 - ieeexplore.ieee.org
In this article, a quantized network acceleration processor (QNAP) is proposed to efficiently
accelerate CNN processing by eliminating most unessential operations based on algorithm …

TIMAQ: A time-domain computing-in-memory-based processor using predictable decomposed convolution for arbitrary quantized DNNs

J Yang, Y Kong, Z Zhang, Z Liu, J Zhou… - IEEE Journal of Solid …, 2021 - ieeexplore.ieee.org
Energy-efficient processors are crucial for accelerating deep neural networks (DNNs) on
edge devices with limited battery capacity. To reduce energy consumption, time-domain …