An FPGA-based transformer accelerator using output block stationary dataflow for object recognition applications
The transformer-based model has great potential to deliver higher accuracy for object
recognition applications when comparing it with the convolution neural network (CNN). Yet …
An Energy-Efficient Edge Processor for Radar-Based Continuous Fall Detection Utilizing Mixed-Radix FFT and Updated Block-Wise Computation
J Chen, K Lin, L Yang, W Ye - IEEE Internet of Things Journal, 2024 - ieeexplore.ieee.org
In the scenarios of the Internet of Things, fall detection holds increasing significance in the
health monitoring of elderly individuals. While most current research has achieved …
Hardware-friendly logarithmic quantization with mixed-precision for MobileNetV2
In a variety of computer vision applications, convolutional neural networks (CNNs) have
achieved excellent accuracy. However, in order for a CNN to operate on embedded …
FxP-QNet: a post-training quantizer for the design of mixed low-precision DNNs with dynamic fixed-point representation
Deep neural networks (DNNs) have demonstrated their effectiveness in a wide range of
computer vision tasks, with the state-of-the-art results obtained through complex and deep …
DoubleQExt: Hardware and memory efficient CNN through two levels of quantization
To fulfil the tight area and memory constraints in IoT applications, the design of efficient
Convolutional Neural Network (CNN) hardware becomes crucial. Quantization of CNN is …
Energy-efficient high-speed ASIC implementation of convolutional neural network using novel reduced critical-path design
Convolutional Neural Network (CNN) plays an important role in several machine learning
tasks related to speech, image, and video processing applications. The increasing demand …
CNN Accelerator Using Proposed Diagonal Cyclic Array for Minimizing Memory Accesses.
HW Son, AA Al-Hamid, YS Na, DY Lee… - Computers, Materials & …, 2023 - researchgate.net
This paper presents the architecture of a Convolution Neural Network (CNN) accelerator
based on a new processing element (PE) array called a diagonal cyclic array (DCA). As …
Design and implementation of an efficient CNN accelerator for low-cost FPGAs
Y Xu, S Wang, N Li, H **ao - IEICE Electronics Express, 2022 - jstage.jst.go.jp
This paper proposes a computation-array-centered dataflow, which adjusts the convolution
with different kernel sizes to a unified computing manner and reduces the dimension of …
FLQ: Design and implementation of hybrid multi-base full logarithmic quantization neural network acceleration architecture based on FPGA
L Zhang, X Hu, X Liao, T Zhou, Y Peng - Signal Processing: Image …, 2025 - Elsevier
As deep neural network (DNN) models become more accurate, problems such as large
model parameters and high computational complexity have become increasingly prominent …
ASLog: An Area-Efficient CNN Accelerator for Per-Channel Logarithmic Post-Training Quantization
Post-training quantization (PTQ) has been proven an efficient model compression technique
for Convolution Neural Networks (CNNs), without re-training or access to labeled datasets …