Towards efficient in-memory computing hardware for quantized neural networks: state-of-the-art, open challenges and perspectives

O Krestinskaya, L Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The amount of data processed in the cloud, the development of Internet-of-Things (IoT)
applications, and growing data privacy concerns force the transition from cloud-based to …

Zero-centered fixed-point quantization with iterative retraining for deep convolutional neural network-based object detectors

S Kim, H Kim - IEEE Access, 2021 - ieeexplore.ieee.org
In the field of object detection, deep learning has greatly improved accuracy compared to
previous algorithms and has been used widely in recent years. However, object detection …

Survey of CPU and memory simulators in computer architecture: A comprehensive analysis including compiler integration and emerging technology applications

I Hwang, J Lee, H Kang, G Lee, H Kim - Simulation Modelling Practice and …, 2024 - Elsevier
In computer architecture studies, simulators are crucial for design verification, reducing
research and development time and ensuring the high accuracy of verification results …

Performance comparison of CNN, QNN and BNN deep neural networks for real-time object detection using ZYNQ FPGA node

VRS Mani, A Saravanaselvan, N Arumugam - Microelectronics Journal, 2022 - Elsevier
In this manuscript, previously trained Convolutional neural network (CNN), Quantum Neural
Network (QNN), and Binarized Neural Network (BNN) models performed employing Tensor …

Improving extreme low-bit quantization with soft threshold

W Xu, F Li, Y Jiang, A Yong, X He… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Deep neural networks executing with low precision at inference time can gain acceleration
and compression advantages over their high-precision counterparts, but need to overcome …

TinyPillarNet: Tiny pillar-based network for 3D point cloud object detection at edge

Y Li, Y Zhang, R Lai - … Transactions on Circuits and Systems for …, 2023 - ieeexplore.ieee.org
Limited by huge computational cost, high inference latency and large memory consumption,
existing 3D point cloud object detection methods are hard to be deployed on Internet of …

FPGA-based vehicle detection and tracking accelerator

J Zhai, B Li, S Lv, Q Zhou - Sensors, 2023 - mdpi.com
A convolutional neural network-based multiobject detection and tracking algorithm can be
applied to vehicle detection and traffic flow statistics, thus enabling smart transportation …

Real-time SSDLite object detection on FPGA

S Kim, S Na, BY Kong, J Choi… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Deep neural network (DNN)-based object detection has been investigated and applied to
various real-time applications. However, it is hard to employ the DNNs in embedded …

A 109-GOPS/W FPGA-based vision transformer accelerator with weight-loop dataflow featuring data reusing and resource saving

Y Zhang, L Feng, H Shan, Z Zhu - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
The Vision Transformer (ViT) models have demonstrated excellent performance in computer
vision tasks, but a large amount of computation and memory access for massive matrix …

Software-hardware co-design for accelerating large-scale graph convolutional network inference on FPGA

S Ran, B Zhao, X Dai, C Cheng, Y Zhang - Neurocomputing, 2023 - Elsevier
Inspired by convolutional neural networks, graph convolutional networks (GCNs) have been
proposed for processing non-Euclidean graph data and successfully been applied in …