DeepBurning-MixQ: An open source mixed-precision neural network accelerator design framework for FPGAs

E Luo, H Huang, C Liu, G Li, B Yang… - 2023 IEEE/ACM …, 2023 - ieeexplore.ieee.org
Mixed-precision neural networks (MPNNs) that enable the use of just enough data width for
a deep learning task promise significant advantages in both inference accuracy and …
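The snippet above is truncated, but the core idea of a mixed-precision network is assigning a different bit-width to each layer. As an illustration only (not DeepBurning-MixQ's actual flow; the layer names and widths below are hypothetical), a minimal per-layer symmetric quantizer in NumPy:

```python
import numpy as np

def quantize_symmetric(x, bits):
    """Uniform symmetric quantization of x to signed `bits`-wide integers."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

# Hypothetical per-layer widths: sensitive layers keep 8 bits,
# robust layers drop to 4 ("just enough data width" per layer).
layer_bits = {"conv1": 8, "conv2": 4, "fc": 4}
rng = np.random.default_rng(0)
weights = {name: rng.standard_normal((16, 16)) for name in layer_bits}
quantized = {name: quantize_symmetric(w, layer_bits[name])
             for name, w in weights.items()}
```

On an FPGA, narrower layers then map to proportionally smaller multipliers, which is where the resource savings come from.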

Algorithm/accelerator co-design and co-search for edge ai

X Zhang, Y Li, J Pan, D Chen - IEEE Transactions on Circuits …, 2022 - ieeexplore.ieee.org
The world has seen the great success of deep neural networks (DNNs) in a massive number
of artificial intelligence (AI) applications. However, developing high-quality AI services to …

Msd: Mixing signed digit representations for hardware-efficient dnn acceleration on fpga with heterogeneous resources

J Wu, J Zhou, Y Gao, Y Ding, N Wong… - 2023 IEEE 31st …, 2023 - ieeexplore.ieee.org
By quantizing weights with different precision for different parts of a network, mixed-precision
quantization promises to reduce the hardware cost and improve the speed of deep neural …
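A signed-digit representation rewrites a weight with digits in {-1, 0, 1}, trading adders for shifters in hardware. As a sketch of the general idea (the canonical/non-adjacent form below is a standard construction, not necessarily the exact mix MSD uses):

```python
def to_csd(n):
    """Convert an integer weight to canonical signed-digit form:
    digits in {-1, 0, 1}, LSB first, with no two adjacent non-zeros.
    Fewer non-zero digits means fewer shift-and-add terms in hardware."""
    digits = []
    while n != 0:
        if n % 2 == 0:
            d = 0
        else:
            d = 2 - (n % 4)   # 1 if n ≡ 1 (mod 4), else -1
        digits.append(d)
        n = (n - d) // 2
    return digits

# Example: 7 = 8 - 1 needs two shift-add terms in CSD
# versus three (4 + 2 + 1) in plain binary.
```

Because no two non-zero digits are adjacent, at most half the digit positions cost an adder, which is what makes the representation attractive on LUT-rich FPGA fabric.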

Uint-packing: Multiply your dnn accelerator performance via unsigned integer dsp packing

J Zhang, M Zhang, X Cao, G Li - 2023 60th ACM/IEEE Design …, 2023 - ieeexplore.ieee.org
DSP blocks are undoubtedly efficient solutions for implementing multiply-accumulate (MAC)
operations on FPGA. Since DSP resources are scarce in FPGA, the advanced solution is to …
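The arithmetic behind DSP packing is that one wide multiplier can produce two independent products if the operands are placed in non-overlapping bit fields. A minimal sketch for two unsigned 8-bit multiplications sharing one multiply (the shift of 18 is chosen to suit a DSP48-style 27×18 port; the exact packing in the paper may differ):

```python
def packed_umul(a, b, c, shift=18):
    """Compute a*c and b*c with a single wide multiplication by
    packing a and b into one operand: (a << shift + b) * c.
    Requires b * c < 2**shift so the partial products don't overlap."""
    assert 0 <= a < 256 and 0 <= b < 256 and 0 <= c < 256
    p = ((a << shift) + b) * c
    return p >> shift, p & ((1 << shift) - 1)   # (a*c, b*c)
```

For 8-bit unsigned operands, b*c ≤ 255·255 < 2¹⁸, so the low field never carries into the high field and both products can be sliced out exactly.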

A comprehensive analysis of DAC-SDC FPGA low power object detection challenge

J Zhang, G Li, M Zhang, X Cao, Y Zhang, X Li… - Science China …, 2024 - Springer
The low-power object detection challenge (LPODC) at the IEEE/ACM Design Automation
Conference is a premier contest in low-power object detection and algorithm (software) …

SDA: Low-Bit Stable Diffusion Acceleration on Edge FPGAs

G Yang, Y Xie, ZJ Xue, SE Chang, Y Li… - … Conference on Field …, 2024 - ieeexplore.ieee.org
This paper introduces SDA, the first effort to adapt the expensive stable diffusion (SD) model
for edge FPGA deployment. First, we apply quantization-aware training to quantize its …
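Quantization-aware training, mentioned in the snippet, inserts a "fake quantize" op so the loss is computed against rounded weights while gradients flow through unchanged. A minimal sketch of that forward pass (an illustration of the general technique, not SDA's specific scheme):

```python
import numpy as np

def fake_quant(x, bits=8):
    """QAT forward pass: quantize then immediately dequantize, so the
    loss sees the rounding error. In the backward pass, a straight-
    through estimator would treat this op as the identity."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale
```

The surviving error is at most half a quantization step, and training learns weights that are robust to exactly that perturbation.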

Sensitivity-aware mixed-precision quantization and width optimization of deep neural networks through cluster-based tree-structured Parzen estimation

S Azizi, M Nazemi, A Fayyazi, M Pedram - arxiv preprint arxiv:2308.06422, 2023 - arxiv.org
As the complexity and computational demands of deep learning models rise, the need for
effective optimization methods for neural network designs becomes paramount. This work …

TATAA: Programmable Mixed-Precision Transformer Acceleration with a Transformable Arithmetic Architecture

J Wu, M Song, J Zhao, Y Gao, J Li… - ACM Transactions on …, 2024 - dl.acm.org
Modern transformer-based deep neural networks present unique technical challenges for
effective acceleration in real-world applications. Apart from the vast amount of linear …

SA4: A Comprehensive Analysis and Optimization of Systolic Array Architecture for 4-bit Convolutions

G Yang, J Lei, Z Fang, J Zhang… - … Conference on Field …, 2024 - ieeexplore.ieee.org
Many studies have demonstrated that 4-bit precision quantization can maintain accuracy
levels comparable to those of floating-point deep neural networks (DNNs). Thus, it has …
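A systolic array computes a matrix product by streaming skewed operands through a grid of MAC units. The reference model below sketches an output-stationary dataflow for low-bit inputs (a generic textbook dataflow, not necessarily SA4's optimized one): element A[i, k] meets B[k, j] at PE (i, j) on cycle t = i + j + k.

```python
import numpy as np

def systolic_matmul(A, B):
    """Cycle-by-cycle sketch of an output-stationary systolic array.
    Each PE keeps a 32-bit running sum, which is what lets 4-bit
    inputs use narrow multipliers without accumulator overflow."""
    M, K = A.shape
    _, N = B.shape
    acc = np.zeros((M, N), dtype=np.int32)
    for t in range(M + N + K - 2):          # cycles incl. skew-in/out
        for i in range(M):
            for j in range(N):
                k = t - i - j               # element arriving at PE (i, j)
                if 0 <= k < K:
                    acc[i, j] += int(A[i, k]) * int(B[k, j])
    return acc
```

With 4-bit operands, each product fits in 8 bits, so the multiplier cost per PE is small; the analysis in papers like this one is about how far that per-PE saving survives routing and DSP granularity on a real FPGA.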

MCU-MixQ: A HW/SW Co-optimized Mixed-precision Neural Network Design Framework for MCUs

J Gong, C Liu, L Cheng, H Li, X Li - arxiv preprint arxiv:2407.18267, 2024 - arxiv.org
Mixed-precision neural networks (MPNNs), which use just enough data width for neural
network processing, are an effective approach to meeting the stringent resource constraints …
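On an MCU, sub-byte precision only pays off if low-bit operands are packed densely into machine words. As a hedged illustration of that storage idea (a generic SWAR-style layout, not necessarily MCU-MixQ's kernel format), packing eight 4-bit values into one 32-bit word:

```python
def pack_nibbles(vals):
    """Pack eight unsigned 4-bit values into one 32-bit word, the kind
    of sub-byte storage an MCU kernel uses to stretch scarce SRAM."""
    assert len(vals) == 8 and all(0 <= v < 16 for v in vals)
    word = 0
    for i, v in enumerate(vals):
        word |= v << (4 * i)
    return word

def unpack_nibbles(word):
    """Recover the eight 4-bit fields, LSB field first."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]
```

The 4x density over byte storage is exactly the kind of memory saving that makes mixed precision viable under MCU SRAM budgets.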