Make RepVGG greater again: A quantization-aware approach

X Chu, L Li, B Zhang - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
The tradeoff between performance and inference speed is critical for practical applications.
Architecture reparameterization obtains better tradeoffs and it is becoming an increasingly …

EQ-Net: Elastic quantization neural networks

K Xu, L Han, Y Tian, S Yang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Current model quantization methods have shown their promising capability in reducing
storage space and computation complexity. However, due to the diversity of quantization …

Fast search of face recognition model for a mobile device based on neural architecture comparator

AV Savchenko, LV Savchenko, I Makarov - IEEE Access, 2023 - ieeexplore.ieee.org
This paper addresses the face recognition task for offline mobile applications. Using AutoML
techniques, a novel technological framework is proposed to develop a fast neural network …

TexQ: zero-shot network quantization with texture feature distribution calibration

X Chen, Y Wang, R Yan, Y Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Quantization is an effective way to compress neural networks. By reducing the bit width of
the parameters, the processing efficiency of neural network models at edge devices can be …

Design automation for fast, lightweight, and effective deep learning models: A survey

D Zhang, K Chen, Y Zhao, B Yang, L Yao… - arXiv preprint arXiv …, 2022 - arxiv.org
Deep learning technologies have demonstrated remarkable effectiveness in a wide range of
tasks, and deep learning holds the potential to advance a multitude of applications …

Bitwidth-adaptive quantization-aware neural network training: A meta-learning approach

J Youn, J Song, HS Kim, S Bahk - European Conference on Computer …, 2022 - Springer
Deep neural network quantization with adaptive bitwidths has gained increasing attention
due to the ease of model deployment on various platforms with different resource budgets. In …

Mix-GEMM: An efficient HW-SW architecture for mixed-precision quantized deep neural networks inference on edge devices

E Reggiani, A Pappalardo, M Doblas… - … Symposium on High …, 2023 - ieeexplore.ieee.org
Deep Neural Network (DNN) inference based on quantized narrow-precision integer data
represents a promising research direction toward efficient deep learning computations on …

MixPath: A unified approach for one-shot neural architecture search

X Chu, S Lu, X Li, B Zhang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Blending multiple convolutional kernels is proved advantageous in neural architecture
design. However, current two-stage neural architecture search methods are mainly limited to …

QuantNAS: Quantization-aware Neural Architecture Search For Efficient Deployment On Mobile Device

T Gao, L Guo, S Zhao, P Xu, Y Yang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Deep convolutional networks are increasingly applied in mobile AI scenarios. To achieve
efficient deployment, researchers combine neural architecture search (NAS) and …

SpaceEvo: Hardware-friendly search space design for efficient INT8 inference

X Wang, LL Zhang, J Xu, Q Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
The combination of Neural Architecture Search (NAS) and quantization has proven
successful in automatically designing low-FLOPs INT8 quantized neural networks (QNN) …