RepQ-ViT: Scale reparameterization for post-training quantization of vision transformers

Z Li, J Xiao, L Yang, Q Gu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Post-training quantization (PTQ), which only requires a tiny dataset for calibration
without end-to-end retraining, is a light and practical model compression technique …

I-ViT: Integer-only quantization for efficient vision transformer inference

Z Li, Q Gu - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com
Vision Transformers (ViTs) have achieved state-of-the-art performance on various
computer vision applications. However, these models have considerable storage and …

Unified data-free compression: Pruning and quantization without fine-tuning

S Bai, J Chen, X Shen, Y Qian… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Structured pruning and quantization are promising approaches for reducing the inference
time and memory footprint of neural networks. However, most existing methods require the …

Towards feature distribution alignment and diversity enhancement for data-free quantization

Y Gao, Z Zhang, R Hong, H Zhang… - … Conference on Data …, 2022 - ieeexplore.ieee.org
To obtain lower inference latency and less memory footprint of deep neural networks, model
quantization has been widely employed in deep model deployment, by converting the …

Data-free neural representation compression with Riemannian neural dynamics

Z Pei, A Zhang, S Wang, X Ji… - Forty-first International …, 2024 - openreview.net
Neural models are equivalent to dynamic systems from a physics-inspired view, implying
that computation on neural networks can be interpreted as the dynamical interactions …

Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare

Z Li, J Zhang, Q Gu - arXiv preprint arXiv:2410.01813, 2024 - arxiv.org
The disparity in healthcare personnel expertise and medical resources across different
regions of the world is a pressing social issue. Artificial intelligence technology offers new …

EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models

X Liu, Z Li, J Xiao, Q Gu - arXiv preprint arXiv:2401.04585, 2024 - arxiv.org
Diffusion models have achieved great success in image generation tasks through iterative
noise estimation. However, the heavy denoising process and complex neural networks …

[CITATION][C] A high-precision, lightweight quantized inference method for common activation functions, aimed at edge deployment of Transformer models

杨赟辉, 程虎, 魏敬和, 刘国柱, 桑贤侦 - Acta Electronica Sinica (电子学报), 2024