PD-Quant: Post-training quantization based on prediction difference metric

J Liu, L Niu, Z Yuan, D Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Post-training quantization (PTQ) is a neural network compression technique that converts a
full-precision model into a quantized model using lower-precision data types. Although it can …
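
To make the snippet's idea concrete, here is a toy sketch of scoring quantization parameters by prediction difference. Assumptions: a one-layer linear "model", KL divergence between softmax outputs as the prediction-difference score, and a simple symmetric clipping search; this is a generic illustration, not the paper's exact procedure.

```python
import numpy as np

def fake_quantize(w, clip, bits=8):
    # Uniform quantization of w clipped to [-clip, clip], then dequantized.
    levels = 2 ** bits - 1
    scale = 2 * clip / levels
    q = np.clip(np.round((np.clip(w, -clip, clip) + clip) / scale), 0, levels)
    return q * scale - clip

def prediction_difference(fp_logits, q_logits):
    # KL(p_fp || p_q) between softmax predictions, averaged over the batch.
    def softmax(x):
        e = np.exp(x - x.max(-1, keepdims=True))
        return e / e.sum(-1, keepdims=True)
    p, q = softmax(fp_logits), softmax(q_logits)
    return (p * (np.log(p + 1e-9) - np.log(q + 1e-9))).sum(-1).mean()

rng = np.random.default_rng(0)
x, W = rng.standard_normal((32, 16)), rng.standard_normal((16, 10))
fp = x @ W
# Pick the clipping range whose quantized predictions stay closest to FP.
score, clip = min((prediction_difference(fp, x @ fake_quantize(W, c)), c)
                  for c in np.linspace(0.5, 3.0, 11))
print(f"chosen clip ±{clip:.2f}, PD score {score:.4f}")
```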

Adaptive data-free quantization

B Qian, Y Wang, R Hong… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Data-free quantization (DFQ) recovers the performance of a quantized network (Q) without the
original data, but generates fake samples via a generator (G) by learning from the full …
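
The generator-plus-quantized-network setup the snippet mentions can be sketched generically as below. Assumptions: a tiny frozen full-precision net P, BatchNorm-statistics matching as G's training signal, and hypothetical shapes; real DFQ methods, including this one, use richer objectives.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Frozen full-precision net P with a BatchNorm layer (hypothetical tiny stand-in).
P = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU()).eval()
for p in P.parameters():
    p.requires_grad_(False)
bn = P[1]  # its running statistics act as a proxy for the missing real data

# Generator G maps noise vectors to fake samples.
G = nn.Sequential(nn.Linear(64, 3 * 8 * 8), nn.Tanh())
opt = torch.optim.Adam(G.parameters(), lr=1e-3)

for step in range(100):
    z = torch.randn(16, 64)
    fake = G(z).view(16, 3, 8, 8)
    feats = P[0](fake)                     # pre-BN features of the fake batch
    mu = feats.mean(dim=(0, 2, 3))
    var = feats.var(dim=(0, 2, 3), unbiased=False)
    # Match batch statistics to BN running statistics (a common DFQ objective).
    loss = ((mu - bn.running_mean) ** 2).mean() + ((var - bn.running_var) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# G's fake samples can then calibrate or fine-tune the quantized network Q.
```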

Hard sample matters a lot in zero-shot quantization

H Li, X Wu, F Lv, D Liao, TH Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Zero-shot quantization (ZSQ) is promising for compressing and accelerating deep neural
networks when the data for training full-precision models are inaccessible. In ZSQ, network …

A survey of model compression strategies for object detection

Z Lyu, T Yu, F Pan, Y Zhang, J Luo, D Zhang… - Multimedia tools and …, 2024 - Springer
Deep neural networks (DNNs) have achieved great success in many object detection tasks.
However, such DNN-based large object detection models are generally computationally …

Rethinking data-free quantization as a zero-sum game

B Qian, Y Wang, R Hong, M Wang - … of the AAAI conference on artificial …, 2023 - ojs.aaai.org
Data-free quantization (DFQ) recovers the performance of a quantized network (Q) without
accessing the real data, but generates fake samples via a generator (G) by learning from …

Zero-shot sharpness-aware quantization for pre-trained language models

M Zhu, Q Zhong, L Shen, L Ding, J Liu, B Du… - arxiv preprint, 2023 - arxiv.org
Zero-shot quantization (ZSQ) is promising for developing lightweight deep neural
networks when data is inaccessible owing to various reasons, including cost and privacy issues …

Data Generation for Hardware-Friendly Post-Training Quantization

L Dikstein, A Lapid, A Netzer, HV Habi - arxiv preprint arxiv:2410.22110, 2024 - arxiv.org
Zero-shot quantization (ZSQ) using synthetic data is a key approach for post-training
quantization (PTQ) under privacy and security constraints. However, existing data …
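
A common way to obtain such synthetic calibration data, shown here only as a generic sketch rather than this paper's scheme, is to optimize noise images directly so that their feature statistics match a frozen model's BatchNorm running statistics:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical frozen model; in practice this is the pre-trained FP network.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU()).eval()
conv, bn = model[0], model[1]

images = torch.randn(8, 3, 16, 16, requires_grad=True)   # start from pure noise
opt = torch.optim.Adam([images], lr=0.05)

for step in range(200):
    feats = conv(images)
    mu = feats.mean(dim=(0, 2, 3))
    var = feats.var(dim=(0, 2, 3), unbiased=False)
    # Drive the batch statistics toward the BN layer's running statistics.
    loss = ((mu - bn.running_mean) ** 2).sum() + ((var - bn.running_var) ** 2).sum()
    opt.zero_grad(); loss.backward(); opt.step()

# `images` now serve as synthetic calibration data for post-training quantization.
```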

TexQ: zero-shot network quantization with texture feature distribution calibration

X Chen, Y Wang, R Yan, Y Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Quantization is an effective way to compress neural networks. By reducing the bit width of
the parameters, the processing efficiency of neural network models on edge devices can be …

Data-Free Quantization via Pseudo-label Filtering

C Fan, Z Wang, D Guo, M Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Quantization for model compression can efficiently reduce the network complexity and
storage requirements, but the original training data is necessary to remedy the performance …
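
The pseudo-label idea can be illustrated with a generic confidence filter: keep only synthetic samples on which the full-precision model predicts confidently, and use its predictions as labels. The threshold tau and the max-probability criterion are assumptions for this sketch, not the paper's filtering rule.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def filter_by_pseudo_label(samples, fp_logits, tau=0.9):
    # Keep synthetic samples whose full-precision prediction is confident;
    # the retained (sample, pseudo-label) pairs feed quantization calibration.
    probs = softmax(fp_logits)
    conf, labels = probs.max(-1), probs.argmax(-1)
    keep = conf >= tau
    return samples[keep], labels[keep]

rng = np.random.default_rng(1)
samples = rng.standard_normal((100, 3072))        # stand-in synthetic images
fp_logits = rng.standard_normal((100, 10)) * 4    # stand-in FP model outputs
kept, pseudo = filter_by_pseudo_label(samples, fp_logits)
print(f"kept {len(kept)}/100 samples for calibration")
```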