PD-Quant: Post-training quantization based on prediction difference metric
Post-training quantization (PTQ) is a neural network compression technique that converts a
full-precision model into a quantized model using lower-precision data types. Although it can …
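As a concrete illustration of the PTQ step the snippet describes, the sketch below applies uniform affine quantization to a single weight tensor and measures the round-trip error. The bit width, helper name, and tensor shape are illustrative assumptions, not details from the paper.

```python
# A minimal sketch of uniform affine post-training quantization for one
# weight tensor: map floats to integers in [0, 2^b - 1], then dequantize.
import torch

def quantize_tensor(w: torch.Tensor, num_bits: int = 8):
    """Quantize a float tensor to num_bits integers and return both views."""
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(qmin - w_min / scale)
    q = torch.clamp(torch.round(w / scale + zero_point), qmin, qmax)
    w_hat = (q - zero_point) * scale          # dequantized approximation
    return q.to(torch.uint8), w_hat

w = torch.randn(64, 128)                      # e.g. a linear layer's weights
q, w_hat = quantize_tensor(w, num_bits=4)
print("max abs quantization error:", (w - w_hat).abs().max().item())
```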
Adaptive data-free quantization
Data-free quantization (DFQ) recovers the performance of a quantized network (Q) without the
original data, instead generating fake samples via a generator (G) by learning from full …
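The generator-based setup the snippet refers to can be sketched roughly as follows: a generator G is trained so that its fake samples reproduce the batch-norm statistics stored in the full-precision model. The architecture, loss, and training loop below are generic assumptions about this family of methods, not the paper's exact objective.

```python
# A minimal sketch of generator-based fake-sample synthesis for data-free
# quantization: train G so that its samples match the BN running statistics
# of the full-precision model P (an illustrative stand-in, not the paper's method).
import torch
import torch.nn as nn
import torchvision.models as models

P = models.resnet18(weights=None).eval()      # stand-in full-precision model

G = nn.Sequential(                            # toy generator: noise -> image
    nn.Linear(128, 3 * 32 * 32), nn.Tanh(),
    nn.Unflatten(1, (3, 32, 32)),
    nn.Upsample(size=(224, 224)),
)
opt = torch.optim.Adam(G.parameters(), lr=1e-3)

bn_layers = [m for m in P.modules() if isinstance(m, nn.BatchNorm2d)]
feats = []                                    # per-layer batch stats of fake samples

def hook(module, inp, out):
    x = inp[0]
    feats.append((x.mean(dim=[0, 2, 3]), x.var(dim=[0, 2, 3])))

for m in bn_layers:
    m.register_forward_hook(hook)

for step in range(10):                        # a few illustrative steps
    feats.clear()
    z = torch.randn(16, 128)
    P(G(z))                                   # forward pass fills `feats`
    loss = sum((mean - bn.running_mean).pow(2).mean()
               + (var - bn.running_var).pow(2).mean()
               for (mean, var), bn in zip(feats, bn_layers))
    opt.zero_grad(); loss.backward(); opt.step()
```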
Hard sample matters a lot in zero-shot quantization
H Li, X Wu, F Lv, D Liao, TH Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Zero-shot quantization (ZSQ) is promising for compressing and accelerating deep neural
networks when the data for training full-precision models are inaccessible. In ZSQ, network …
A survey of model compression strategies for object detection
Z Lyu, T Yu, F Pan, Y Zhang, J Luo, D Zhang… - Multimedia tools and …, 2024 - Springer
Deep neural networks (DNNs) have achieved great success in many object detection tasks.
However, such DNN-based large object detection models are generally computationally …
Rethinking data-free quantization as a zero-sum game
Data-free quantization (DFQ) recovers the performance of a quantized network (Q) without
accessing the real data, instead generating fake samples via a generator (G) by learning from …
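The zero-sum framing can be summarized, in the generic adversarial data-free setup, as a min-max objective between G and Q; the notation below is an assumed formulation for illustration, not necessarily the paper's exact loss.

```latex
% Generic adversarial ("zero-sum") view of data-free quantization: G maximizes
% the disagreement between the full-precision network P and the quantized
% network Q on generated samples, while Q is trained to minimize it.
\[
  \min_{Q} \; \max_{G} \;
  \mathbb{E}_{z \sim \mathcal{N}(0, I)}
  \Big[ D\big( P(G(z)) \,\|\, Q(G(z)) \big) \Big]
\]
% D is a divergence between the two networks' output distributions, e.g. KL.
```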
Data Generation for Hardware-Friendly Post-Training Quantization
Zero-shot quantization (ZSQ) using synthetic data is a key approach for post-training
quantization (PTQ) under privacy and security constraints. However, existing data …
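One common way synthetic data enters PTQ is activation-range calibration: forward a generated batch through the full-precision model, record per-layer ranges, and derive quantization parameters. The model, layer choice, and synthetic batch below are illustrative assumptions, not details from the paper.

```python
# A minimal sketch of calibrating 8-bit activation quantizers from synthetic data.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=None).eval()
synthetic = torch.rand(32, 3, 224, 224)        # stand-in for generated samples

ranges = {}                                    # layer name -> (min, max)
def make_hook(name):
    def hook(module, inp, out):
        lo, hi = out.min().item(), out.max().item()
        old = ranges.get(name, (lo, hi))
        ranges[name] = (min(old[0], lo), max(old[1], hi))
    return hook

for name, m in model.named_modules():
    if isinstance(m, nn.ReLU):
        m.register_forward_hook(make_hook(name))

with torch.no_grad():
    model(synthetic)

# Turn observed ranges into 8-bit affine quantization parameters per layer.
for name, (lo, hi) in ranges.items():
    scale = max(hi - lo, 1e-8) / 255.0
    zero_point = round(-lo / scale)
    print(f"{name}: scale={scale:.5f}, zero_point={zero_point}")
```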
TexQ: zero-shot network quantization with texture feature distribution calibration
Quantization is an effective way to compress neural networks. By reducing the bit width of
the parameters, the processing efficiency of neural network models at edge devices can be …
Data-Free Quantization via Pseudo-label Filtering
Quantization for model compression can efficiently reduce network complexity and
storage requirements, but the original training data is necessary to remedy the performance …
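A rough sketch of the general pseudo-label filtering idea (not the paper's exact criterion): keep only synthetic samples on which the full-precision model predicts confidently, then use them to calibrate or fine-tune the quantized network. The threshold and stand-in model below are assumptions.

```python
# A minimal sketch of confidence-based filtering of synthetic samples using
# pseudo-labels from the full-precision model.
import torch
import torchvision.models as models

fp_model = models.resnet18(weights=None).eval()
synthetic = torch.rand(64, 3, 224, 224)        # stand-in generated samples

with torch.no_grad():
    probs = fp_model(synthetic).softmax(dim=1)
confidence, pseudo_labels = probs.max(dim=1)

keep = confidence > 0.8                         # assumed confidence threshold
filtered_samples = synthetic[keep]
filtered_labels = pseudo_labels[keep]
print(f"kept {keep.sum().item()} / {len(synthetic)} synthetic samples")
```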