Q-diffusion: Quantizing diffusion models
Diffusion models have achieved great success in image synthesis through iterative noise
estimation using deep neural networks. However, the slow inference, high memory …
OliVe: Accelerating large language models via hardware-friendly outlier-victim pair quantization
Transformer-based large language models (LLMs) have achieved great success with the
growing model size. LLMs' size grows by 240× every two years, which outpaces the …
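The snippet cuts off before the method itself, so below is a minimal numpy sketch of the outlier-victim pair idea the title names. The function name, pair layout, and scale choice are illustrative assumptions; the real OliVe re-encodes the outlier in a dedicated hardware number format rather than keeping it at full precision.

```python
import numpy as np

def olive_style_quantize(x, bits=4):
    """Toy sketch of outlier-victim pair quantization. Values that fit
    the low-bit grid quantize normally; when one element of an adjacent
    pair overflows the grid (an "outlier"), its neighbour (the "victim")
    is pruned to zero so the outlier can keep its magnitude."""
    qmax = 2 ** (bits - 1) - 1
    scale = 2 * np.abs(x).mean() / qmax         # crude step size
    q = np.round(x / scale)
    out = np.clip(q, -qmax, qmax) * scale       # normal-path result
    for i in range(0, len(x) - 1, 2):           # walk element pairs
        pair = q[i:i + 2]
        overflow = np.abs(pair) > qmax
        if overflow.any() and not overflow.all():
            keep = int(np.argmax(np.abs(pair)))
            out[i + keep] = pair[keep] * scale  # outlier kept unclipped here
            out[i + (1 - keep)] = 0.0           # victim pruned to zero
    return out

x = np.array([0.1, -0.3, 4.2, 0.05, 0.2, 0.15])  # 4.2 is the outlier
print(olive_style_quantize(x))
```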
Outlier suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
Post-training quantization (PTQ) of transformer language models faces significant
challenges due to the existence of detrimental outliers in activations. We observe that these …
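Since the title names the mechanism (equivalent shifting and scaling), here is a hedged numpy sketch of the idea: shift and scale the activation channels, fold the inverse into the following linear layer so the output is mathematically unchanged, then quantize the better-conditioned activation. The choices of z and s below (channel midpoints and max-abs) are assumptions; the paper derives them differently.

```python
import numpy as np

def shift_scale_then_quantize(x, w, b, bits=8):
    """Sketch of equivalent shifting/scaling before PTQ.
    x: (tokens, channels) activation, w: (channels, out), b: (out,).
    The transform is folded into w and b, so the output is preserved
    exactly up to quantization error."""
    z = (x.max(0) + x.min(0)) / 2          # per-channel shift: centre channels
    x_shifted = x - z
    s = np.maximum(np.abs(x_shifted).max(0), 1e-8)
    x_t = x_shifted / s                    # per-channel scale: squeeze outliers
    w_t = w * s[:, None]                   # fold scale into weights
    b_t = b + z @ w                        # fold shift into bias
    qmax = 2 ** (bits - 1) - 1
    step = np.abs(x_t).max() / qmax
    x_q = np.round(x_t / step) * step      # quantize the tamed activation
    return x_q @ w_t + b_t

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8)); x[:, 3] += 20.0   # one outlier channel
w = rng.normal(size=(8, 4)); b = np.zeros(4)
print(np.abs((x @ w + b) - shift_scale_then_quantize(x, w, b)).max())
```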
Compressing large language models by joint sparsification and quantization
In this paper, we introduce a novel model compression technique named Joint Sparsification
and Quantization (JSQ), explicitly tailored for large language models (LLMs). Traditional …
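A hedged toy of why sparsification and quantization interact, which this entry's abstract alludes to: pruning low-magnitude weights first shrinks the range the quantization grid must cover. JSQ itself co-optimizes the two; this sequential version is only illustrative.

```python
import numpy as np

def sparsify_then_quantize(w, sparsity=0.5, bits=4):
    """Prune the lowest-magnitude fraction of weights, then quantize the
    survivors on a grid fitted to the surviving range only."""
    mask = np.abs(w) >= np.quantile(np.abs(w), sparsity)
    qmax = 2 ** (bits - 1) - 1
    step = np.abs(w[mask]).max() / qmax
    return np.round(w / step).clip(-qmax, qmax) * step * mask

w = np.random.default_rng(0).normal(size=(64, 64))
w_c = sparsify_then_quantize(w)
print((w_c == 0).mean(), np.abs(w - w_c).mean())
```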
NoisyQuant: Noisy bias-enhanced post-training activation quantization for vision transformers
The complicated architecture and high training cost of vision transformers urge the
exploration of post-training quantization. However, the heavy-tailed distribution of vision …
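A hedged sketch of the noisy-bias idea the title names: a fixed noise vector, sampled once and shared across inputs, is added before uniform quantization and subtracted afterwards, dithering heavy-tailed activations so the error spreads out instead of concentrating on the tails. NoisyQuant's actual bias construction and compensation differ; everything below is a simplification.

```python
import numpy as np

def noisy_bias_quantize(x, bits=6, seed=0):
    """Add a FIXED noisy bias before uniform quantization and remove the
    known bias afterwards (a dithering-style simplification)."""
    qmax = 2 ** (bits - 1) - 1
    step = np.abs(x).max() / qmax
    rng = np.random.default_rng(seed)          # fixed seed -> fixed bias
    n = rng.uniform(-step / 2, step / 2, size=x.shape[-1])
    q = np.clip(np.round((x + n) / step), -qmax, qmax) * step
    return q - n

x = np.random.default_rng(1).standard_t(df=3, size=(4, 16))  # heavy tails
print(np.abs(x - noisy_bias_quantize(x)).mean())
```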
IntraQ: Learning synthetic images with intra-class heterogeneity for zero-shot network quantization
Learning to synthesize data has emerged as a promising direction in zero-shot quantization
(ZSQ), which represents neural networks by low-bit integer without accessing any of the real …
ANT: Exploiting adaptive numerical data type for low-bit deep neural network quantization
Quantization is a technique to reduce the computation and memory cost of DNN models,
which are getting increasingly large. Existing quantization solutions use fixed-point integer …
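A hedged toy of the adaptive-numerical-data-type idea: per tensor, choose whichever 4-bit grid (uniformly spaced int-like vs exponent-spaced float-like) gives lower error for that tensor's distribution. ANT's real encoding and selection rule differ; the grids and names below are assumptions.

```python
import numpy as np

def snap_to_grid(x, grid):
    """Snap each value to the nearest point of a grid scaled to the
    tensor's maximum magnitude."""
    g = grid * np.abs(x).max()
    return g[np.abs(x[..., None] - g).argmin(-1)]

def adaptive_type_quantize(x):
    """Pick the 4-bit grid with the lower MSE for this tensor."""
    int_grid = np.arange(-7, 8) / 7.0                       # 15 even levels
    flt_grid = np.concatenate(([0.0], 2.0 ** -np.arange(7.0),
                               -(2.0 ** -np.arange(7.0))))  # dense near zero
    cands = {"int-like": snap_to_grid(x, int_grid),
             "float-like": snap_to_grid(x, flt_grid)}
    name = min(cands, key=lambda k: np.mean((x - cands[k]) ** 2))
    return cands[name], name

rng = np.random.default_rng(0)
for t in (rng.uniform(-1, 1, 512), rng.laplace(size=512) ** 3):
    print(adaptive_type_quantize(t)[1])   # heavy tails tend to pick float-like
```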
Hard sample matters a lot in zero-shot quantization
H Li, X Wu, F Lv, D Liao, TH Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Zero-shot quantization (ZSQ) is promising for compressing and accelerating deep neural
networks when the data for training full-precision models are inaccessible. In ZSQ, network …
DVABatch: Diversity-aware Multi-Entry Multi-Exit batching for efficient processing of DNN services on GPUs
DNN inferences are often batched to better utilize the hardware in existing DNN
serving systems. However, DNN serving exhibits diversity in many aspects, such as input …
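The snippet truncates before the mechanism, so here is a loose Python toy of multi-exit batching only: the batch shrinks as requests reach their exit depth, so deep requests stop paying for shallow ones. All names here are hypothetical, and the real system also handles multiple entries and GPU-level scheduling.

```python
from dataclasses import dataclass

@dataclass
class Request:
    rid: int
    exit_stage: int  # number of model stages this request needs

def run_batch(requests, num_stages=4):
    """Everyone enters at stage 0; each request leaves the batch once
    its exit stage has run."""
    active = list(requests)
    for stage in range(num_stages):
        if not active:
            break
        print(f"stage {stage}: batch size {len(active)}")
        # ... run stage `stage` of the model on `active` as one batch ...
        active = [r for r in active if r.exit_stage > stage + 1]

run_batch([Request(0, 1), Request(1, 1), Request(2, 4)])
```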
Outlier suppression+: Accurate quantization of large language models by equivalent and effective shifting and scaling
X Wei, Y Zhang, Y Li, X Zhang, R Gong… - Proceedings of the …, 2023 - aclanthology.org
Post-training quantization (PTQ) of transformer language models faces significant
challenges due to the existence of detrimental outliers in activations. We observe that these …