QDrop: randomly dropping quantization for extremely low-bit post-training quantization
X Wei, R Gong, Y Li, X Liu, F Yu
International Conference on Learning Representations (ICLR), 2022
Cited by 164

Outlier suppression: Pushing the limit of low-bit transformer language models
X Wei, Y Zhang, X Zhang, R Gong, S Zhang, Q Zhang, F Yu, X Liu
Advances in Neural Information Processing Systems 35, 17402-17414, 2022
Cited by 142

Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
X Wei, Y Zhang, Y Li, X Zhang, R Gong, J Guo, X Liu
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Cited by 94

QLLM: Accurate and efficient low-bitwidth quantization for large language models
J Liu, R Gong, X Wei, Z Dong, J Cai, B Zhuang
arXiv preprint arXiv:2310.08041, 2023
Cited by 61

Lossy and Lossless (L²) Post-training Model Size Compression
Y Shi, S Bai, X Wei, R Gong, J Yang
arXiv preprint arXiv:2308.04269, 2023
Cited by 6

Investigating Low-Rank Training in Transformer Language Models: Efficiency and Scaling Analysis
X Wei, S Moalla, R Pascanu, C Gulcehre
NGSM Workshop at ICML, 2024
Cited by 2

Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers
X Wei, S Moalla, R Pascanu, C Gulcehre
Advances in Neural Information Processing Systems (NeurIPS), 2024