1. YOLOv6: A single-stage object detection framework for industrial applications. C Li, L Li, H Jiang, K Weng, Y Geng, L Li, Z Ke, Q Li, M Cheng, W Nie, Y Li, et al. arXiv preprint arXiv:2209.02976, 2022. Cited by 2591.
2. Fast, accurate and lightweight super-resolution with neural architecture search. X Chu, B Zhang, H Ma, R Xu, Q Li. 2020 25th International Conference on Pattern Recognition (ICPR), 59-64, 2021. Cited by 322.
3. Norm tweaking: High-performance low-bit quantization of large language models. L Li, Q Li, B Zhang, X Chu. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 18536 …, 2024. Cited by 31.
4. AutoKWS: Keyword spotting with differentiable architecture search. B Zhang, W Li, Q Li, W Zhuang, X Chu, Y Wang. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and …, 2021. Cited by 26.
5. Scarlet-NAS: Bridging the gap between stability and scalability in weight-sharing neural architecture search. X Chu, B Zhang, Q Li, R Xu, X Li. Proceedings of the IEEE/CVF International Conference on Computer Vision, 317-325, 2021. Cited by 21.
6. FPTQ: Fine-grained post-training quantization for large language models. Q Li, Y Zhang, L Li, P Yao, B Zhang, X Chu, Y Sun, L Du, Y Xie. arXiv preprint arXiv:2308.15987, 2023. Cited by 16.
7. A speed odyssey for deployable quantization of LLMs. Q Li, R Meng, Y Li, B Zhang, L Li, Y Lu, X Chu, Y Sun, Y Xie. arXiv preprint arXiv:2311.09550, 2023. Cited by 7.
8. EAPruning: Evolutionary pruning for vision transformers and CNNs. Q Li, B Zhang, X Chu. arXiv preprint arXiv:2210.00181, 2022. Cited by 3.
9. Flash Communication: Reducing tensor parallelization bottleneck for fast large language model inference. Q Li, B Zhang, L Ye, Y Zhang, W Wu, Y Sun, L Ma, Y Xie. arXiv preprint arXiv:2412.04964, 2024.
10. Integer Scale: A free lunch for faster fine-grained quantization of LLMs. Q Li, R Meng, Y Li, B Zhang, Y Lu, Y Sun, L Ma, Y Xie. arXiv preprint arXiv:2405.14597, 2024.