Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE. H Huang, F Yang, Z Liu, Y Xu, J Li, Y Liu, X Yin, D Li, P Ren, E Barsoum. arXiv preprint arXiv:2502.06282, 2025.
Nearly Lossless Adaptive Bit Switching. H Huang, Z Liu, T Xia, P Ren. arXiv preprint arXiv:2502.01199, 2025.
Partial Channel Network: Compute Fewer, Perform Better. H Huang, T Xia, P Ren. arXiv preprint arXiv:2502.01303, 2025.
FTP: A Fine-grained Token-wise Pruner for Large Language Models via Token Routing. Z Li, J Zheng, J Liu, H Liu, H Zhu, Z Li, F Yang, H Huang, J Peng, D Li, ... arXiv preprint arXiv:2412.11494, 2024.
TAQ: Top-K Attention-Aware Quantization for Vision Transformers. L Shi, H Huang, B Song, M Tan, W Zhao, T Xia, P Ren. 2023 IEEE International Conference on Image Processing (ICIP), 1750-1754, 2023.
An Outlier Detection Method Based On Symmetry and Curvature Threshold. Q Dong, P Jiang, H Huang. Proceedings of the 2020 4th International Conference on Video and Image …, 2020.