ReLU² Wins: Discovering Efficient Activation Functions for Sparse LLMs
Z Zhang, Y Song, G Yu, X Han, Y Lin, C Xiao, C Song, Z Liu, Z Mi, M Sun
arXiv preprint arXiv:2402.03804, 2024. Cited by 22.

ConPET: Continual Parameter-Efficient Tuning for Large Language Models
C Song, X Han, Z Zeng, K Li, C Chen, Z Liu, M Sun, T Yang
arXiv preprint arXiv:2309.14763, 2023. Cited by 20.

ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
C Song, X Han, Z Zhang, S Hu, X Shi, K Li, C Chen, Z Liu, G Li, T Yang, ...
arXiv preprint arXiv:2402.13516, 2024. Cited by 18.

Configurable Foundation Models: Building LLMs from a Modular Perspective
C Xiao, Z Zhang, C Song, D Jiang, F Yao, X Han, X Wang, S Wang, ...
arXiv preprint arXiv:2409.02877, 2024. Cited by 9.

Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slips
Y Chen, C Hu, C Feng, C Song, S Yu, X Han, Z Liu, M Sun
Proceedings of the 31st International Conference on Computational …, 2025.

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Y Luo, C Song, X Han, Y Chen, C Xiao, Z Liu, M Sun
arXiv preprint arXiv:2411.02335, 2024.

Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts
Y Chen, C Hu, C Feng, C Song, S Yu, X Han, Z Liu, M Sun
arXiv preprint arXiv:2409.01011, 2024.

Relation-aware deep neural network enables more efficient biomedical knowledge acquisition from massive literature
C Song, Z Zeng, C Tian, K Li, Y Yao, S Zheng, Z Liu, M Sun
AI Open 5, 104-114, 2024.