Learning from models beyond fine-tuning H Zheng, L Shen, A Tang, Y Luo, H Hu, B Du, Y Wen, D Tao Nature Machine Intelligence, 1-12, 2025 | 30* | 2025 |
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts A Tang, L Shen, Y Luo, N Yin, L Zhang, D Tao The 41st International Conference on Machine Learning (ICML), 2024 | 22 | 2024 |
Parameter efficient multi-task model fusion with partial linearization A Tang, L Shen, Y Luo, Y Zhan, H Hu, B Du, Y Chen, D Tao The 12th International Conference on Learning Representations (ICLR), 2024 | 16 | 2024 |
Concrete subspace learning based interference elimination for multi-task model fusion A Tang, L Shen, Y Luo, L Ding, H Hu, B Du, D Tao arXiv preprint arXiv:2312.06173, 2023 | 14 | 2023 |
FusionBench: A comprehensive benchmark of deep model fusion A Tang, L Shen, Y Luo, H Hu, B Du, D Tao arXiv preprint arXiv:2406.03280, 2024 | 9 | 2024 |
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion A Tang, L Shen, Y Luo, S Liu, H Hu, B Du arXiv preprint arXiv:2406.09770, 2024 | 4 | 2024 |
Efficient and effective weight-ensembling mixture of experts for multi-task model merging L Shen, A Tang, E Yang, G Guo, Y Luo, L Zhang, X Cao, B Du, D Tao arXiv preprint arXiv:2410.21804, 2024 | 2 | 2024 |
SMILE: Zero-shot sparse mixture of low-rank experts construction from pre-trained foundation models A Tang, L Shen, Y Luo, S Xie, H Hu, L Zhang, B Du, D Tao arXiv preprint arXiv:2408.10174, 2024 | 2 | 2024 |
Improving Heterogeneous Model Reuse by Density Estimation A Tang, Y Luo, H Hu, F He, K Su, B Du, Y Chen, D Tao Thirty-Second International Joint Conference on Artificial Intelligence, 2023 | 1 | 2023 |
Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace J Yang, A Tang, D Zhu, Z Chen, L Shen, F Wu The 13th International Conference on Learning Representations (ICLR), 2025 | | 2025 |
Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging A Tang, E Yang, L Shen, Y Luo, H Hu, B Du, D Tao arXiv preprint arXiv:2501.09522, 2025 | | 2025 |
Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent Y Wei, A Tang, L Shen, F Xiong, C Yuan, X Cao arXiv preprint arXiv:2501.01230, 2025 | | 2025 |
Targeted Low-rank Refinement: Enhancing Sparse Neural Networks with Precision L Shen, A Tang, X Ren, Y Luo, H Hu, X Cao | | |