S2FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity

X Yang, J Leng, G Guo, J Zhao, R Nakada… - arXiv preprint arXiv …, 2024 - arxiv.org
Current PEFT methods for LLMs can achieve high quality, efficient training, or
scalable serving, but not all three simultaneously. To address this limitation, we investigate …

Is Parameter Collision Hindering Continual Learning in LLMs?

S Yang, KP Ning, YY Liu, JY Yao, YH Tian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) often suffer from catastrophic forgetting when learning
multiple tasks sequentially, making continual learning (CL) essential for their dynamic …