Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment

L Xu, H Xie, SZJ Qin, X Tao, FL Wang - arXiv preprint arXiv:2312.12148, 2023 - arxiv.org
With the continuous growth in the number of parameters of transformer-based pretrained
language models (PLMs), particularly the emergence of large language models (LLMs) with …

Recent advances in generative AI and large language models: Current status, challenges, and perspectives

DH Hagos, R Battle, DB Rawat - IEEE Transactions on Artificial …, 2024 - ieeexplore.ieee.org
The emergence of generative artificial intelligence (AI) and large language models (LLMs)
has marked a new era of natural language processing (NLP), introducing unprecedented …

Evolutionary optimization of model merging recipes

T Akiba, M Shing, Y Tang, Q Sun, D Ha - Nature Machine Intelligence, 2025 - nature.com
Large language models (LLMs) have become increasingly capable, but their development
often requires substantial computational resources. Although model merging has emerged …
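
The title refers to searching for merging recipes with evolutionary optimization. A minimal sketch of the simplest such setting, assuming per-model mixing coefficients for a weighted parameter average are evolved against a user-supplied fitness function; the population size, mutation scale, and `evaluate` callback are illustrative assumptions, not the paper's actual recipe space.

```python
# Hedged sketch: evolutionary search over mixing coefficients for a
# weighted average of model weights (all hyperparameters illustrative).
import random


def weighted_merge(models: list, coeffs: list) -> dict:
    """Merge state dicts as a normalized weighted combination."""
    total = sum(coeffs) or 1.0
    merged = {}
    for k in models[0]:
        merged[k] = sum(c * m[k] for c, m in zip(coeffs, models)) / total
    return merged


def evolve_recipe(models: list, evaluate, generations: int = 20, pop_size: int = 8):
    """Simple mutation-and-selection search over merging coefficients.

    `evaluate(state_dict) -> float` is a user-supplied fitness function,
    e.g. validation accuracy of the merged model.
    """
    population = [[random.random() for _ in models] for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        scored = [(evaluate(weighted_merge(models, c)), c) for c in population]
        scored.sort(key=lambda x: x[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0]
        parents = [c for _, c in scored[: pop_size // 2]]
        population = parents + [
            [max(0.0, g + random.gauss(0, 0.1)) for g in random.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
    return best, best_fit
```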

SAM-CLIP: Merging vision foundation models towards semantic and spatial understanding

H Wang, PKA Vasu, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
The landscape of publicly available vision foundation models (VFMs) such as CLIP and
SAM is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their …

Language models are Super Mario: Absorbing abilities from homologous models as a free lunch

L Yu, B Yu, H Yu, F Huang, Y Li - Forty-first International Conference …, 2024 - openreview.net
In this paper, we unveil that Language Models (LMs) can acquire new capabilities by
assimilating parameters from homologous models without retraining or GPUs. We first …
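
The claim that LMs can absorb abilities by assimilating parameters without retraining is typically realized by operating on delta parameters (fine-tuned minus base weights). A hedged sketch of one such recipe, sparsifying each delta by random dropping and rescaling the survivors before adding the deltas back to the base model; the drop rate and merging weight are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch: drop-and-rescale merging of delta parameters from
# homologous fine-tuned models (hyperparameters are illustrative;
# assumes floating-point parameter tensors).
import torch


def sparsify_delta(base: dict, finetuned: dict, drop_rate: float = 0.9) -> dict:
    """Randomly drop delta parameters and rescale survivors by 1/(1 - drop_rate)."""
    delta = {}
    for k in base:
        d = finetuned[k] - base[k]
        mask = (torch.rand_like(d) > drop_rate).to(d.dtype)
        delta[k] = d * mask / (1.0 - drop_rate)
    return delta


def absorb(base: dict, finetuned_models: list, weight: float = 1.0) -> dict:
    """Add sparsified deltas from several homologous models to one base model."""
    merged = {k: v.clone() for k, v in base.items()}
    for ft in finetuned_models:
        delta = sparsify_delta(base, ft)
        for k in merged:
            merged[k] += weight * delta[k]
    return merged
```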

Task arithmetic in the tangent space: Improved editing of pre-trained models

G Ortiz-Jimenez, A Favero… - Advances in Neural …, 2024 - proceedings.neurips.cc
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-
trained models directly in weight space: By adding the fine-tuned weights of different tasks …
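
The snippet describes editing a pre-trained model in weight space by adding fine-tuned weight differences. A minimal sketch of plain task arithmetic (task vector = fine-tuned minus pre-trained weights, added back with a scaling coefficient); the state-dict handling and the coefficient value are illustrative assumptions, and the paper's tangent-space refinement is not shown.

```python
# Hedged sketch of task arithmetic in weight space (names and the
# scaling coefficient are illustrative, not taken from the paper).
import torch


def task_vector(pretrained: dict, finetuned: dict) -> dict:
    """Task vector = fine-tuned weights minus pre-trained weights."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}


def apply_task_arithmetic(pretrained: dict, task_vectors: list, alpha: float = 0.3) -> dict:
    """Edit the pre-trained model by adding a scaled sum of task vectors."""
    merged = {k: v.clone() for k, v in pretrained.items()}
    for tv in task_vectors:
        for k in merged:
            merged[k] += alpha * tv[k]
    return merged


# Usage sketch with hypothetical checkpoints:
# theta_pre = torch.load("pretrained.pt")
# tv_a = task_vector(theta_pre, torch.load("finetuned_task_a.pt"))
# tv_b = task_vector(theta_pre, torch.load("finetuned_task_b.pt"))
# theta_merged = apply_task_arithmetic(theta_pre, [tv_a, tv_b])
```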

Model merging in llms, mllms, and beyond: Methods, theories, applications and opportunities

E Yang, L Shen, G Guo, X Wang, X Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging is an efficient empowerment technique in the machine learning community
that requires neither the collection of raw training data nor expensive …
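
In its simplest form, such merging is just elementwise arithmetic on model weights, which is why no raw training data or heavy computation is needed. A minimal sketch of uniform weight averaging over homologous checkpoints; the loading paths are illustrative assumptions.

```python
# Hedged sketch: uniform averaging of state dicts from models
# fine-tuned from the same base (paths are illustrative).
import torch


def average_state_dicts(state_dicts: list) -> dict:
    """Elementwise mean of parameter tensors across checkpoints."""
    merged = {}
    for k in state_dicts[0]:
        merged[k] = torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
    return merged


# merged = average_state_dicts([torch.load(p) for p in ["a.pt", "b.pt", "c.pt"]])
```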

Model stock: All we need is just a few fine-tuned models

DH Jang, S Yun, D Han - European Conference on Computer Vision, 2024 - Springer
This paper introduces an efficient fine-tuning method for large pre-trained models, offering
strong in-distribution (ID) and out-of-distribution (OOD) performance. Breaking away from …

SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling

D Kim, C Park, S Kim, W Lee, W Song, Y Kim… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters,
demonstrating superior performance in various natural language processing (NLP) tasks …
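
The snippet itself does not describe the method, but the title names depth up-scaling. A minimal sketch of the general idea, assuming the common recipe of splicing together overlapping copies of a shallower model's layer stack before continued pretraining; the layer counts and overlap are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of depth up-scaling: build a deeper model by splicing
# together overlapping copies of a shallower layer stack
# (layer counts and overlap are illustrative).
import copy


def depth_up_scale(layers: list, overlap: int = 8) -> list:
    """Concatenate two copies of the layer stack, dropping `overlap` layers
    from the end of the first copy and the start of the second copy."""
    n = len(layers)
    first = [copy.deepcopy(layer) for layer in layers[: n - overlap]]
    second = [copy.deepcopy(layer) for layer in layers[overlap:]]
    return first + second  # depth grows from n to 2 * (n - overlap)


# e.g. a 32-layer stack with overlap=8 yields a 48-layer stack,
# which would then undergo continued pretraining.
```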

ZipIt! Merging models from different tasks without training

G Stoica, D Bolya, J Bjorner, P Ramesh… - arXiv preprint arXiv …, 2023 - arxiv.org
Typical deep visual recognition models are capable of performing the one task they were
trained on. In this paper, we tackle the extremely difficult problem of combining completely …