Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment
With the continuous growth in the number of parameters of transformer-based pretrained
language models (PLMs), particularly the emergence of large language models (LLMs) with …
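The review's subject is easiest to grasp from a concrete instance. Below is a minimal sketch of one representative PEFT technique, a LoRA-style low-rank adapter; it assumes PyTorch, and the class name and hyperparameters are illustrative rather than taken from the review.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Freeze a pretrained linear layer and learn a low-rank update:
    y = base(x) + x A^T B^T * (alpha / r). Only A and B are trained."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay fixed
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```

Only A and B, a small fraction of the layer's parameters at typical ranks, receive gradients, which is the defining property PEFT methods share.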
Recent advances in generative AI and large language models: Current status, challenges, and perspectives
The emergence of generative artificial intelligence (AI) and large language models (LLMs)
has marked a new era of natural language processing (NLP), introducing unprecedented …
Evolutionary optimization of model merging recipes
Large language models (LLMs) have become increasingly capable, but their development
often requires substantial computational resources. Although model merging has emerged …
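As a hedged illustration of what evolving a merging recipe can mean, the toy sketch below runs a (1+λ) evolution strategy over a single interpolation coefficient between two checkpoints. The paper evolves far richer recipes (per-layer coefficients and data-flow merging), and `fitness` here is a stand-in for whatever benchmark score is being optimized.

```python
import random

def merge(theta_a, theta_b, w):
    """Per-tensor linear interpolation between two checkpoint state dicts."""
    return {k: w * theta_a[k] + (1.0 - w) * theta_b[k] for k in theta_a}

def evolve_merge_weight(theta_a, theta_b, fitness, generations=20, children=8, sigma=0.1):
    """Toy (1+lambda) evolution strategy over one merge coefficient.
    `fitness(state_dict) -> float` scores a merged model on a benchmark."""
    best_w = 0.5
    best_f = fitness(merge(theta_a, theta_b, best_w))
    for _ in range(generations):
        for _ in range(children):
            w = min(1.0, max(0.0, best_w + random.gauss(0.0, sigma)))
            f = fitness(merge(theta_a, theta_b, w))
            if f > best_f:  # keep the best mutant seen so far
                best_w, best_f = w, f
    return best_w
```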
SAM-CLIP: Merging vision foundation models towards semantic and spatial understanding
The landscape of publicly available vision foundation models (VFMs) such as CLIP and
SAM is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their …
Language models are Super Mario: Absorbing abilities from homologous models as a free lunch
In this paper, we unveil that Language Models (LMs) can acquire new capabilities by
assimilating parameters from homologous models without retraining or GPUs. We first …
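The assimilation mechanism this paper introduces is DARE (drop and rescale): sparsify each donor's parameter delta from the shared base, rescale the survivors so the expected delta is preserved, and add the result back. A minimal single-donor sketch, assuming PyTorch state dicts:

```python
import torch

def dare(base, finetuned, p=0.9):
    """DARE-style sketch: randomly drop a fraction p of the delta parameters
    (theta_ft - theta_base) and rescale the survivors by 1/(1-p), so the
    expected delta is preserved. Returns a merged state dict."""
    merged = {}
    for k in base:
        delta = finetuned[k] - base[k]
        keep = (torch.rand_like(delta) >= p).to(delta.dtype)  # Bernoulli keep mask
        merged[k] = base[k] + delta * keep / (1.0 - p)
    return merged
```

With several homologous donors, the rescaled deltas are summed onto the same base model.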
Task arithmetic in the tangent space: Improved editing of pre-trained models
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-
trained models directly in weight space: By adding the fine-tuned weights of different tasks …
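The weight-space editing described above reduces to a few lines over checkpoint state dicts. The sketch below shows plain task arithmetic; the paper's actual contribution, fine-tuning and editing in the model's tangent space via linearization, is omitted here.

```python
def task_vector(pretrained, finetuned):
    """Task vector: tau = theta_ft - theta_pre, element-wise per tensor."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}

def apply_task_vectors(pretrained, taus, lambdas):
    """Weight-space edit: theta = theta_pre + sum_i lambda_i * tau_i.
    Positive lambdas add a task's ability; negative lambdas forget it."""
    edited = dict(pretrained)
    for tau, lam in zip(taus, lambdas):
        edited = {k: edited[k] + lam * tau[k] for k in edited}
    return edited
```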
Model merging in LLMs, MLLMs, and beyond: Methods, theories, applications and opportunities
Model merging is an efficient empowerment technique in the machine learning community
that requires neither the collection of raw training data nor expensive …
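As a reference point for the methods such a survey covers, the sketch below implements the simplest data-free merge, uniform parameter averaging of homologous checkpoints, a common baseline in this literature.

```python
def average_checkpoints(state_dicts):
    """Uniform parameter averaging, the simplest data-free merge:
    theta_merged = (1/N) * sum_i theta_i. Assumes all checkpoints share
    one architecture (same keys and tensor shapes)."""
    n = len(state_dicts)
    return {k: sum(sd[k] for sd in state_dicts) / n for k in state_dicts[0]}
```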
Model stock: All we need is just a few fine-tuned models
This paper introduces an efficient fine-tuning method for large pre-trained models, offering
strong in-distribution (ID) and out-of-distribution (OOD) performance. Breaking away from …
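A simplified sketch of the two-fine-tuned-model case follows, assuming PyTorch state dicts: average the two models, then interpolate toward the pretrained anchor per layer with a ratio derived from the angle between the two fine-tuning deltas. The t = 2cosθ/(1+cosθ) form is our reading of the paper's two-model case, not a verified reproduction.

```python
import torch
import torch.nn.functional as F

def model_stock_two(w_pre, w_ft1, w_ft2):
    """Simplified Model Stock-style merge of two fine-tuned models: per layer,
    average them and interpolate toward the pretrained anchor with
    t = 2*cos(theta) / (1 + cos(theta)), theta being the angle between
    the two fine-tuning deltas."""
    merged = {}
    for k in w_pre:
        d1 = (w_ft1[k] - w_pre[k]).flatten()
        d2 = (w_ft2[k] - w_pre[k]).flatten()
        cos = F.cosine_similarity(d1, d2, dim=0).clamp(-0.9, 1.0)
        t = 2 * cos / (1 + cos)
        merged[k] = t * (w_ft1[k] + w_ft2[k]) / 2 + (1 - t) * w_pre[k]
    return merged
```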
SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters,
demonstrating superior performance in various natural language processing (NLP) tasks …
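Depth up-scaling (DUS), named in the title, is mechanically simple: duplicate the base model's layer stack, trim the overlap, concatenate, and then continue pretraining. A sketch over a list of transformer blocks, using the paper's reported n=32, m=8 configuration as the default:

```python
import copy

def depth_up_scale(layers, m=8):
    """Depth up-scaling sketch: take an n-layer stack, keep layers 0..n-m-1
    from one copy and layers m..n-1 from a duplicate, and stack them.
    For n=32, m=8 this gives the 48-layer model behind SOLAR 10.7B; the
    up-scaled model is then continually pretrained to heal the seam."""
    n = len(layers)
    top = layers[: n - m]
    bottom = copy.deepcopy(layers[m:])  # independent copy of layers m..n-1
    return top + bottom                 # 2 * (n - m) layers in total
```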
ZipIt! Merging models from different tasks without training
Typical deep visual recognition models are capable of performing the one task they were
trained on. In this paper, we tackle the extremely difficult problem of combining completely …