Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment

L Xu, H Xie, SZJ Qin, X Tao, FL Wang - arXiv preprint arXiv:2312.12148, 2023 - arxiv.org
With the continuous growth in the number of parameters of transformer-based pretrained
language models (PLMs), particularly the emergence of large language models (LLMs) with …

Recent advances in generative AI and large language models: Current status, challenges, and perspectives

DH Hagos, R Battle, DB Rawat - IEEE Transactions on Artificial …, 2024 - ieeexplore.ieee.org
The emergence of generative artificial intelligence (AI) and large language models (LLMs)
has marked a new era of natural language processing (NLP), introducing unprecedented …

Evolutionary optimization of model merging recipes

T Akiba, M Shing, Y Tang, Q Sun, D Ha - Nature Machine Intelligence, 2025 - nature.com
Large language models (LLMs) have become increasingly capable, but their development
often requires substantial computational resources. Although model merging has emerged …
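
The title refers to searching for merging recipes with evolutionary optimization. A minimal sketch of the simplest such setting, assuming per-model mixing coefficients for a weighted parameter average are evolved against a user-supplied fitness function; the population size, mutation scale, and `evaluate` callback are illustrative assumptions, not the paper's actual recipe space.

```python
# Hedged sketch: evolutionary search over mixing coefficients for a
# weighted average of model weights (all hyperparameters illustrative).
import random


def weighted_merge(models: list, coeffs: list) -> dict:
    """Merge state dicts as a normalized weighted combination."""
    total = sum(coeffs) or 1.0
    merged = {}
    for k in models[0]:
        merged[k] = sum(c * m[k] for c, m in zip(coeffs, models)) / total
    return merged


def evolve_recipe(models: list, evaluate, generations: int = 20, pop_size: int = 8):
    """Simple mutation-and-selection search over merging coefficients.

    `evaluate(state_dict) -> float` is a user-supplied fitness function,
    e.g. validation accuracy of the merged model.
    """
    population = [[random.random() for _ in models] for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        scored = [(evaluate(weighted_merge(models, c)), c) for c in population]
        scored.sort(key=lambda x: x[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0]
        parents = [c for _, c in scored[: pop_size // 2]]
        population = parents + [
            [max(0.0, g + random.gauss(0, 0.1)) for g in random.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
    return best, best_fit
```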

SAM-CLIP: Merging vision foundation models towards semantic and spatial understanding

H Wang, PKA Vasu, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
The landscape of publicly available vision foundation models (VFMs) such as CLIP and
SAM is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their …

Language models are Super Mario: Absorbing abilities from homologous models as a free lunch

L Yu, B Yu, H Yu, F Huang, Y Li - Forty-first International Conference …, 2024 - openreview.net
In this paper, we unveil that Language Models (LMs) can acquire new capabilities by
assimilating parameters from homologous models without retraining or GPUs. We first …
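
The claim that LMs can absorb abilities by assimilating parameters without retraining is typically realized by operating on delta parameters (fine-tuned minus base weights). A hedged sketch of one such recipe, sparsifying each delta by random dropping and rescaling the survivors before adding the deltas back to the base model; the drop rate and merging weight are illustrative assumptions, not values taken from the paper.

```python
# Hedged sketch: drop-and-rescale merging of delta parameters from
# homologous fine-tuned models (hyperparameters are illustrative;
# assumes floating-point parameter tensors).
import torch


def sparsify_delta(base: dict, finetuned: dict, drop_rate: float = 0.9) -> dict:
    """Randomly drop delta parameters and rescale survivors by 1/(1 - drop_rate)."""
    delta = {}
    for k in base:
        d = finetuned[k] - base[k]
        mask = (torch.rand_like(d) > drop_rate).to(d.dtype)
        delta[k] = d * mask / (1.0 - drop_rate)
    return delta


def absorb(base: dict, finetuned_models: list, weight: float = 1.0) -> dict:
    """Add sparsified deltas from several homologous models to one base model."""
    merged = {k: v.clone() for k, v in base.items()}
    for ft in finetuned_models:
        delta = sparsify_delta(base, ft)
        for k in merged:
            merged[k] += weight * delta[k]
    return merged
```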

Task arithmetic in the tangent space: Improved editing of pre-trained models

G Ortiz-Jimenez, A Favero… - Advances in Neural …, 2024 - proceedings.neurips.cc
Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-
trained models directly in weight space: By adding the fine-tuned weights of different tasks …
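
The snippet describes editing a pre-trained model in weight space by adding fine-tuned weight differences. A minimal sketch of plain task arithmetic (task vector = fine-tuned minus pre-trained weights, added back with a scaling coefficient); the state-dict handling and the coefficient value are illustrative assumptions, and the paper's tangent-space refinement is not shown.

```python
# Hedged sketch of task arithmetic in weight space (names and the
# scaling coefficient are illustrative, not taken from the paper).
import torch


def task_vector(pretrained: dict, finetuned: dict) -> dict:
    """Task vector = fine-tuned weights minus pre-trained weights."""
    return {k: finetuned[k] - pretrained[k] for k in pretrained}


def apply_task_arithmetic(pretrained: dict, task_vectors: list, alpha: float = 0.3) -> dict:
    """Edit the pre-trained model by adding a scaled sum of task vectors."""
    merged = {k: v.clone() for k, v in pretrained.items()}
    for tv in task_vectors:
        for k in merged:
            merged[k] += alpha * tv[k]
    return merged


# Usage sketch with hypothetical checkpoints:
# theta_pre = torch.load("pretrained.pt")
# tv_a = task_vector(theta_pre, torch.load("finetuned_task_a.pt"))
# tv_b = task_vector(theta_pre, torch.load("finetuned_task_b.pt"))
# theta_merged = apply_task_arithmetic(theta_pre, [tv_a, tv_b])
```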

Model merging in llms, mllms, and beyond: Methods, theories, applications and opportunities

E Yang, L Shen, G Guo, X Wang, X Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging is an efficient empowerment technique in the machine learning community
that requires neither the collection of raw training data nor expensive …
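
In its simplest form, such merging is just elementwise arithmetic on model weights, which is why no raw training data or heavy computation is needed. A minimal sketch of uniform weight averaging over homologous checkpoints; the loading paths are illustrative assumptions.

```python
# Hedged sketch: uniform averaging of state dicts from models
# fine-tuned from the same base (paths are illustrative).
import torch


def average_state_dicts(state_dicts: list) -> dict:
    """Elementwise mean of parameter tensors across checkpoints."""
    merged = {}
    for k in state_dicts[0]:
        merged[k] = torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
    return merged


# merged = average_state_dicts([torch.load(p) for p in ["a.pt", "b.pt", "c.pt"]])
```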

Model stock: All we need is just a few fine-tuned models

DH Jang, S Yun, D Han - European Conference on Computer Vision, 2024 - Springer
This paper introduces an efficient fine-tuning method for large pre-trained models, offering
strong in-distribution (ID) and out-of-distribution (OOD) performance. Breaking away from …

SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling

D Kim, C Park, S Kim, W Lee, W Song, Y Kim… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters,
demonstrating superior performance in various natural language processing (NLP) tasks …
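
The snippet itself does not describe the method, but the title names depth up-scaling. A minimal sketch of the general idea, assuming the common recipe of splicing together overlapping copies of a shallower model's layer stack before continued pretraining; the layer counts and overlap are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of depth up-scaling: build a deeper model by splicing
# together overlapping copies of a shallower layer stack
# (layer counts and overlap are illustrative).
import copy


def depth_up_scale(layers: list, overlap: int = 8) -> list:
    """Concatenate two copies of the layer stack, dropping `overlap` layers
    from the end of the first copy and the start of the second copy."""
    n = len(layers)
    first = [copy.deepcopy(layer) for layer in layers[: n - overlap]]
    second = [copy.deepcopy(layer) for layer in layers[overlap:]]
    return first + second  # depth grows from n to 2 * (n - overlap)


# e.g. a 32-layer stack with overlap=8 yields a 48-layer stack,
# which would then undergo continued pretraining.
```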

ZipIt! Merging models from different tasks without training

G Stoica, D Bolya, J Bjorner, P Ramesh… - arXiv preprint arXiv …, 2023 - arxiv.org
Typical deep visual recognition models are capable of performing the one task they were
trained on. In this paper, we tackle the extremely difficult problem of combining completely …