Model merging in LLMs, MLLMs, and beyond: Methods, theories, applications and opportunities

E Yang, L Shen, G Guo, X Wang, X Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging is an efficient empowerment technique in the machine learning community
that does not require collecting raw training data, nor does it require expensive …

Deep model fusion: A survey

W Li, Y Peng, M Zhang, L Ding, H Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep model fusion/merging is an emerging technique that merges the parameters or
predictions of multiple deep learning models into a single one. It combines the abilities of …

Transformer fusion with optimal transport

M Imfeld, J Graldi, M Giordano, T Hofmann… - arXiv preprint arXiv …, 2023 - arxiv.org
Fusion is a technique for merging multiple independently-trained neural networks in order to
combine their capabilities. Past attempts have been restricted to the case of fully-connected …

Sparse model soups: A recipe for improved pruning via model averaging

M Zimmer, C Spiegel, S Pokutta - arXiv preprint arXiv:2306.16788, 2023 - arxiv.org
Neural networks can be significantly compressed by pruning, yielding sparse models that
require considerably less storage and fewer floating-point operations while maintaining …
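The averaging ingredient behind soup-style methods can be illustrated with a uniform "model soup": element-wise averaging of the parameters of several checkpoints. This is a generic sketch of parameter averaging, not the paper's sparsity-preserving recipe; the function and dictionary layout are illustrative.

```python
# Illustrative sketch: uniform model soup over checkpoints stored as
# {parameter_name: value} dictionaries. Real models would hold tensors,
# but the averaging logic is the same element-wise operation.

def model_soup(checkpoints):
    """Average the parameters of several checkpoints (uniform soup)."""
    n = len(checkpoints)
    keys = checkpoints[0].keys()
    return {k: sum(c[k] for c in checkpoints) / n for k in keys}


if __name__ == "__main__":
    soup = model_soup([{"w": 1.0, "b": 0.2}, {"w": 3.0, "b": -0.2}])
    print(soup)  # averaged parameters
```

With sparse (pruned) models, naive averaging can densify the result, which is precisely the complication sparsity-aware recipes have to address.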

Localize-and-stitch: Efficient model merging via sparse task arithmetic

Y He, Y Hu, Y Lin, T Zhang, H Zhao - arXiv preprint arXiv:2408.13656, 2024 - arxiv.org
Model merging offers an effective strategy to combine the strengths of multiple finetuned
models into a unified model that preserves the specialized capabilities of each. Existing …

Training neural networks from scratch with parallel low-rank adapters

M Huh, B Cheung, J Bernstein, P Isola… - arXiv preprint arXiv …, 2024 - arxiv.org
The scalability of deep learning models is fundamentally limited by computing resources,
memory, and communication. Although methods like low-rank adaptation (LoRA) have …
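As background, the low-rank adaptation the abstract refers to replaces a full weight update with a product of two small matrices. The sketch below shows a standard LoRA-style forward pass, not the paper's parallel-adapter training scheme; the function name, `scale` parameter, and rank choice are illustrative.

```python
import numpy as np

# Illustrative LoRA-style forward pass: the frozen weight W (d_in, d_out)
# is augmented by a low-rank update A @ B, where A is (d_in, r) and
# B is (r, d_out) with r << min(d_in, d_out). Only A and B are trained.

def lora_forward(x, W, A, B, scale=1.0):
    """Compute x @ (W + scale * A @ B) without materializing the sum."""
    return x @ W + scale * (x @ A) @ B


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 4))
    W = rng.standard_normal((4, 3))
    A = rng.standard_normal((4, 2))   # rank-2 adapter
    B = rng.standard_normal((2, 3))
    y = lora_forward(x, W, A, B)
    print(y.shape)
```

Keeping the low-rank factors separate means the adapter adds only `r * (d_in + d_out)` trainable parameters instead of `d_in * d_out`.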

Cool-Fusion: Fuse large language models without training

C Liu, X Quan, Y Pan, L Lin, W Wu, X Chen - arXiv preprint arXiv …, 2024 - arxiv.org
We focus on the problem of fusing two or more heterogeneous large language models
(LLMs) to exploit their complementary strengths. One of the challenges of model fusion is …

ATM: Improving model merging by alternating tuning and merging

L Zhou, D Solombrino, D Crisostomi… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging has recently emerged as a cost-efficient paradigm for multi-task learning.
Among current approaches, task arithmetic stands out for its simplicity and effectiveness. In …

How Good is a Single Basin?

K Lion, L Noci, T Hofmann… - … Conference on Artificial …, 2024 - proceedings.mlr.press
The multi-modal nature of neural loss landscapes is often considered to be the main driver
behind the empirical success of deep ensembles. In this work, we probe this belief by …

A Second-Order Perspective on Compositionality and Incremental Learning

A Porrello, L Bonicelli, P Buzzega, M Millunzi… - arXiv preprint arXiv …, 2024 - arxiv.org
The fine-tuning of deep pre-trained models has recently revealed compositional properties.
This enables the arbitrary composition of multiple specialized modules into a single, multi …