What Matters for Model Merging at Scale?

P Yadav, T Vu, J Lai, A Chronopoulou… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging aims to combine multiple expert models into a more capable single model,
offering benefits such as reduced storage and serving costs, improved generalization, and …
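
For context, the baseline operator this line of work builds on is uniform weight averaging of expert checkpoints. Below is a minimal sketch in Python, assuming the experts share one base architecture and are given as state dicts; the function name is illustrative, not from the paper.

import torch

def average_merge(experts):
    # Uniform average of matching parameters across expert state dicts.
    merged = {}
    for key in experts[0]:
        merged[key] = torch.stack([e[key].float() for e in experts]).mean(dim=0)
    return merged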

SMILE: Zero-shot sparse mixture of low-rank experts construction from pre-trained foundation models

A Tang, L Shen, Y Luo, S **e, H Hu, L Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org
Deep model training on extensive datasets is increasingly becoming cost-prohibitive,
prompting the widespread adoption of deep model fusion techniques to leverage knowledge …

Compress then serve: Serving thousands of LoRA adapters with little overhead

R Brüel-Gabrielsson, J Zhu, O Bhardwaj… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-tuning large language models (LLMs) with low-rank adaptations (LoRAs) has become
common practice, often yielding numerous copies of the same LLM differing only in their …
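
The setting described here rests on the LoRA parameterization: each fine-tuned copy equals the shared base weight plus a low-rank update, W' = W + (alpha/r) * B A, so only the small factors A and B differ per adapter. A minimal sketch with illustrative sizes (not taken from the paper):

import torch

d, r, alpha = 4096, 16, 32             # illustrative sizes, not from the paper
W = torch.randn(d, d)                  # frozen base weight, shared by all adapters
A = torch.randn(r, d) * 0.01           # low-rank factor A, stored per adapter
B = torch.zeros(d, r)                  # low-rank factor B, stored per adapter
W_adapted = W + (alpha / r) * (B @ A)  # effective weight for one adapter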

Efficient and effective weight-ensembling mixture of experts for multi-task model merging

L Shen, A Tang, E Yang, G Guo, Y Luo, L Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-task learning (MTL) leverages a shared model to accomplish multiple tasks and
facilitate knowledge transfer. Recent research on task arithmetic-based MTL demonstrates …
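
Task arithmetic, the technique the snippet names, forms a task vector per expert (fine-tuned minus pretrained weights) and adds their scaled sum back to the pretrained checkpoint. A minimal sketch over state dicts; the scaling coefficient lam is an assumed default, not a value from the paper:

import torch

def task_arithmetic_merge(pretrained, finetuned, lam=0.3):
    merged = {}
    for key in pretrained:
        # Task vector: how each expert moved away from the pretrained weights.
        deltas = torch.stack([ft[key] - pretrained[key] for ft in finetuned])
        merged[key] = pretrained[key] + lam * deltas.sum(dim=0)
    return merged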

SoK: On finding common ground in loss landscapes using deep model merging techniques

A Khan, T Nief, N Hudson, M Sakarvadia… - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding neural networks is crucial to creating reliable and trustworthy deep learning
models. Most contemporary research in interpretability analyzes just one model at a time via …

Merging LoRAs like playing LEGO: Pushing the modularity of LoRA to extremes through rank-wise clustering

Z Zhao, T Shen, D Zhu, Z Li, J Su, X Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Low-Rank Adaptation (LoRA) has emerged as a popular technique for fine-tuning large
language models (LLMs) to various domains due to its modular design and widespread …

SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery

E Yang, L Shen, Z Wang, G Guo, X Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging-based multitask learning (MTL) offers a promising approach for performing
MTL by merging multiple expert models without requiring access to raw training data …

Collective model intelligence requires compatible specialization

J Pari, S Jelassi, P Agrawal - arXiv preprint arXiv:2411.02207, 2024 - arxiv.org
In this work, we explore the limitations of combining models by averaging intermediate
features, referred to as model merging, and propose a new direction for achieving collective …
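
As a reference point for the setup being critiqued, feature averaging runs matching layers of two models and averages their activations at each depth. The layer-wise interface below is an assumption for illustration, not the paper's code:

def feature_average_forward(layers_a, layers_b, x):
    # Average intermediate features of two models, layer by layer.
    h = x
    for layer_a, layer_b in zip(layers_a, layers_b):
        h = 0.5 * (layer_a(h) + layer_b(h))
    return h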

How to Merge Your Multimodal Models Over Time?

S Dziadzio, V Udandarao, K Roth, A Prabhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging combines multiple expert models, each finetuned from a base foundation model
on diverse tasks and domains, into a single, more capable model. However, most existing …

Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging

A Tang, E Yang, L Shen, Y Luo, H Hu, B Du… - arXiv preprint arXiv …, 2025 - arxiv.org
Deep model merging represents an emerging research direction that combines multiple
fine-tuned models to harness their specialized capabilities across different tasks and domains …
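
One simple instantiation of sequential merging is a running average that folds the t-th arriving model into the current merge, so no earlier checkpoints need to be kept. This sketch is illustrative and not necessarily the paper's exact update rule:

def sequential_merge_step(merged, incoming, t):
    # Incremental mean: t counts the incoming model as the t-th merged model.
    return {k: merged[k] + (incoming[k] - merged[k]) / t for k in merged}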