- Academic Search

What Matters for Model Merging at Scale?

P Yadav, T Vu, J Lai, A Chronopoulou… - arXiv preprint arXiv …, 2024 - arxiv.org

Model merging aims to combine multiple expert models into a more capable single model,
offering benefits such as reduced storage and serving costs, improved generalization, and …

Cited by 6

From lists to emojis: How format bias affects model alignment

X Zhang, W Xiong, L Chen, T Zhou, H Huang… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we study format biases in reinforcement learning from human feedback
(RLHF). We observe that many widely-used preference models, including human …

Cited by 4

How to Merge Your Multimodal Models Over Time?

S Dziadzio, V Udandarao, K Roth, A Prabhu… - arXiv preprint arXiv …, 2024 - arxiv.org

Model merging combines multiple expert models (finetuned from a base foundation model
on diverse tasks and domains) into a single, more capable model. However, most existing …

Parameter-Efficient Interventions for Enhanced Model Merging

M Osial, D Marczak, B Zieliński - arXiv preprint arXiv:2412.17023, 2024 - arxiv.org

Model merging combines knowledge from task-specific models into a unified multi-task
model to avoid joint training on all task data. However, current methods face challenges due …

If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs

M Khalifa, YC Tan, A Ahmadian, T Hosking… - arXiv preprint arXiv …, 2024 - arxiv.org

Model merging has shown great promise at combining expert models, but the benefit of
merging is unclear when merging "generalist" models trained on many tasks. We explore …

Domain Adaptation for Robust Model Routing

C Dann, Y Mansour, TV Marinov, M Mohri - Adaptive Foundation Models … - openreview.net

The rapid proliferation of domain-specialized machine learning models presents a
challenge: while individual models excel in specific domains, their performance varies …

LOCMAP: Low-Compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation

openreview.net

Model merging has emerged as an effective approach to combine multiple single-task
models into a multitask model. This process typically involves computing a weighted …