Model merging in LLMs, MLLMs, and beyond: Methods, theories, applications and opportunities
Model merging is an efficient capability-enhancement technique in the machine learning community
that requires neither the collection of raw training data nor expensive …
Deep model fusion: A survey
Deep model fusion/merging is an emerging technique that merges the parameters or
predictions of multiple deep learning models into a single one. It combines the abilities of …
Transformer fusion with optimal transport
Fusion is a technique for merging multiple independently-trained neural networks in order to
combine their capabilities. Past attempts have been restricted to the case of fully-connected …
Sparse model soups: A recipe for improved pruning via model averaging
Neural networks can be significantly compressed by pruning, yielding sparse models that
require considerably less storage and fewer floating-point operations while maintaining …
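The "model soup" recipe referenced above boils down to element-wise parameter averaging across fine-tuned checkpoints. A minimal sketch of uniform averaging, assuming checkpoints are plain name-to-array dicts (the function name `average_soup` is illustrative, not from the paper):

```python
import numpy as np

def average_soup(state_dicts):
    """Uniform 'model soup': average corresponding parameters
    across several fine-tuned checkpoints (dicts of name -> array)."""
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

# Toy example: two checkpoints of the same tiny model.
ckpt_a = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
ckpt_b = {"w": np.array([3.0, 4.0]), "b": np.array([2.0])}
soup = average_soup([ckpt_a, ckpt_b])
```

The sparse variant in the paper additionally prunes before averaging; the averaging step itself is unchanged.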
Training neural networks from scratch with parallel low-rank adapters
The scalability of deep learning models is fundamentally limited by computing resources,
memory, and communication. Although methods like low-rank adaptation (LoRA) have …
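LoRA, mentioned in the snippet above, keeps a pretrained weight `W` frozen and trains only a low-rank update `B @ A`. A minimal sketch under those assumptions (dimensions, initialization scale, and the helper name `lora_forward` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                             # full width d, low rank r << d
W = rng.standard_normal((d, d))         # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01  # trainable down-projection
B = np.zeros((d, r))                    # trainable up-projection, zero-init

def lora_forward(x):
    # Base path plus low-rank update; only A and B would receive gradients.
    return x @ W.T + x @ (B @ A).T

x = rng.standard_normal((4, d))
y = lora_forward(x)
```

With `B` initialized to zero, the adapted model starts exactly at the pretrained function, which is the usual LoRA convention.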
Localize-and-Stitch: Efficient model merging via sparse task arithmetic
Model merging offers an effective strategy to combine the strengths of multiple finetuned
models into a unified model that preserves the specialized capabilities of each. Existing …
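Sparse task arithmetic, as named in the title above, builds on task vectors: the difference between fine-tuned and base weights, optionally masked so only a sparse subset of parameters is stitched into the merged model. A hedged sketch of the general idea (not the paper's exact localization procedure; `sparse_merge` is an illustrative name):

```python
import numpy as np

def task_vector(base, finetuned):
    """Task vector = fine-tuned weights minus base weights."""
    return {k: finetuned[k] - base[k] for k in base}

def sparse_merge(base, task_vectors, masks):
    """Add masked (sparsified) task vectors onto the base weights."""
    merged = {k: v.copy() for k, v in base.items()}
    for tv, mask in zip(task_vectors, masks):
        for k in merged:
            merged[k] += mask[k] * tv[k]
    return merged

base = {"w": np.zeros(3)}
tv1 = task_vector(base, {"w": np.array([1.0, 0.0, 2.0])})
tv2 = task_vector(base, {"w": np.array([0.0, 3.0, 0.0])})
masks = [{"w": np.array([1.0, 0.0, 1.0])},   # keep task 1's localized entries
         {"w": np.array([0.0, 1.0, 0.0])}]   # keep task 2's localized entries
merged = sparse_merge(base, [tv1, tv2], masks)
```

With all-ones masks this reduces to plain task arithmetic; the sparse masks are what keep the tasks from interfering.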
Cool-Fusion: Fuse large language models without training
We focus on the problem of fusing two or more heterogeneous large language models
(LLMs) to exploit their complementary strengths. One of the challenges of model fusion is …
ATM: Improving model merging by alternating tuning and merging
Model merging has recently emerged as a cost-efficient paradigm for multi-task learning.
Among current approaches, task arithmetic stands out for its simplicity and effectiveness. In …
How Good is a Single Basin?
The multi-modal nature of neural loss landscapes is often considered to be the main driver
behind the empirical success of deep ensembles. In this work, we probe this belief by …
A Second-Order Perspective on Compositionality and Incremental Learning
The fine-tuning of deep pre-trained models has recently revealed compositional properties.
This enables the arbitrary composition of multiple specialized modules into a single, multi …