What Matters for Model Merging at Scale?
Model merging aims to combine multiple expert models into a more capable single model,
offering benefits such as reduced storage and serving costs, improved generalization, and …
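The simplest realization of this idea is uniform weight averaging across fine-tuned checkpoints that share an architecture. A minimal sketch of that baseline, not the paper's specific method (the helper name and state-dict layout are illustrative):

```python
import torch

def average_merge(state_dicts):
    """Uniformly average the parameters of fine-tuned checkpoints that
    share one architecture: the standard baseline merging operator."""
    merged = {}
    for key in state_dicts[0]:
        merged[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return merged
```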
SMILE: Zero-shot sparse mixture of low-rank experts construction from pre-trained foundation models
Deep model training on extensive datasets is increasingly becoming cost-prohibitive,
prompting the widespread adoption of deep model fusion techniques to leverage knowledge …
Compress then serve: Serving thousands of LoRA adapters with little overhead
Fine-tuning large language models (LLMs) with low-rank adaptations (LoRAs) has become
common practice, often yielding numerous copies of the same LLM differing only in their …
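The redundancy this points at is structural: each LoRA copy differs from the base model only by a low-rank update, so a serving system can keep the base weights once and swap adapters. A hedged sketch of folding an adapter into a weight, using the standard LoRA convention (shapes and names are illustrative):

```python
import torch

def apply_lora(W, A, B, alpha=8.0, r=4):
    """Fold a LoRA adapter into a base weight: W' = W + (alpha / r) * B @ A,
    with W: (out, in), B: (out, r), A: (r, in)."""
    return W + (alpha / r) * (B @ A)

# Toy usage: a rank-4 adapter on a 16x16 projection.
W = torch.randn(16, 16)
A, B = torch.randn(4, 16), torch.randn(16, 4)
W_merged = apply_lora(W, A, B)
```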
Efficient and effective weight-ensembling mixture of experts for multi-task model merging
Multi-task learning (MTL) leverages a shared model to accomplish multiple tasks and
facilitate knowledge transfer. Recent research on task arithmetic-based MTL demonstrates …
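Task arithmetic, which this line of work builds on, merges by adding scaled task vectors (fine-tuned weights minus base weights) onto the base model. A minimal sketch, assuming state dicts with matching keys (the scaling coefficient name is illustrative):

```python
def task_arithmetic_merge(base_sd, expert_sds, lam=0.3):
    """theta_merged = theta_base + lam * sum_i (theta_i - theta_base)."""
    merged = {}
    for key, base in base_sd.items():
        task_sum = sum(sd[key] - base for sd in expert_sds)
        merged[key] = base + lam * task_sum
    return merged
```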
SoK: On finding common ground in loss landscapes using deep model merging techniques
Understanding neural networks is crucial to creating reliable and trustworthy deep learning
models. Most contemporary research in interpretability analyzes just one model at a time via …
Merging LoRAs like playing LEGO: Pushing the modularity of LoRA to extremes through rank-wise clustering
Low-Rank Adaptation (LoRA) has emerged as a popular technique for fine-tuning large
language models (LLMs) to various domains due to its modular design and widespread …
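The modularity invoked in the title rests on the fact that a rank-r LoRA update B @ A factors into r independent rank-1 components, which can then be regrouped across adapters. A sketch of just the decomposition step (the clustering itself is omitted, and the function name is illustrative):

```python
import torch

def rank_one_components(A, B):
    """Split a LoRA update B @ A (B: (out, r), A: (r, in)) into its r
    rank-1 pieces b_k a_k^T; their sum reconstructs the full update."""
    r = A.shape[0]
    return [torch.outer(B[:, k], A[k, :]) for k in range(r)]

A, B = torch.randn(4, 16), torch.randn(16, 4)
parts = rank_one_components(A, B)
assert torch.allclose(sum(parts), B @ A, atol=1e-5)
```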
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery
Model merging-based multitask learning (MTL) offers a promising approach for performing
MTL by merging multiple expert models without requiring access to raw training data …
Collective model intelligence requires compatible specialization
In this work, we explore the limitations of combining models by averaging intermediate
features, referred to as model merging, and propose a new direction for achieving collective …
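"Averaging intermediate features" refers to combining models at the activation level rather than the weight level. A toy sketch of that combination rule, with the caveat that real setups tap intermediate layers rather than final outputs (everything here is illustrative):

```python
import torch

def feature_average(models, x):
    """Run the same input through several models and take the element-wise
    mean of their representations, the scheme whose limits are examined."""
    feats = [m(x) for m in models]           # each: (batch, dim)
    return torch.stack(feats).mean(dim=0)
```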
How to Merge Your Multimodal Models Over Time?
Model merging combines multiple expert models, finetuned from a base foundation model
on diverse tasks and domains, into a single, more capable model. However, most existing …
Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Deep model merging represents an emerging research direction that combines multiple
fine-tuned models to harness their specialized capabilities across different tasks and domains …
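Merging on the fly without retraining can be realized as a running average that folds each new checkpoint in as it arrives, so only the current merged model is ever stored. A minimal sketch of that idea, not necessarily the paper's exact update rule:

```python
def sequential_merge(merged_sd, new_sd, t):
    """Fold the t-th checkpoint (1-indexed) into a running average:
    merged_t = ((t - 1) * merged_{t-1} + theta_t) / t.
    Equivalent to averaging all t models while storing only one."""
    return {k: (merged_sd[k] * (t - 1) + new_sd[k]) / t
            for k in merged_sd}
```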