Modeling Multi-Task Model Merging as Adaptive Projective Gradient Descent
Merging multiple expert models offers a promising approach for performing multi-task
learning without accessing their original data. Existing methods attempt to alleviate task …
learning without accessing their original data. Existing methods attempt to alleviate task …