Model Merging: A Cost-effective Approach to Transfer Learning
Merging Models with Fisher-Weighted Averaging
Transfer learning is a powerful way of leveraging knowledge from one task when learning another task.
Our approach effectively involves computing a weighted average of the models' parameters that is equivalent to approximately sampling from the posteriors of the model weights.
We demonstrate that model merging achieves comparable performance to gradient descent-based transfer learning on intermediate-task training and domain adaptation problems.
We also show that our merging procedure makes it possible to combine models in previously unexplored ways.
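As a rough illustration of the weighted parameter average described above, the following minimal sketch merges two parameter vectors element by element. It assumes, based only on the title's mention of Fisher-weighted averaging, that each model supplies a per-parameter importance weight such as a diagonal Fisher information estimate. The function name `fisher_weighted_average`, the `eps` stabilizer, and all numeric values are hypothetical and are not taken from the paper.

```python
def fisher_weighted_average(params, fishers, eps=1e-8):
    """Merge flat parameter vectors with per-parameter weights.

    params:  list of parameter vectors, one per model (equal lengths).
    fishers: list of non-negative weight vectors, one per model
             (assumed here to be diagonal Fisher estimates).
    eps:     small constant (an addition for this sketch) that avoids
             division by zero when all weights for a parameter are zero.
    """
    merged = []
    for i in range(len(params[0])):
        # Weighted sum of the i-th parameter across all models.
        numerator = sum(f[i] * p[i] for p, f in zip(params, fishers))
        # Total weight assigned to the i-th parameter.
        denominator = sum(f[i] for f in fishers) + eps
        merged.append(numerator / denominator)
    return merged


# Illustrative values only: two models with three parameters each.
theta_a = [1.0, 0.0, 2.0]
theta_b = [3.0, 4.0, 2.0]
fisher_a = [1.0, 3.0, 1.0]  # model A is more certain about parameter 1
fisher_b = [1.0, 1.0, 1.0]

merged = fisher_weighted_average([theta_a, theta_b], [fisher_a, fisher_b])
print(merged)  # approximately [2.0, 1.0, 2.0]
```

When every model assigns the same weight to a parameter, the result reduces to a plain average. When one model assigns a larger weight, the merged value moves toward that model's parameter, as in the second entry above.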