Finetuning pretrained language models for generalization to unseen tasks
Scaling Instruction-Finetuned Language Models
Instruction finetuning is a general method for improving the performance and usability of pretrained language models.
In this paper, we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning along these dimensions dramatically improves performance on a variety of model classes (PaLM, T5, U-PaLM), prompting setups (zero-shot, few-shot, CoT), and evaluation benchmarks (MMLU, BBH, TyDiQA, MGSM, open-ended generation).
For instance, Flan-PaLM 540B instruction-finetuned on 1.8K tasks outperforms PaLM 540B by a large margin (+9.4% on average).
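As a rough illustration (not the authors' actual training pipeline), instruction finetuning amounts to rephrasing supervised tasks as natural-language instructions and finetuning a pretrained model on the resulting (instruction, answer) pairs. The minimal sketch below assumes a small T5 checkpoint from Hugging Face and two toy examples chosen purely for illustration.

```python
# Minimal sketch of instruction finetuning: tasks are phrased as
# natural-language instructions and a pretrained seq2seq model is
# finetuned on the (instruction, answer) pairs.
# The checkpoint "t5-small" and the toy examples are illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Each training example pairs an instruction-style prompt with a target answer.
examples = [
    {"instruction": "Translate to German: The house is wonderful.",
     "answer": "Das Haus ist wunderbar."},
    {"instruction": "Answer the question: What is the capital of France?",
     "answer": "Paris"},
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for ex in examples:
    inputs = tokenizer(ex["instruction"], return_tensors="pt", truncation=True)
    labels = tokenizer(ex["answer"], return_tensors="pt", truncation=True).input_ids
    loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

In practice the same recipe is applied over a large mixture of instruction-formatted tasks rather than a handful of examples; the key idea is simply that the training data is expressed as instructions the model should follow.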
Authors
Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao