ExMix: A Massive Collection of Tasks for Multi-Task Pre-training
ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
This paper introduces ExMix (Extreme Mixture), a massive collection of 107 supervised natural language processing (NLP) tasks across diverse domains and task families.
Using ExMix, we study the effect of scaling up the number of tasks during pre-training at the largest scale to date, and analyze co-training transfer among common families of tasks.
Finally, we propose ExT5: a model pre-trained using a multi-task objective of self-supervised span denoising and supervised ExMix.
Via extensive experiments, we show that ExT5 outperforms strong T5 baselines on SuperGLUE, GEM, Rainbow, Closed-Book QA tasks, and several tasks outside of ExMix.
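To illustrate the idea of mixing a self-supervised span-denoising stream with a large pool of supervised tasks during pre-training, here is a minimal, self-contained sketch. It is not the authors' released code; the task names, the default ratio value, and the `sample_task` helper are illustrative assumptions only.

```python
# A minimal sketch (assumptions, not the authors' implementation) of rate-based
# sampling between a self-supervised span-denoising task and a pool of
# supervised ExMix-style tasks during multi-task pre-training.
import random

def sample_task(supervised_tasks, self_supervised_task="span_denoising",
                self_supervised_ratio=2.0, rng=random):
    """Pick the task to draw the next pre-training batch from.

    `self_supervised_ratio` weights span denoising relative to the entire
    supervised pool; the names and default value here are hypothetical.
    """
    choices = [self_supervised_task] + list(supervised_tasks)
    weights = [self_supervised_ratio] + [1.0] * len(supervised_tasks)
    return rng.choices(choices, weights=weights, k=1)[0]

# Example: a tiny stand-in for the 107-task ExMix pool.
tasks = ["nli_example", "summarization_example", "closed_book_qa_example"]
print(sample_task(tasks))
```

In this sketch, the ratio between self-supervised and supervised examples is a single tunable weight; in practice such a mixing rate would be chosen empirically.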
Authors
Vamsi Aribandi, Yi Tay, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler