Computation and Language
OFA: A Unified Multimodal Pretrained Model
Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
OFA unifies a diverse set of cross-modal and unimodal tasks in a simple sequence-to-sequence learning framework based on the encoder-decoder architecture.
The proposed model performs pretraining and finetuning with task instructions and introduces no extra task-specific layers for finetuning.
Experimental results show that the proposed model achieves new state-of-the-art results on a series of multimodal tasks, including image captioning (test-std).
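The core idea above is that every task is serialized into an instruction-plus-input sequence for a single encoder-decoder model, so no task-specific heads are added at finetuning time. A minimal, illustrative sketch of that prompt-construction pattern (not the official OFA API; the template strings and image placeholder format are assumptions for illustration):

```python
# Illustrative sketch: different tasks are expressed as natural-language
# instructions for ONE seq2seq model, rather than separate output heads.
def build_prompt(task: str, inputs: dict) -> str:
    """Serialize a task into a single instruction-plus-input sequence.
    Templates here are hypothetical, not OFA's actual prompts."""
    templates = {
        "caption":  "what does the image describe? <image:{image}>",
        "vqa":      "<image:{image}> {question}",
        "classify": 'what is the sentiment of "{text}"?',
    }
    return templates[task].format(**inputs)

# All three tasks reduce to plain sequences fed to the same model:
caption_prompt = build_prompt("caption", {"image": "img_001"})
vqa_prompt = build_prompt("vqa", {"image": "img_001",
                                  "question": "what color is the car?"})
```

Because the task identity lives in the input sequence itself, finetuning on a new task only changes the data, not the architecture.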
Peng Wang, An Yang, Rui Men, Junyang Lin, Shuai Bai, Zhikang Li, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
Sequence-to-sequence learning framework
Unified multimodal pretrained model
Read the Paper