Promptagator: Few-shot Dense Retrieval From 8 Examples
Much recent research on information retrieval has focused on how to transfer from one task (typically with abundant supervised data) to various other tasks where supervision is limited, with the implicit assumption that it is possible to generalize from one task to all the rest.
However, this overlooks the fact that there are many diverse and unique retrieval tasks, each targeting different search intents, queries, and search domains.
In this paper, we propose to work on few-shot dense retrieval, a setting where each task comes with a short description and a few examples.
To amplify the power of a few examples, we propose Prompt-based Query Generation for Retriever (Promptagator), which leverages large language models (LLMs) as few-shot query generators and creates task-specific retrievers based on the generated data.
Powered by the generalization ability of LLMs, Promptagator makes it possible to create task-specific end-to-end retrievers based solely on a few examples, without using Natural Questions or MS MARCO to train question generators or dual encoders.
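To illustrate the core idea of prompt-based few-shot query generation, the sketch below assembles a prompt from a handful of (document, query) examples and asks an LLM to produce a synthetic query for a new, unlabeled document. This is a minimal sketch under assumptions: the prompt wording, the `llm_complete` callable, and the example data are placeholders, not the paper's exact implementation.

```python
def build_prompt(examples, new_document):
    """Assemble a few-shot prompt: each in-context example pairs a document
    with a query exhibiting the task's search intent; the LLM is asked to
    continue the pattern for a new, unlabeled document."""
    parts = []
    for doc, query in examples:
        parts.append(f"Document: {doc}\nQuery: {query}\n")
    parts.append(f"Document: {new_document}\nQuery:")
    return "\n".join(parts)


def generate_synthetic_query(llm_complete, examples, document):
    """`llm_complete` stands in for any text-completion call to an LLM;
    the returned completion is treated as a synthetic query for `document`."""
    prompt = build_prompt(examples, document)
    return llm_complete(prompt).strip()
```

Synthetic (query, document) pairs produced this way over a task's document corpus can then be used as training data for a task-specific retriever.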
Authors
Zhuyun Dai, Vincent Y. Zhao, Ji Ma, Yi Luan, Jianmo Ni, Jing Lu, Anton Bakalov, Kelvin Guu, Keith B. Hall, Ming-Wei Chang