TART: A Multi-Task Retrieval System with Instructions
Task-aware Retrieval with Instructions
We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries, making the system task-aware.
We aim to develop a general-purpose task-aware retrieval system, trained with multi-task instruction tuning, that can follow human-written instructions to find the best documents for a given query.
To this end, we introduce the first large-scale collection of approximately 40 retrieval datasets with instructions, and present a multi-task retrieval system trained on these diverse retrieval tasks with instructions.
We further introduce a new evaluation setup that better reflects real-world scenarios by pooling diverse documents and tasks, and demonstrate the effectiveness of guiding retrieval with instructions.
Authors
Akari Asai, Timo Schick, Patrick Lewis, Xilun Chen, Gautier Izacard, Sebastian Riedel, Hannaneh Hajishirzi, Wen-tau Yih
Information retrieval (IR) is the task of finding documents from a large collection of texts to fulfill a user's information need, typically expressed as a textual query.
Given the same query, a user may want to retrieve a passage that describes how to do the task, to identify a similar query, or even to directly locate a code snippet.
The notion of relevance from the user's perspective can be amorphous, and a query alone may not fully capture the user's information need.
In addition to the query, the retrieval system is given a natural language description of the search task (i.e., an instruction) that describes the user's intent.
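The core idea can be illustrated with a toy sketch: the instruction is combined with the query before encoding, so the same query retrieves different documents under different task descriptions. The bag-of-words embedding below is purely an illustrative stand-in for the Transformer encoder a real system would use; none of these function names come from the paper.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real task-aware retriever would
    # use a Transformer encoder here. Illustrative stand-in only.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(instruction, query, documents):
    # The instruction is prepended to the query, so the combined text,
    # not the query alone, determines which document scores highest.
    q = embed(instruction + " " + query)
    return max(documents, key=lambda d: cosine(q, embed(d)))
```

With a document collection containing both a code snippet and a prose explanation for "reverse a list", changing only the instruction changes which document is returned.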
To facilitate research in retrieval with instructions, we introduce BERRI (Bank of Explicit RetRieval Instructions), a collection of approximately 40 retrieval datasets with instructions in a unified format.
We use BERRI to train a single instruction-following multi-task retrieval system that performs diverse tasks with different intents and can adapt to new retrieval tasks with no parameter updates.
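A training instance in such a unified format could be represented roughly as follows. The field names here are hypothetical, chosen only to illustrate the idea of pairing each query with a task instruction and contrastive passages; they are not the actual BERRI schema.

```python
# Hypothetical schema for one multi-task training instance.
# Field names are illustrative, not the actual BERRI format.
instance = {
    # Human-written description of the search task (the instruction).
    "instruction": "Retrieve a Wikipedia paragraph that answers this question.",
    # The user's query for this instance.
    "query": "when was the Eiffel Tower built?",
    # Passages the retriever should rank highly under this instruction.
    "positive_passages": ["The Eiffel Tower was constructed from 1887 to 1889 ..."],
    # Topically related but non-answering passages used as hard negatives.
    "negative_passages": ["The Eiffel Tower is 330 metres tall ..."],
}
```

Training over many such instances drawn from dozens of tasks is what lets one model serve different intents at test time by swapping only the instruction.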
Result
We compared our instruction models with systems built via knowledge distillation from large language models (LLMs) on millions of synthetically generated in-domain examples.
Unlike prior methods that require additional data generation, our instruction models only require a single human-written instruction for each task at test time to adapt to a new task.
We found that even smaller models can be guided by instructions, although their zero-shot transfer to new tasks may be limited by model capacity and by the limited interaction between query and document embeddings.
Our instruction models do not simply exploit lexical matching.
Compared to methods that rerank with cross-encoder models (e.g., BM25 + MonoT5), our models rerank a much smaller number of paragraphs, which significantly reduces reranking latency at test time.
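The latency argument follows from the standard two-stage retrieval pattern, sketched below. This toy pipeline is not the paper's code: `cheap_score` stands in for BM25 or a bi-encoder, `expensive_score` for a cross-encoder, and both names are invented for illustration.

```python
def two_stage_search(query, documents, cheap_score, expensive_score, k=10):
    """Toy two-stage retrieval pipeline (illustrative sketch).

    Stage 1 scores every document with a cheap function; stage 2 reranks
    only the top-k candidates with an expensive function, so the costly
    reranking step scales with k rather than with the collection size.
    """
    # Stage 1: cheap scoring over the full collection.
    candidates = sorted(documents,
                        key=lambda d: cheap_score(query, d),
                        reverse=True)[:k]
    # Stage 2: expensive scoring over only k candidates.
    return sorted(candidates,
                  key=lambda d: expensive_score(query, d),
                  reverse=True)
```

Shrinking k directly shrinks the number of expensive cross-encoder calls, which is where the latency savings come from.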
We find that our instruction models can flexibly change their behavior based on the instructions, indicating that a single model can adapt to new tasks.
TART-full significantly outperforms larger models and customized models trained on millions of synthetically generated in-domain examples, advancing the state of the art on BEIR and LoTTE.