We propose a novel post-processing approach, rethinking with retrieval (RR), which retrieves relevant external knowledge based on the decomposed reasoning steps obtained from chain-of-thought (CoT) prompting.
This lightweight approach does not require additional training or fine-tuning and is not limited by the input length of large language models (LLMs).
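To make the described pipeline concrete, the sketch below decomposes a chain-of-thought answer into steps, retrieves evidence for each step, and then asks the model for a final answer grounded in that evidence. It is a minimal illustration of the idea rather than the paper's implementation; `call_llm` and `retrieve` are hypothetical stand-ins for any completion API and any external retriever.

```python
# Minimal sketch of retrieval-augmented post-processing over CoT steps.
# `call_llm` and `retrieve` are hypothetical placeholders, not a specific API.

def call_llm(prompt: str) -> str:
    """Stand-in for any chat/completion model call."""
    raise NotImplementedError

def retrieve(query: str, k: int = 3) -> list[str]:
    """Stand-in for a retriever over an external knowledge source."""
    raise NotImplementedError

def rethink_with_retrieval(question: str) -> str:
    # 1. Obtain a chain-of-thought answer and split it into reasoning steps.
    cot = call_llm(f"Answer step by step:\n{question}")
    steps = [s.strip() for s in cot.split("\n") if s.strip()]

    # 2. Retrieve external evidence for each decomposed step.
    evidence = [passage for step in steps for passage in retrieve(step)]

    # 3. Produce the final answer from the retrieved evidence; no additional
    #    training or fine-tuning is involved.
    context = "\n".join(evidence)
    return call_llm(
        f"Question: {question}\nEvidence:\n{context}\n"
        "Using only the evidence above, give the final answer."
    )
```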
We present speculative sampling, an algorithm for accelerating transformer decoding by enabling the generation of multiple tokens from each transformer call.
Our algorithm relies on the observation that the latency of parallel scoring of short continuations, generated by a faster but less powerful draft model, is comparable to that of sampling a single token from the larger target model.
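The accept/reject scheme that makes this work can be illustrated with toy distributions; `draft_probs` and `target_probs` below are random placeholders for the draft and target models, and a real implementation would score all drafted positions with one batched target call, which is where the latency win comes from.

```python
# Toy sketch of speculative sampling's accept/reject loop (placeholder models).
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8

def draft_probs(prefix):
    # Cheap, less powerful draft model (random placeholder).
    return rng.dirichlet(np.ones(VOCAB))

def target_probs(prefix):
    # Expensive, more powerful target model (random placeholder).
    return rng.dirichlet(np.ones(VOCAB))

def speculative_step(prefix, k=4):
    """Draft k tokens cheaply, then accept or correct them against the target."""
    drafted, q = [], []
    for _ in range(k):
        p = draft_probs(prefix + drafted)
        drafted.append(int(rng.choice(VOCAB, p=p)))
        q.append(p)

    accepted = []
    for i, token in enumerate(drafted):
        p_target = target_probs(prefix + accepted)  # one parallel call in practice
        if rng.random() < min(1.0, p_target[token] / q[i][token]):
            accepted.append(token)                  # draft token accepted
        else:
            # On rejection, resample from the normalized residual distribution
            # and discard the remaining drafts.
            residual = np.maximum(p_target - q[i], 0.0)
            accepted.append(int(rng.choice(VOCAB, p=residual / residual.sum())))
            return prefix + accepted
    # All k drafts accepted: take one extra token from the target for free.
    accepted.append(int(rng.choice(VOCAB, p=target_probs(prefix + accepted))))
    return prefix + accepted

print(speculative_step([1, 2, 3]))
```

This accept/reject rule is what keeps the generated samples distributed as if they had been drawn from the target model alone.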
We introduce a large language model that can store, combine and reason about scientific knowledge.
We train on a large scientific corpus of papers, reference material, knowledge bases, and many other sources, and outperform existing models on a range of scientific tasks.
Unit tests play a key role in ensuring the correctness of software. However,
manually creating unit tests is a laborious task, motivating the need for
automation. This paper presents TestPilot, an ada
We propose batch prompting, a simple alternative prompting approach that enables a large language model to run inference in batches instead of one sample at a time.
Our method reduces both token and time costs while retaining downstream performance.
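A minimal sketch of what batched inference could look like: several questions are packed into one prompt with indexed slots, and the indexed answers are parsed back out of a single completion. `call_llm` and the Q[i]/A[i] format are illustrative assumptions rather than the exact prompt used in the paper.

```python
# Hypothetical sketch of batch prompting: one model call answers b samples.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # stand-in for any completion API

def batch_prompt(few_shot_examples: list[tuple[str, str]],
                 questions: list[str]) -> list[str]:
    # Demonstrations are written in batched form so the model learns to emit
    # one indexed answer per input.
    demos = "\n".join(f"Q[{i + 1}]: {q}\nA[{i + 1}]: {a}"
                      for i, (q, a) in enumerate(few_shot_examples))
    batch = "\n".join(f"Q[{i + 1}]: {q}" for i, q in enumerate(questions))
    completion = call_llm(f"{demos}\n\n{batch}\n")

    # Parse the indexed answers back out of the single completion.
    answers = [""] * len(questions)
    for line in completion.splitlines():
        if line.startswith("A[") and ":" in line:
            idx = int(line[2:line.index("]")]) - 1
            if 0 <= idx < len(answers):
                answers[idx] = line.split(":", 1)[1].strip()
    return answers
```

Token savings come from sharing the few-shot demonstrations across all samples in the batch; time savings come from issuing one call instead of one per sample.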
Can large language models be trained to produce philosophical texts that are
difficult to distinguish from texts produced by human philosophers? To address
this question, we fine-tuned OpenAI's GPT-3
Research on prompting has shown excellent performance with little or even no
supervised training across many tasks. However, prompting for machine
translation is still under-explored in the literature.
We explore the authoring process for building prototypes for a range of applications, and conclude with open questions on scaling chains to complex tasks and supporting low-fi chain prototyping.
Large language models (such as OpenAI's Codex) have demonstrated impressive
zero-shot multi-task capabilities in the software domain, including code
explanation. In this work, we examine if this abili
We propose prompt tuning with model tuning (ProMoT), a simple yet effective two-stage fine-tuning framework that preserves the in-context abilities of the pretrained model.
We fine-tune mT5-XXL on natural language inference and English-French translation and evaluate the in-context abilities of the resulting models on 8 different natural language inference tasks.
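The abstract names the two stages but not their exact wiring; the toy PyTorch sketch below assumes one plausible reading of "prompt tuning with model tuning": first train only a soft prompt with the backbone frozen, then fine-tune the backbone with the learned prompt attached. The `ToyModel`, the freezing choices, and the training loop are illustrative assumptions, not the paper's recipe.

```python
# Toy two-stage sketch (assumed reading): stage 1 tunes a soft prompt only,
# stage 2 tunes the model with that prompt attached.
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self, dim=16, vocab=100, prompt_len=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)
        self.body = nn.Linear(dim, vocab)

    def forward(self, token_ids):
        x = self.embed(token_ids)                        # (batch, seq, dim)
        prompt = self.soft_prompt.expand(x.size(0), -1, -1)
        x = torch.cat([prompt, x], dim=1)                # prepend soft prompt
        return self.body(x).mean(dim=1)                  # toy pooled logits

def train(params, model, batches, steps=10):
    opt = torch.optim.Adam(params, lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        for tokens, labels in batches:
            opt.zero_grad()
            loss_fn(model(tokens), labels).backward()
            opt.step()

model = ToyModel()
data = [(torch.randint(0, 100, (8, 5)), torch.randint(0, 100, (8,)))]

# Stage 1: prompt tuning -- only the soft prompt is updated.
train([model.soft_prompt], model, data)

# Stage 2: model tuning with the learned prompt attached (kept frozen here;
# whether it stays frozen is part of the assumption).
model.soft_prompt.requires_grad_(False)
train([p for p in model.parameters() if p.requires_grad], model, data)
```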
Large Language Models (LLMs) like GPT-3 have sparked significant interest in
their generative capabilities, leading to the development of various commercial
applications. The high cost of using the mo
Advances in Deep Learning have led to the emergence of Large Language Models
(LLMs) such as OpenAI Codex, which powers GitHub Copilot. LLMs have been fine-tuned and packaged so that programmers can use
Embodied Instruction Following (EIF) studies how mobile manipulator robots
should be controlled to accomplish long-horizon tasks specified by natural
language instructions. While most research on EIF
Recent pretrained language models have grown from millions to billions of
parameters. Thus, the need to fine-tune an extremely large pretrained model with
a limited training corpus arises in various downst
Large language models (LLMs) have recently been applied in software
engineering to perform tasks such as translating code between programming
languages, generating code from natural language, and auto
Recent works have shown that unstructured text (documents) from online
sources can serve as useful auxiliary information for zero-shot image
classification. However, these methods require access to a
The BigScience Workshop was a value-driven initiative that spanned one and a
half years of interdisciplinary research and culminated in the creation of
ROOTS, a 1.6TB multilingual dataset that was used
How humans infer discrete emotions is a fundamental research question in the
field of psychology. While conceptual knowledge about emotions (emotion
knowledge) has been suggested to be essential for e
Large language models (LLMs) have demonstrated impressive potential on simple tasks, but their breadth of scope, lack of transparency, and insufficient control can make them less effective when assisting humans on more complex tasks.
In response, we introduce the concept of chaining LLM steps together, where the output of one step becomes the input for the next, thus aggregating the gains per step.
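As a minimal illustration, a chain can be expressed as a list of prompt templates where each template is filled with the previous step's output; `call_llm` and the example templates here are placeholders, not the paper's interface.

```python
# Minimal sketch of chaining LLM steps: each step's output feeds the next.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # stand-in for any completion API

def run_chain(step_templates: list[str], user_input: str) -> str:
    """Run the steps in order, passing each output forward as {input}."""
    result = user_input
    for template in step_templates:
        result = call_llm(template.format(input=result))
    return result

# Example chain: a complex task split into smaller, more inspectable steps.
chain = [
    "Extract the key complaints from this review:\n{input}",
    "Group these complaints into themes:\n{input}",
    "Draft a polite response that addresses each theme:\n{input}",
]
# final_reply = run_chain(chain, review_text)
```

Keeping the intermediate outputs visible is also what lets users inspect and adjust each step, addressing the transparency and control issues noted above.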