We present a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text.
We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements on the two key challenges of safety and factual grounding.
The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias.
We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety.
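The filtering step can be pictured as sampling several candidate responses and discarding any that the fine-tuned classifier scores as unsafe. The sketch below is a minimal illustration under assumed interfaces, not the paper's implementation; `sample_response`, `safety_score`, `quality_score`, and the 0.9 threshold are all hypothetical names and values.

```python
import random
from typing import Callable

def respond(
    context: str,
    sample_response: Callable[[str], str],   # dialog model sampler (assumed)
    safety_score: Callable[[str], float],    # fine-tuned safety classifier (assumed)
    quality_score: Callable[[str], float],   # ranking signal over candidates (assumed)
    n_candidates: int = 16,
    threshold: float = 0.9,                  # illustrative cutoff, not from the paper
) -> str:
    # Sample several candidate responses from the dialog model.
    candidates = [sample_response(context) for _ in range(n_candidates)]
    # Keep only candidates the safety classifier scores above the threshold.
    safe = [c for c in candidates if safety_score(c) >= threshold]
    if not safe:
        # Fall back to a neutral reply when every candidate is filtered out.
        return "I'm not sure how to help with that."
    # Return the highest-ranked remaining candidate.
    return max(safe, key=quality_score)

# Toy usage with stand-in models:
if __name__ == "__main__":
    canned = ["Try your local library.", "Here's a thought..."]
    print(respond(
        "Any book suggestions?",
        sample_response=lambda ctx: random.choice(canned),
        safety_score=lambda text: 0.95,   # stub classifier: everything passes
        quality_score=len,                # stub ranker: prefer longer replies
    ))
```

Because the classifier only filters candidates rather than altering generation, it can be trained on a small amount of crowdworker-annotated data independently of the base model.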
The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator.
We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible.
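One way to picture this grounding loop: the model drafts a response, decides whether to consult an external tool, and revises the draft using what the tool returns. The sketch below is an assumed interface for illustration only; `plan_call`, `revise`, and the loop structure are hypothetical, not the paper's API.

```python
from typing import Callable, Dict, Optional, Tuple

def grounded_response(
    context: str,
    generate: Callable[[str], str],                              # base dialog model (assumed)
    plan_call: Callable[[str, str], Optional[Tuple[str, str]]],  # picks (tool, query) or None (assumed)
    tools: Dict[str, Callable[[str], str]],                      # retrieval, translator, calculator, ...
    revise: Callable[[str, str], str],                           # rewrites draft using evidence (assumed)
    max_steps: int = 3,
) -> str:
    # Draft an initial response from the base dialog model.
    response = generate(context)
    for _ in range(max_steps):
        call = plan_call(context, response)
        if call is None:
            break  # the response is judged sufficiently grounded
        tool_name, query = call
        # Query the chosen external knowledge source.
        evidence = tools[tool_name](query)
        # Rewrite the response so its claims follow the returned evidence.
        response = revise(response, evidence)
    return response

# Toy usage: a calculator call corrects an ungrounded numeric claim.
if __name__ == "__main__":
    tools = {"calculator": lambda q: str(eval(q, {"__builtins__": {}}))}  # toy only
    calls = iter([("calculator", "1967 * 3"), None])
    print(grounded_response(
        "What is 1967 times 3?",
        generate=lambda ctx: "It is roughly 6000.",
        plan_call=lambda ctx, resp: next(calls),
        tools=tools,
        revise=lambda resp, evidence: f"It is {evidence}.",
    ))  # prints: It is 5901.
```

The design choice here is that grounding happens after generation: claims are checked and rewritten against retrieved evidence rather than relying on knowledge memorized in the model's parameters.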
Finally, we explore the use of the model in the domains of education and content recommendations, and analyze its helpfulness and role consistency.
Authors
Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, YaGuang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen