RT-1: Robotics Transformer for Real-World Control at Scale
By transferring knowledge from large, diverse, task-agnostic datasets, modern
machine learning models can solve specific downstream tasks either zero-shot or
with small task-specific datasets to a high level of performance. While this
capability has been demonstrated in other fields such as computer vision,
natural language processing or speech recognition, it remains to be shown in
robotics, where the generalization capabilities of the models are particularly
critical due to the difficulty of collecting real-world robotic data. We argue
that one of the keys to the success of such general robotic models lies with
open-ended task-agnostic training, combined with high-capacity architectures
that can absorb all of the diverse, robotic data. In this paper, we present a
model class, dubbed Robotics Transformer, that exhibits promising scalable
model properties. We verify our conclusions in a study of different model
classes and their ability to generalize as a function of the data size, model
size, and data diversity based on a large-scale data collection on real robots
performing real-world tasks. The project's website and videos can be found at
robotics-transformer.github.io
Authors
Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal