Reinforcement Learning with Neural Radiance Fields
Finding effective representations for training reinforcement learning (RL) agents is a long-standing problem. This paper demonstrates that learning state representations with supervision from Neural Radiance Fields (NeRFs) can improve RL performance compared to other learned representations, and even to low-dimensional, hand-engineered state information. Specifically, we propose to train an encoder that maps multiple image observations to a latent space describing the objects in the scene. A decoder built from a latent-conditioned NeRF provides the supervision signal for learning this latent space. An RL algorithm then operates on the learned latent space as its state representation. We call this approach NeRF-RL. Our experiments indicate that NeRF supervision leads to a latent space better suited for downstream RL tasks involving robotic object manipulation, such as hanging mugs on hooks, pushing objects, or opening doors.
Video: this https URL
Authors
Danny Driess, Ingmar Schubert, Pete Florence, Yunzhu Li, Marc Toussaint
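
Below is a minimal, illustrative PyTorch sketch of the pipeline the abstract describes: an encoder that maps multiple image views to a latent code, a latent-conditioned NeRF decoder trained with a volume-rendering reconstruction loss, and the resulting latent serving as the RL state. All module names, network sizes, and the volume_render helper are simplified assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ImageEncoder(nn.Module):
    """Maps multiple image observations (views) of a scene to one latent code."""

    def __init__(self, latent_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(64, latent_dim)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, 3, H, W) -> encode each view, then average
        b, v = views.shape[:2]
        feats = self.conv(views.flatten(0, 1))            # (b * v, 64)
        return self.fc(feats.view(b, v, -1).mean(dim=1))  # (b, latent_dim)


class LatentConditionedNeRF(nn.Module):
    """Predicts density and RGB for 3D points, conditioned on the latent code."""

    def __init__(self, latent_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # (density, r, g, b)
        )

    def forward(self, points: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # points: (batch, n_points, 3), z: (batch, latent_dim)
        z_exp = z.unsqueeze(1).expand(-1, points.shape[1], -1)
        return self.mlp(torch.cat([points, z_exp], dim=-1))


def volume_render(nerf, z, ray_o, ray_d, n_samples=32, near=0.5, far=3.0):
    # Simplified volume rendering: sample points along each ray, query the
    # NeRF, and alpha-composite the colors. ray_o, ray_d: (batch, n_rays, 3).
    b, r = ray_o.shape[:2]
    t = torch.linspace(near, far, n_samples, device=ray_o.device)
    pts = ray_o.unsqueeze(2) + ray_d.unsqueeze(2) * t.view(1, 1, -1, 1)
    out = nerf(pts.flatten(1, 2), z).view(b, r, n_samples, 4)
    sigma, rgb = F.relu(out[..., 0]), torch.sigmoid(out[..., 1:])
    alpha = 1.0 - torch.exp(-sigma * (far - near) / n_samples)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[..., :1]), 1.0 - alpha + 1e-10], -1), -1
    )[..., :-1]
    return ((alpha * trans).unsqueeze(-1) * rgb).sum(dim=2)  # (b, n_rays, 3)


# Representation learning: reconstruct held-out views through the NeRF decoder.
encoder, nerf = ImageEncoder(), LatentConditionedNeRF()
opt = torch.optim.Adam([*encoder.parameters(), *nerf.parameters()], lr=1e-4)

views = torch.rand(2, 3, 3, 64, 64)   # dummy batch: 2 scenes, 3 views each
ray_o = torch.zeros(2, 256, 3)        # dummy rays of a held-out target camera
ray_d = F.normalize(torch.randn(2, 256, 3), dim=-1)
target_rgb = torch.rand(2, 256, 3)    # dummy ground-truth pixel colors

z = encoder(views)
loss = F.mse_loss(volume_render(nerf, z, ray_o, ray_d), target_rgb)
opt.zero_grad(); loss.backward(); opt.step()

# Downstream RL: the (frozen) encoder output serves as the agent's state.
with torch.no_grad():
    state = encoder(views)  # pass to any standard RL algorithm

The key design choice conveyed by the abstract is that the reconstruction loss flows through a 3D-structured decoder, so the encoder is pressured to capture geometry-aware scene information rather than arbitrary image features; the RL policy then consumes only the compact latent state.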