A Survey of Generalisation in Deep Reinforcement Learning
The study of generalisation in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel, unseen situations
at deployment time, avoiding overfitting to their training environments.
Tackling this is vital if we are to deploy RL algorithms in
real-world scenarios, where the environment will be diverse, dynamic and
unpredictable. This survey is an overview of this nascent field. We provide a
unifying formalism and terminology for discussing different generalisation
problems, building upon previous works. We go on to categorise existing
benchmarks for generalisation, as well as current methods for tackling the
generalisation problem. Finally, we provide a critical discussion of the
current state of the field, including recommendations for future work. Among
other conclusions, we argue that taking a purely procedural content generation
approach to benchmark design is not conducive to progress in generalisation, we
suggest fast online adaptation and tackling RL-specific problems as some areas
for future work on methods for generalisation, and we recommend building
benchmarks in underexplored problem settings such as offline RL generalisation
and reward-function variation.
Authors
Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel