Physically Embedded Planning Problems: New Challenges for Reinforcement Learning
Recent work in deep reinforcement learning (RL) has produced algorithms
capable of mastering challenging games such as Go, chess, or shogi. In these
works the RL agent directly observes the natural state of the game and manipulates
that state with its actions. However, when humans play such games,
they do not just reason about the moves but also interact with their physical
environment. They understand the state of the game by looking at the physical
board in front of them and modify it by manipulating pieces using touch and
fine-grained motor control. Mastering complicated physical systems with
abstract goals is a central challenge for artificial intelligence, but it
remains out of reach for existing RL algorithms. To encourage progress towards
this goal we introduce a set of physically embedded planning problems and make
them publicly available. We embed challenging symbolic tasks (Sokoban,
tic-tac-toe, and Go) in a physics engine to produce a set of tasks that require
perception, reasoning, and motor control over long time horizons. Although
existing RL algorithms can tackle the symbolic versions of these tasks, we find
that they struggle to master even the simplest of their physically embedded
counterparts. As a first step towards characterizing the space of solutions to
these tasks, we introduce a strong baseline that uses a pre-trained expert game
player to provide hints in the abstract space to an RL agent's policy while
training it on the full sensorimotor control task. The resulting agent solves
many of the tasks, underlining the need for methods that bridge the gap between
abstract planning and embodied control.
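To make the expert-hint idea concrete, here is a minimal, hypothetical sketch of how a pre-trained expert's move preferences in the abstract game space might be used as an auxiliary training signal alongside the RL loss on the sensorimotor task. The function names, loss form, and weighting are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch (assumptions, not the paper's method): the agent is assumed to
# expose a distribution over abstract moves (e.g. board cells) in addition to its
# low-level motor policy, and the expert's hint is folded in as an auxiliary
# cross-entropy term added to the usual RL loss.
import numpy as np


def expert_hint_loss(agent_move_logits, expert_move_probs):
    """Cross-entropy between the agent's abstract-move distribution and the
    pre-trained expert's move distribution (the 'hint')."""
    log_probs = agent_move_logits - np.log(np.sum(np.exp(agent_move_logits)))
    return -np.sum(expert_move_probs * log_probs)


def combined_loss(rl_loss, agent_move_logits, expert_move_probs, hint_weight=0.5):
    """Standard RL loss on the full sensorimotor task plus the weighted hint term.
    hint_weight is a hypothetical hyperparameter."""
    return rl_loss + hint_weight * expert_hint_loss(agent_move_logits, expert_move_probs)


if __name__ == "__main__":
    # Toy example with 9 abstract moves (e.g. tic-tac-toe cells).
    rng = np.random.default_rng(0)
    agent_logits = rng.normal(size=9)       # agent's current preferences over moves
    expert_probs = np.zeros(9)
    expert_probs[4] = 1.0                   # expert strongly prefers the centre cell
    print(combined_loss(rl_loss=1.3,
                        agent_move_logits=agent_logits,
                        expert_move_probs=expert_probs))
```

Under this reading, the hint term shapes the agent's abstract move choices toward the expert's while standard RL still trains the motor control needed to execute those moves physically.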
Authors
Mehdi Mirza, Andrew Jaegle, Jonathan J. Hunt, Arthur Guez, Saran Tunyasuvunakool, Alistair Muldal, Théophane Weber, Peter Karkus, Sébastien Racanière, Lars Buesing, Timothy Lillicrap, Nicolas Heess