Accelerated Quality-Diversity for Robotics through Massive Parallelism
Bryan Lim, Maxime Allard, Luca Grillotti, Antoine Cully
Quality-Diversity (QD) algorithms are a well-known approach to generate large
collections of diverse and high-quality policies. However, QD algorithms are
also known to be data-inefficient: they require large amounts of computational
resources and are slow when used in practice for robotics tasks. Policy
evaluations are already commonly performed in parallel to speed up QD
algorithms, but this parallelism remains limited on a single machine because
most physics simulators run on CPUs. With recent advances in simulators that
run on accelerators, thousands of evaluations can be performed in parallel on a
single GPU/TPU. In this paper, we present QDax, an implementation of MAP-Elites that
leverages massive parallelism on accelerators to make QD algorithms more
accessible. We first demonstrate the improvements in the number of evaluations
per second that parallelism with accelerated simulators can offer. More
importantly, we show that QD algorithms are ideal candidates for such hardware
acceleration and can scale with massive parallelism to run at interactive
timescales. The increase in parallelism does not significantly affect the
performance of QD algorithms, while reducing experiment runtimes by two orders
of magnitude, turning days of computation into minutes. These results show that
QD can now benefit from hardware acceleration, which contributed significantly
to the rise of deep learning.
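
To illustrate the idea behind this acceleration, the sketch below shows how
policy evaluations can be vectorized and JIT-compiled with JAX so that
thousands of rollouts run in one batched call on a single GPU/TPU. The toy
rollout function, its point-mass dynamics, and the behavior descriptor are
hypothetical stand-ins chosen for brevity; this is not the authors' QDax code
nor the Brax simulator API.

    # Minimal sketch of batched policy evaluation with JAX (hypothetical
    # environment; not the QDax or Brax implementation).
    import jax
    import jax.numpy as jnp

    def rollout(policy_params, rng, episode_length=100):
        """Roll out one linear policy in a toy point-mass environment."""
        def step(carry, _):
            pos, key = carry
            key, sub = jax.random.split(key)
            action = jnp.tanh(policy_params @ pos)            # linear policy
            pos = pos + 0.1 * action + 0.01 * jax.random.normal(sub, pos.shape)
            reward = -jnp.sum(pos ** 2)                        # stay near the origin
            return (pos, key), reward

        init = (jnp.ones(2), rng)
        _, rewards = jax.lax.scan(step, init, None, length=episode_length)
        fitness = rewards.sum()
        descriptor = jnp.tanh(policy_params.mean())            # toy behavior descriptor
        return fitness, descriptor

    # Vectorize the evaluation over a whole batch of policies and JIT-compile
    # it, so all rollouts execute in parallel on the accelerator.
    batched_eval = jax.jit(jax.vmap(rollout, in_axes=(0, 0, None)),
                           static_argnums=2)

    key = jax.random.PRNGKey(0)
    n_policies = 4096
    params = jax.random.normal(key, (n_policies, 2, 2))        # one 2x2 matrix per policy
    rngs = jax.random.split(key, n_policies)
    fitnesses, descriptors = batched_eval(params, rngs, 100)
    print(fitnesses.shape, descriptors.shape)                  # (4096,), (4096,)

In a QD loop such as MAP-Elites, a call like batched_eval would replace the
per-policy CPU simulation step, which is where the reported speed-ups come
from.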