Accelerating Primal-dual Methods for Regularized Markov Decision Processes
Haoya Li, Hsiang-fu Yu, Lexing Ying, Inderjit Dhillon
Entropy regularized Markov decision processes have been widely used in
reinforcement learning. This paper is concerned with the primal-dual
formulation of entropy regularized problems. Standard first-order methods
suffer from slow convergence due to the lack of strict convexity and concavity.
To address this issue, we first introduce a new quadratically convexified
primal-dual formulation. The natural gradient ascent-descent of the new
formulation enjoys a global convergence guarantee and an exponential convergence
rate. We also propose a new interpolating metric that further accelerates
convergence significantly. Numerical results are provided to demonstrate the
performance of the proposed methods under multiple settings.
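For concreteness, the sketch below illustrates the baseline setup that motivates this work: plain gradient descent-ascent on the entropy regularized occupancy-measure Lagrangian of a small tabular MDP, whose slow convergence is the issue the paper addresses. It is a minimal illustration under assumed notation (transition kernel P, reward r, occupancy measure mu, value/multiplier V, regularization strength tau); it is not the paper's quadratically convexified formulation, its natural gradient ascent-descent, or the interpolating metric.

import numpy as np

# Minimal, hypothetical sketch: gradient descent-ascent on the entropy
# regularized primal-dual Lagrangian of a random tabular MDP. All names,
# step sizes, and iteration counts are illustrative assumptions.
rng = np.random.default_rng(0)
nS, nA, gamma, tau = 5, 3, 0.9, 0.1

P = rng.random((nS, nA, nS))
P /= P.sum(axis=2, keepdims=True)        # transition kernel P(s'|s,a)
r = rng.random((nS, nA))                 # rewards r(s,a)
rho = np.full(nS, 1.0 / nS)              # initial state distribution

mu = np.full((nS, nA), 1.0 / (nS * nA))  # primal: state-action occupancy measure
V = np.zeros(nS)                         # dual: value function / multipliers

eta_mu, eta_V = 0.05, 0.05
for it in range(2000):
    m = mu.sum(axis=1, keepdims=True)    # state marginal of mu
    # dL/dmu: reward + entropy term + advantage from the flow constraints
    g_mu = r - tau * np.log(mu / m) + gamma * (P @ V) - V[:, None]
    # dL/dV: violation of the discounted Bellman-flow constraints
    g_V = (1 - gamma) * rho + gamma * np.einsum('sat,sa->t', P, mu) - mu.sum(axis=1)
    mu = mu * np.exp(eta_mu * g_mu)      # exponentiated ascent keeps mu > 0
    V = V - eta_V * g_V                  # plain gradient descent on the dual

res = (1 - gamma) * rho + gamma * np.einsum('sat,sa->t', P, mu) - mu.sum(axis=1)
print("flow-constraint residual (max abs):", np.abs(res).max())

Because the Lagrangian is only linear in V, this plain descent-ascent iteration can oscillate around the saddle point rather than converge quickly; the paper's convexified formulation and metric choices are aimed at exactly this shortcoming.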