Using backpropagation to compute gradients of objective functions for
optimization has remained a mainstay of machine learning. Backpropagation, or
reverse-mode differentiation, is a special case within the general family of
automatic differentiation algorithms that also includes the forward mode. We
present a method to compute gradients based solely on the directional
derivative, which can be computed exactly and efficiently via the forward mode.
We call this formulation the forward gradient, an unbiased estimate of the
gradient that can be evaluated in a single forward run of the function,
entirely eliminating the need for backpropagation in gradient descent. We
demonstrate forward gradient descent in a range of problems, showing
substantial savings in computation and enabling training up to twice as fast in
some cases.
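
As a rough illustration of the idea, the sketch below builds a forward gradient from a single directional derivative: perturb the input along a random direction v, obtain the directional derivative in one forward pass, and scale v by it. The abstract does not say how v is sampled; drawing it from a standard normal is an assumption made here. The `Dual` class, the objective `f`, and all function names are hypothetical helpers written for this example, not the authors' implementation.

```python
import random


class Dual:
    """Minimal dual number: a value plus its tangent along one direction."""

    def __init__(self, val, tan=0.0):
        self.val = val
        self.tan = tan

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants with zero tangent.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule applied to the tangent component.
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def f(x):
    # Hypothetical objective: f(x) = x0^2 + 3*x0*x1 + 2*x1^2
    return x[0] * x[0] + 3.0 * x[0] * x[1] + 2.0 * x[1] * x[1]


def directional_derivative(f, x, v):
    """One forward pass returning f(x) and the derivative of f along v."""
    out = f([Dual(xi, vi) for xi, vi in zip(x, v)])
    return out.val, out.tan


def forward_gradient(f, x, rng):
    """Single-sample estimate (grad f . v) * v, with v ~ N(0, I) (assumed)."""
    v = [rng.gauss(0.0, 1.0) for _ in x]
    val, dd = directional_derivative(f, x, v)
    return val, [dd * vi for vi in v]


def averaged_forward_gradient(f, x, n_samples, seed=0):
    """Average many single-sample estimates to check unbiasedness empirically."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        _, g = forward_gradient(f, x, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]
```

For this objective the true gradient at x = (1, 2) is (8, 11). A single call to `forward_gradient` is noisy, but averaging many samples with `averaged_forward_gradient(f, [1.0, 2.0], 20000)` should land close to that value, which is what an unbiased estimate means in practice.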
Authors
Atılım Güneş Baydin, Barak A. Pearlmutter, Don Syme, Frank Wood, Philip Torr