Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits
We present a theoretical study of hyperparameter estimation for Gaussian processes (GPs) using minibatch stochastic gradient descent (SGD).
We prove that minibatch SGD converges to a critical point of the full log-likelihood loss function and recovers the model hyperparameters at rate $O(1/K)$ for $K$ iterations, up to a statistical error term depending on the minibatch size.
Our theoretical guarantees hold provided that the kernel functions exhibit exponential or polynomial eigendecay, a condition satisfied by a wide range of kernels commonly used in GPs.
Numerical studies on both simulated and real datasets demonstrate that minibatch SGD generalizes better than state-of-the-art methods while reducing the computational burden, opening a new, previously unexplored, data-size regime for GPs.
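To make the setup concrete, the following is a minimal sketch (not the authors' implementation) of minibatch SGD on the GP negative log marginal likelihood, assuming a squared-exponential (RBF) kernel and a log-parameterization of the hyperparameters; all names, the learning rate, and the batch size are illustrative.

    import jax
    import jax.numpy as jnp

    def rbf_kernel(X1, X2, lengthscale, variance):
        # Squared-exponential kernel, one of the exponentially
        # eigendecaying kernels covered by the theory.
        sq_dists = jnp.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
        return variance * jnp.exp(-0.5 * sq_dists / lengthscale ** 2)

    def batch_nll(log_params, X, y):
        # Negative log marginal likelihood of a zero-mean GP on one minibatch.
        lengthscale, variance, noise = jnp.exp(log_params)  # keep params positive
        K = rbf_kernel(X, X, lengthscale, variance) + noise * jnp.eye(y.shape[0])
        L = jnp.linalg.cholesky(K)
        alpha = jax.scipy.linalg.cho_solve((L, True), y)
        return (0.5 * y @ alpha + jnp.sum(jnp.log(jnp.diag(L)))
                + 0.5 * y.shape[0] * jnp.log(2.0 * jnp.pi))

    grad_nll = jax.jit(jax.grad(batch_nll))

    def minibatch_sgd(X, y, key, batch_size=64, num_iters=1000, lr=1e-2):
        log_params = jnp.zeros(3)  # log lengthscale, log variance, log noise
        for _ in range(num_iters):
            key, subkey = jax.random.split(key)
            idx = jax.random.choice(subkey, X.shape[0], (batch_size,), replace=False)
            log_params = log_params - lr * grad_nll(log_params, X[idx], y[idx])
        return jnp.exp(log_params)  # estimated hyperparameters

Each step in this sketch costs O(m^3) for minibatch size m, rather than the O(n^3) of full-batch maximum likelihood over all n observations, which is the source of the computational savings the abstract refers to.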
Authors
Hao Chen, Lili Zheng, Raed Al Kontar, Garvesh Raskutti