On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes