Keep Up With the Latest Trending Papers: Computer Science, AI, Machine Learning, and more.

Top Papers in Stochastic Gradient Descent


RSGDA: A Randomized Variant of the Epoch Gradient Descent Ascent Algorithm

Randomized Stochastic Gradient Descent Ascent


Scalable Estimation and Inference with Large-scale or Online Survival Data

With the rapid development of data collection and aggregation technologies in many scientific disciplines, it is becoming increasingly ubiquitous to conduct large-scale or online regression to analyze…


Almost-Sure Convergence Rates of Gradient Descent

Accelerated Almost-Sure Convergence Rates for Nonconvex Stochastic Gradient Descent using Stochastic Learning Rates


Federated Accelerated Stochastic Gradient Descent

We propose Federated Accelerated Stochastic Gradient Descent (FedAc), a
principled acceleration of Federated Averaging (FedAvg, also known as Local
SGD) for distributed optimization. FedAc is the first…
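For context on the entry above: in FedAvg / Local SGD, the baseline that FedAc accelerates, each client runs a few local SGD steps on its own data before the server averages the client models. A minimal sketch on synthetic quadratics (the data, client count, step size, and schedule here are illustrative assumptions, and this is plain FedAvg, not FedAc's acceleration):

```python
import numpy as np

rng = np.random.default_rng(6)

# FedAvg / Local SGD sketch: each client takes local SGD steps, then the
# server averages the resulting models.
clients, d, rounds, local_steps, eta = 8, 4, 100, 5, 0.1
w_star = rng.normal(size=d)
# Client k holds quadratic data: f_k(w) = sum_i (A[k,i] @ (w - w_star))^2 / 2
A = rng.normal(size=(clients, 20, d))

w = np.zeros(d)
for _ in range(rounds):
    local = []
    for k in range(clients):
        wk = w.copy()
        for _ in range(local_steps):
            i = rng.integers(20)                     # sample one local row
            g = A[k, i] * (A[k, i] @ (wk - w_star))  # stochastic gradient
            wk -= eta * g
        local.append(wk)
    w = np.mean(local, axis=0)                       # server-side averaging

print(np.linalg.norm(w - w_star))                    # distance to the shared optimum
```

Because every client's objective is minimized at the same `w_star`, plain averaging already drives the error to zero here; FedAc's contribution (per the title) is a provably faster schedule, which this sketch does not model.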


Benign Underfitting of Stochastic Gradient Descent

We study to what extent may stochastic gradient descent (SGD) be understood
as a "conventional" learning rule that achieves generalization performance by
obtaining a good fit to training data. …


Stochastic gradient descent on Riemannian manifolds

Stochastic gradient descent is a simple approach to find the local minima of
a cost function whose evaluations are corrupted by noise. In this paper, we
develop a procedure extending stochastic gradient…
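The snippet is cut off before the construction, but the standard recipe for SGD on a manifold is: take a stochastic gradient step in the tangent space, then map back onto the manifold (a retraction). A minimal sketch on the unit sphere, estimating the top eigenvector of a covariance matrix from a stream of samples (the data and step-size schedule are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stream of Gaussian samples whose covariance has a known top eigenvector e1.
d = 5
cov = np.diag([5.0, 1.0, 1.0, 1.0, 1.0])
top = np.eye(d)[0]
Z = rng.multivariate_normal(np.zeros(d), cov, size=5000)

x = rng.normal(size=d)
x /= np.linalg.norm(x)                   # start on the unit sphere

for t, z in enumerate(Z, start=1):
    g = -2.0 * (z @ x) * z               # Euclidean gradient of f(x) = -(z @ x)^2
    g_tan = g - (g @ x) * x              # project onto the tangent space at x
    x = x - g_tan / (t + 10)             # SGD step with a decaying rate
    x /= np.linalg.norm(x)               # retract back onto the sphere

print(abs(x @ top))                      # alignment with the true top eigenvector
```

The tangent-space projection and the renormalization are what make this a Riemannian method rather than plain SGD in the ambient space.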


Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive Step Size

Establishing a fast rate of convergence for optimization methods is crucial
to their applicability in practice. With the increasing popularity of deep
learning over the past decade, stochastic gradient…
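The abstract is truncated before the step-size rule, so here is one well-known adaptive choice in this spirit, the stochastic Polyak step size η = f_i(w) / ‖∇f_i(w)‖², on an interpolating least-squares problem; both the rule and the setup are illustrative, not necessarily the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

# Interpolating least squares: b = A @ w_star exactly, so every per-sample
# loss f_i can be driven to zero, the regime where Polyak steps shine.
n, d = 200, 10
A = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
b = A @ w_star

w = np.zeros(d)
for _ in range(3000):
    i = rng.integers(n)
    r = A[i] @ w - b[i]                      # residual on the sampled row
    loss_i = 0.5 * r * r
    grad_i = r * A[i]
    # Stochastic Polyak-type step: eta = f_i(w) / ||grad f_i(w)||^2
    eta = loss_i / (grad_i @ grad_i + 1e-12)
    w = w - eta * grad_i

print(np.linalg.norm(w - w_star))            # near zero: fast convergence
```

For this quadratic loss the Polyak step reduces to a (half-step) randomized Kaczmarz update, which converges linearly on consistent systems.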


Gradient Descent for Force-Directed Graph Drawing

Graph Drawing by Stochastic Gradient Descent


Global Convergence of the Gradient Descent Algorithm for a Class of One Hidden Layer Feed-Forward Neural Networks

Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks


Scaling Sample Complexity in Neural Networks

Is Stochastic Gradient Descent Near Optimal?


SGD Training without a Full Data Shuffle

Stochastic Gradient Descent without Full Data Shuffle


Analysis of Stochastic Gradient Descent in Continuous Time

Stochastic gradient descent is an optimisation method that combines classical
gradient descent with random subsampling within the target functional. In this
work, we introduce the stochastic gradient…
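A common continuous-time model for SGD, in the spirit of the entry above, is a diffusion whose drift is the negative gradient and whose noise reflects subsampling; an Euler–Maruyama simulation on a simple quadratic makes the stationary behaviour visible (the drift, noise level, and discretization below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Euler-Maruyama simulation of a diffusion often used to model SGD:
#   dW_t = -f'(W_t) dt + sigma dB_t,   with f(w) = w^2 / 2.
dt, sigma, steps, paths = 0.01, 0.3, 2000, 500
w = np.full(paths, 2.0)                 # every path starts at w = 2

for _ in range(steps):
    drift = -w                          # -f'(w) for the quadratic
    w = w + drift * dt + sigma * np.sqrt(dt) * rng.normal(size=paths)

# This is an Ornstein-Uhlenbeck process; its stationary law is N(0, sigma^2/2),
# and after a long run the empirical moments match it.
mean_w, var_w = w.mean(), w.var()
print(mean_w, var_w)
```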


Learning-Rate Optimization Algorithms

Stochastic Learning Rate Optimization in the Stochastic Approximation and Online Learning Settings


Convergence Rates for Stochastic Approximation on a Boundary

We analyze the behavior of projected stochastic gradient descent focusing on
the case where the optimum is on the boundary of the constraint set and the
gradient does not vanish at the optimum. …
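The boundary regime the abstract describes is easy to reproduce in one dimension: constrain the iterates to [1, ∞) while the unconstrained optimum sits at 0, so the constrained optimum lies on the boundary with a non-vanishing gradient (the objective and step schedule are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(3)

# Minimize E[(w - Z)^2] with Z ~ N(0, 1), subject to w >= 1.
# The unconstrained optimum is w = 0, so the constrained optimum is the
# boundary point w = 1, where the expected gradient 2(w - E[Z]) = 2 != 0.
w = 3.0
for t in range(1, 20_001):
    z = rng.normal()
    grad = 2.0 * (w - z)             # stochastic gradient of (w - z)^2
    w = max(1.0, w - grad / t)       # gradient step, then project onto [1, inf)

print(w)                              # pinned at the boundary
```

Because the drift always points into the constraint, the projection is active almost every step, which is exactly the setting where boundary-specific convergence rates matter.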


Stochastic Gradient Descent as Approximate Bayesian Inference

Stochastic Gradient Descent with a constant learning rate (constant SGD)
simulates a Markov chain with a stationary distribution. With this perspective,
we derive several new results. …
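The Markov-chain view is easy to see numerically: with a constant learning rate on a noisy quadratic, the iterates stop converging and instead fluctuate around the optimum with a stationary variance set by the step size (the toy objective and noise model below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)

def constant_sgd_var(eta, steps=200_000):
    """Stationary variance of constant-rate SGD on f(w) = w^2 / 2
    with additive unit-variance gradient noise."""
    w, tail = 0.0, []
    noise = rng.normal(size=steps)
    for t in range(steps):
        w -= eta * (w + noise[t])     # noisy gradient: f'(w) + xi_t
        if t >= steps // 2:           # discard burn-in, keep the chain's tail
            tail.append(w)
    return float(np.var(tail))

# For this chain the stationary variance is eta / (2 - eta) ~ eta / 2, so
# halving the learning rate roughly halves the fluctuations.
v_large, v_small = constant_sgd_var(0.1), constant_sgd_var(0.05)
print(v_large, v_small)
```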


The effective noise of Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is the workhorse algorithm of deep learning
technology. At each step of the training phase, a mini-batch of samples is
drawn from the training dataset and the weights…
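One way to see the "effective noise" of mini-batch SGD is to measure the gap between the mini-batch gradient and the full-batch gradient: its mean squared norm shrinks roughly like 1/B with batch size B. A sketch on a toy regression problem (the model, batch sizes, and sample counts are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy linear-regression dataset.
n, d = 10_000, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)
w = np.zeros(d)

def grad(rows):
    """Mean-squared-error gradient over the given rows."""
    r = X[rows] @ w - y[rows]
    return 2.0 * X[rows].T @ r / len(rows)

full = grad(np.arange(n))
noise_sq = {}
for B in (16, 256):
    # "Effective noise": mini-batch gradient minus the full-batch gradient.
    gaps = [grad(rng.choice(n, size=B, replace=False)) - full
            for _ in range(500)]
    noise_sq[B] = float(np.mean([g @ g for g in gaps]))

print(noise_sq[16] / noise_sq[256])    # roughly 256 / 16 = 16
```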


Fluctuation-dissipation relations for stochastic gradient descent

The notion of the stationary equilibrium ensemble has played a central role
in statistical mechanics. In machine learning as well, training serves as
generalized equilibration that drives the probability…


Private Weighted Random Walk Stochastic Gradient Descent

We consider a decentralized learning setting in which data is distributed
over nodes in a graph. The goal is to learn a global model on the distributed
data without involving any central entity…
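The basic random-walk SGD protocol is simple to sketch: a single model is carried by a walker that updates it with the current node's local data and then moves to a random neighbour. The ring topology, objective, and step schedule below are illustrative, and the paper's weighting and privacy mechanisms are not modelled:

```python
import numpy as np

rng = np.random.default_rng(7)

# Random-walk SGD on a ring of nodes. Node k's local objective is
# ||w - targets[k]||^2 / 2, so the global optimum is the mean target.
nodes, d = 10, 3
targets = rng.normal(size=(nodes, d))
optimum = targets.mean(axis=0)

w = np.zeros(d)
node = 0
for t in range(1, 50_001):
    g = w - targets[node]                         # local stochastic gradient
    w -= (1.0 / t) * g                            # decaying-step SGD update
    node = (node + rng.choice([-1, 1])) % nodes   # walk to a random neighbour

print(np.linalg.norm(w - optimum))                # close to the global optimum
```

Because the walk's stationary distribution is uniform over the ring, the iterate averages the nodes' objectives over time and approaches the global optimum without any central coordinator.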


Uniform Boundedness Properties of Gradient Descent

On Uniform Boundedness Properties of SGD and its Momentum Variants


Gradient Estimators for Categorical Data

Stochastic gradient descent with gradient estimator for categorical features
