We revisit the asymptotically optimal quantum linear system solver of Childs, Kothari, and Somma from the perspective of convex optimization, and in particular gradient descent-type algorithms.
We first show how the asymptotically optimal quantum linear system solver of Childs, Kothari, and Somma is related to the gradient descent algorithm on the associated convex function: their linear system solver is based on a truncation, in the Chebyshev basis, of the polynomial (in the matrix of the linear system) that maps the initial solution to the iterate in the basic gradient descent algorithm.
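As a concrete point of reference, the sketch below runs plain (classical) gradient descent on the quadratic f(x) = 1/2 ||Ax - b||^2, a natural convex function for a linear system and an assumption consistent with the text, and checks that with x0 = 0 the k-th iterate is a fixed polynomial in the matrix applied to the right-hand side. The matrix, right-hand side, and step size are illustrative choices.

```python
# A minimal classical sketch (not the quantum algorithm): gradient descent on
# f(x) = 1/2 ||A x - b||^2, and a check that with x0 = 0 the k-th iterate is a
# polynomial in A^T A applied to A^T b.
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)            # well-conditioned SPD matrix (illustrative choice)
b = rng.normal(size=n)

S = A.T @ A
eta = 1.0 / np.linalg.norm(S, 2)       # step size ensuring convergence for this quadratic
x = np.zeros(n)
iters = 50
for _ in range(iters):
    x = x - eta * A.T @ (A @ x - b)    # gradient of 1/2 ||Ax - b||^2 is A^T (A x - b)

# With x0 = 0, x_k = eta * sum_{j<k} (I - eta S)^j A^T b, a polynomial in S applied to A^T b.
P = eta * sum(np.linalg.matrix_power(np.eye(n) - eta * S, j) for j in range(iters))
x_poly = P @ A.T @ b
print(np.linalg.norm(x - x_poly))      # numerically zero: the two expressions agree
print(np.linalg.norm(A @ x - b))       # residual after 50 iterations
```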
Many machine learning problems encode their data as a matrix with a possibly very large number of rows and columns.
In several applications like neuroscience, image compression or deep reinforcement learning, the principal subspace of such a matrix provides a useful, low-dimensional representation of individual data.
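A minimal sketch of this idea, using a plain truncated SVD as the baseline method; the data, the target dimension k, and the use of SVD are illustrative assumptions, not any particular paper's algorithm.

```python
# A minimal sketch: project high-dimensional rows of a data matrix onto its
# top-k principal subspace via a truncated SVD (a standard baseline).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 50))        # data matrix: 1000 samples, 50 features
X = X - X.mean(axis=0)                 # center the columns

k = 5                                  # target subspace dimension (assumption)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
subspace = Vt[:k].T                    # 50 x k orthonormal basis of the principal subspace
low_dim = X @ subspace                 # 1000 x k low-dimensional representation
print(low_dim.shape)
```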
We propose a new stochastic gradient descent algorithm for finding the global
optimizer of nonconvex optimization problems, referred to here as "AdaVar". A
key component in the algorithm is the adaptive
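A generic illustration of the underlying idea, SGD with injected noise whose level is gradually reduced; the objective, step size, and noise schedule below are assumptions made for illustration and are not the AdaVar algorithm itself.

```python
# A generic illustration (not the AdaVar schedule): stochastic gradient descent
# with injected Gaussian noise whose level is gradually decreased, so early
# iterations can hop out of a shallow local minimum and later iterations settle.
import numpy as np

rng = np.random.default_rng(8)

f = lambda x: x ** 4 - x ** 2 + 0.5 * x        # global minimum near x = -0.81,
grad = lambda x: 4 * x ** 3 - 2 * x + 0.5      # shallow local minimum near x = 0.5

x, eta = 1.2, 0.01                             # start in the basin of the local minimum
for k in range(1, 5001):
    sigma = 1.0 / k ** 0.25                    # slowly decaying noise level (illustrative)
    x = x - eta * grad(x) + np.sqrt(eta) * sigma * rng.normal()

print(x, f(x))   # typically ends near the global minimizer at about -0.81, while
                 # noiseless gradient descent from x = 1.2 stalls near 0.5
```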
The gradient descent approach, an optimization algorithm for finding a local minimum of an objective function, is a key ingredient in variational quantum algorithms and machine learning tasks.
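For reference, a minimal sketch of plain gradient descent locating a local minimum of a smooth objective; the objective, starting point, and step size are illustrative choices.

```python
# A minimal sketch of plain gradient descent on a smooth objective; the
# quadratic objective and step size are illustrative choices.
import numpy as np

def objective(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def gradient(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)])

x = np.array([5.0, 5.0])   # initial guess
eta = 0.1                  # fixed step size (assumption)
for _ in range(200):
    x = x - eta * gradient(x)
print(x, objective(x))     # converges to the minimizer (1, -0.5)
```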
It is known that when the statistical models are singular, i.e., the Fisher
information matrix at the true parameter is degenerate, the fixed step-size
gradient descent algorithm takes a polynomial number of iterations
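A toy illustration of this phenomenon, not the paper's statistical model: when the second derivative at the minimizer is degenerate, fixed step-size gradient descent slows from geometric to polynomial convergence.

```python
# Toy comparison: f(x) = x^2 (non-degenerate curvature at 0) versus
# g(x) = x^4 (degenerate curvature at 0) under fixed step-size gradient descent.
import numpy as np

def run_gd(grad, x0, eta, iters):
    x = x0
    for _ in range(iters):
        x = x - eta * grad(x)
    return x

x0, eta, iters = 1.0, 0.1, 1000
x_quad = run_gd(lambda x: 2 * x, x0, eta, iters)        # f(x) = x^2
x_quart = run_gd(lambda x: 4 * x ** 3, x0, eta, iters)  # g(x) = x^4

print(abs(x_quad))    # ~1e-97: geometric (linear) convergence
print(abs(x_quart))   # ~0.03: only polynomial decay, roughly O(iters^{-1/2})
```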
We propose a neural network approach to model general interaction dynamics
and an adjoint-based stochastic gradient descent algorithm to calibrate its
parameters. The parameter calibration problem is
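A minimal sketch of the calibration idea under strong simplifying assumptions: a toy pairwise-interaction dynamics with a single unknown strength parameter is differentiated through with PyTorch autograd, which here plays the role of the adjoint computation, and fitted by SGD. The model, data, and hyperparameters are illustrative, not the paper's.

```python
# Illustrative sketch: calibrate one parameter of a simple interaction dynamics
# by differentiating through the simulated trajectory and updating with SGD.
import torch

torch.manual_seed(0)
n_particles, n_steps, dt = 8, 20, 0.05
x0 = torch.randn(n_particles, 2)

def simulate(x0, strength):
    # attraction toward the mean position, scaled by the unknown parameter
    x = x0
    for _ in range(n_steps):
        x = x + dt * strength * (x.mean(dim=0, keepdim=True) - x)
    return x

with torch.no_grad():
    target = simulate(x0, torch.tensor(1.5))   # synthetic "observed" trajectory endpoint

strength = torch.tensor(0.1, requires_grad=True)
opt = torch.optim.SGD([strength], lr=0.5)
for step in range(200):
    opt.zero_grad()
    loss = ((simulate(x0, strength) - target) ** 2).mean()
    loss.backward()                            # adjoint-style gradient of the trajectory loss
    opt.step()
print(float(strength))                         # recovers a value close to 1.5
```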
We study the statistical and computational complexities of the Polyak step
size gradient descent algorithm under generalized smoothness and Lojasiewicz
conditions of the population loss function, name
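For concreteness, a minimal sketch of the Polyak step size rule eta_k = (f(x_k) - f*) / ||grad f(x_k)||^2 on a simple convex quadratic; the objective is an illustrative choice and the optimal value f* is assumed known.

```python
# Polyak step size gradient descent on an ill-conditioned quadratic
# (illustrative; not the paper's statistical setting). Assumes f_star is known.
import numpy as np

A = np.diag([1.0, 10.0])            # f(x) = 1/2 x^T A x, minimized at the origin
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
f_star = 0.0

x = np.array([3.0, -2.0])
for _ in range(500):
    g = grad(x)
    if np.dot(g, g) < 1e-20:
        break
    eta = (f(x) - f_star) / np.dot(g, g)   # Polyak step size
    x = x - eta * g
print(x, f(x))                              # close to the minimizer at the origin
```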
Stochastic gradient descent is a simple approach to find the local minima of
a cost function whose evaluations are corrupted by noise. In this paper, we
develop a procedure extending stochastic gradient descent
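A generic sketch of this setting, not the paper's procedure: stochastic gradient descent where the gradient is estimated from noisy function evaluations via symmetric finite differences, with a decaying step size.

```python
# Generic sketch: SGD driven by gradients estimated from noisy cost evaluations
# with a two-point finite-difference scheme; constants are illustrative.
import numpy as np

rng = np.random.default_rng(2)

def noisy_eval(x, sigma=0.05):
    # true cost is ||x - 1||^2; each evaluation is corrupted by Gaussian noise
    return np.sum((x - 1.0) ** 2) + sigma * rng.normal()

def estimated_gradient(x, h=0.1):
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (noisy_eval(x + e) - noisy_eval(x - e)) / (2 * h)
    return g

x = np.zeros(3)
for k in range(1, 2001):
    eta = 0.5 / k ** 0.7          # decaying step size (illustrative choice)
    x = x - eta * estimated_gradient(x)
print(x)                           # approaches the minimizer (1, 1, 1)
```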
Ptychography is a promising phase retrieval technique for label-free
quantitative phase imaging. Recent advances in phase retrieval algorithms
have witnessed the development of spectral methods, in order to
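A minimal sketch of a generic spectral initializer for phase retrieval, not the specific ptychographic variant: the leading eigenvector of the measurement-weighted matrix (1/m) sum_i y_i a_i a_i^T is correlated with the unknown signal. The Gaussian sensing model and problem sizes are illustrative assumptions.

```python
# Generic spectral initialization for phase retrieval (illustrative sketch).
import numpy as np

rng = np.random.default_rng(3)
n, m = 50, 2000
x_true = rng.normal(size=n)
x_true /= np.linalg.norm(x_true)

A = rng.normal(size=(m, n))                 # Gaussian sensing vectors a_i (rows)
y = (A @ x_true) ** 2                       # intensity-only measurements

Y = (A.T * y) @ A / m                       # (1/m) sum_i y_i a_i a_i^T
eigvals, eigvecs = np.linalg.eigh(Y)
x_spec = eigvecs[:, -1]                     # leading eigenvector as the initial guess

print(abs(x_spec @ x_true))                 # correlation close to 1
```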
We develop a variational algorithm for estimating escape (least improbable or first passage) paths for a generic stochastic chemical reaction network that exhibits multiple fixed points.
The design of our algorithm is such that it is independent of the underlying dimensionality of the system, the discretization control parameters are updated towards the continuum limit, and there is an easy-to-calculate measure for the correctness of its solution.
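A much-simplified illustration of the general idea, not the paper's algorithm and without its dimension-independence or correctness measure: discretize a path between the two stable fixed points of a 1-D double-well drift and decrease a Freidlin-Wentzell-type action by gradient descent, holding the endpoints fixed. The drift, discretization, and step size are assumptions.

```python
# Simplified minimum-action-path sketch for a 1-D double-well drift.
import numpy as np

def drift(x):
    return x - x ** 3                       # stable fixed points at -1 and +1

K, dt = 100, 0.05
path = np.linspace(-1.0, 1.0, K + 1)        # initial path joining the two fixed points

def action(p):
    # discretized Freidlin-Wentzell-type action of the path
    v = (p[1:] - p[:-1]) / dt - drift(p[:-1])
    return np.sum(v ** 2) * dt

def action_grad(p, eps=1e-6):
    # numerical gradient with respect to interior points (endpoints stay fixed)
    g = np.zeros_like(p)
    for k in range(1, K):
        e = np.zeros_like(p); e[k] = eps
        g[k] = (action(p + e) - action(p - e)) / (2 * eps)
    return g

eta = 0.005
for _ in range(1000):
    path = path - eta * action_grad(path)

print(action(np.linspace(-1.0, 1.0, K + 1)), action(path))   # the action decreases
```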
This paper studies the statistical model of the non-centered mixture of
scaled Gaussian distributions (NC-MSG). Using the Fisher-Rao information
geometry associated with this distribution, we derive a Riemannian
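For orientation, a generic sketch of Riemannian gradient descent on the simplest possible manifold, the unit sphere, rather than the Fisher-Rao geometry of the NC-MSG model: the Euclidean gradient is projected onto the tangent space and the iterate is retracted back to the sphere. The Rayleigh-quotient objective is an illustrative choice.

```python
# Generic Riemannian gradient descent on the unit sphere: minimize x^T A x
# subject to ||x|| = 1; the minimizer is the eigenvector of the smallest eigenvalue.
import numpy as np

rng = np.random.default_rng(4)
A = np.diag(np.arange(1.0, 11.0))             # eigenvalues 1, 2, ..., 10

x = rng.normal(size=10)
x /= np.linalg.norm(x)
eta = 0.02
for _ in range(5000):
    egrad = 2 * A @ x                         # Euclidean gradient of x^T A x
    rgrad = egrad - (x @ egrad) * x           # project onto the tangent space at x
    x = x - eta * rgrad
    x /= np.linalg.norm(x)                    # retraction back to the sphere

print(x @ A @ x, np.linalg.eigvalsh(A)[0])    # objective approaches the smallest eigenvalue
```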
Information geometry applies concepts in differential geometry to probability
and statistics and is especially useful for parameter estimation in exponential
families where parameters are known to lie
Gradient descent finds a global minimum in training deep neural networks
despite the objective function being non-convex. The current paper proves
that gradient descent achieves zero training loss in polynomial time
We investigate uniform boundedness properties of iterates and function values along the trajectories of the stochastic gradient descent algorithm and its important momentum variant.
Under smoothness and suitable additional conditions on the loss function, we show that broad families of step-sizes, including the widely used step-decay and cosine with (or without) restart step-sizes, result in uniformly bounded iterates and function values.
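For concreteness, a minimal sketch of two of the step-size families mentioned above, step-decay and cosine annealing with optional restarts; the constants (initial step, decay factor, period) are illustrative choices.

```python
# Step-decay and cosine (with optional warm restarts) step-size schedules.
import math

def step_decay(k, eta0=0.1, drop=0.5, every=30):
    # multiply the step size by `drop` every `every` iterations
    return eta0 * (drop ** (k // every))

def cosine(k, eta0=0.1, total=100, restart=None):
    # cosine annealing from eta0 toward 0; with `restart`, the schedule repeats
    period = restart if restart is not None else total
    t = k % period
    return 0.5 * eta0 * (1 + math.cos(math.pi * t / period))

print([round(step_decay(k), 4) for k in (0, 30, 60, 90)])
print([round(cosine(k, total=100), 4) for k in (0, 50, 99)])
print([round(cosine(k, restart=50), 4) for k in (0, 49, 50, 99)])
```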
Decentralized stochastic optimization is the basic building block of modern
collaborative machine learning, distributed estimation and control, and
large-scale sensing. Since the data involved usually contain
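A generic sketch of decentralized SGD with gossip averaging on a ring of four nodes, illustrative rather than any specific paper's method: each node takes a local stochastic gradient step and then averages its iterate with its neighbours through a doubly stochastic mixing matrix.

```python
# Decentralized SGD sketch: local noisy gradient step followed by gossip averaging.
import numpy as np

rng = np.random.default_rng(5)
n_nodes, dim = 4, 3
targets = rng.normal(size=(n_nodes, dim))     # node i holds loss ||x - targets[i]||^2
x = np.zeros((n_nodes, dim))                  # one local iterate per node

# doubly stochastic mixing matrix for a ring (self weight 1/2, neighbours 1/4 each)
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])

eta = 0.05
for _ in range(500):
    grads = 2 * (x - targets) + 0.1 * rng.normal(size=x.shape)   # noisy local gradients
    x = W @ (x - eta * grads)                                    # local step, then gossip averaging

print(x.mean(axis=0), targets.mean(axis=0))   # consensus iterate is near the minimizer of the average loss
```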
Large scale nonlinear classification is a challenging task in the field of
support vector machines. Online random Fourier feature map algorithms are
important methods for dealing with large scale
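A minimal sketch of the random Fourier feature map approximating the RBF kernel, after Rahimi and Recht; the feature dimension and bandwidth are illustrative choices. A linear classifier trained online on these features then approximates a kernel SVM.

```python
# Random Fourier feature map approximating k(x, y) = exp(-||x - y||^2 / (2 sigma^2)).
import numpy as np

rng = np.random.default_rng(6)
d, D, sigma = 5, 2000, 1.0

W = rng.normal(scale=1.0 / sigma, size=(D, d))   # frequencies ~ N(0, sigma^{-2} I)
b = rng.uniform(0, 2 * np.pi, size=D)            # random phases

def z(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)  # random Fourier feature map

x = rng.normal(size=d)
y = x + 0.3 * rng.normal(size=d)
exact = np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))
approx = z(x) @ z(y)
print(exact, approx)                             # the two values are close
```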
We present a coupled system of ODEs which, when discretized with a constant
time step/learning rate, recovers Nesterov's accelerated gradient descent
algorithm. The same ODEs, when discretized with a
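For reference, a minimal sketch of the standard constant-step Nesterov accelerated gradient iteration on a convex quadratic; the objective and step size are illustrative choices, not taken from the ODE formulation above.

```python
# Nesterov's accelerated gradient descent with a constant step size 1/L.
import numpy as np

A = np.diag([1.0, 100.0])              # ill-conditioned quadratic f(x) = 1/2 x^T A x
grad = lambda x: A @ x
L = 100.0                              # Lipschitz constant of the gradient
eta = 1.0 / L

x = np.array([1.0, 1.0])
y = x.copy()
for k in range(1, 301):
    x_new = y - eta * grad(y)                          # gradient step at the extrapolated point
    y = x_new + (k - 1.0) / (k + 2.0) * (x_new - x)    # Nesterov momentum/extrapolation
    x = x_new

print(x, 0.5 * x @ A @ x)   # noticeably closer to the minimizer than 300 steps of
                            # plain gradient descent with the same step size
```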
We study the convergence properties of a deterministic score-based sampling method named kernel Stein discrepancy (KSD) descent, which uses a set of particles to approximate a target probability distribution known up to a normalization constant.
Remarkably, owing to a tractable loss function, KSD descent can leverage robust parameter-free optimization schemes such as L-BFGS.
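A small self-contained sketch of this setup under simplifying assumptions: KSD descent for a standard 2-D Gaussian target (score s(x) = -x) with an RBF kernel, optimized by SciPy's L-BFGS routine. The particle count, bandwidth, and the use of finite-difference gradients are illustrative choices.

```python
# KSD descent sketch: minimize the squared kernel Stein discrepancy of a set of
# particles to a standard Gaussian target, using L-BFGS from SciPy.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
n, d, h = 30, 2, 1.0                       # particles, dimension, kernel bandwidth

def ksd2(flat):
    X = flat.reshape(n, d)
    diff = X[:, None, :] - X[None, :, :]   # pairwise differences x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)
    K = np.exp(-sq / (2 * h ** 2))         # RBF kernel matrix
    inner = X @ X.T                        # <s(x_i), s(x_j)> for the Gaussian score s(x) = -x
    # Stein kernel for the RBF kernel and Gaussian score
    kp = K * (d / h ** 2 - sq / h ** 4 - sq / h ** 2 + inner)
    return kp.mean()

x0 = rng.uniform(-4, 4, size=n * d)        # badly placed initial particles
res = minimize(ksd2, x0, method="L-BFGS-B")
particles = res.x.reshape(n, d)
print(ksd2(x0), res.fun)                    # the KSD objective decreases
print(particles.mean(axis=0))               # particle mean moves toward the target mean (0, 0)
```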
In this paper, we consider the problem of phase retrieval, which consists of
recovering an $n$-dimensional real vector from the magnitude of its $m$ linear
measurements. We propose a mirror descent (or Bregman gradient descent)
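A generic sketch of mirror descent itself, not the phase-retrieval variant proposed above: entropic mirror descent (exponentiated gradient) on the probability simplex, where the mirror map is the negative entropy and the update is multiplicative followed by renormalization. The linear objective and step size are illustrative choices.

```python
# Entropic mirror descent (exponentiated gradient) on the probability simplex.
import numpy as np

c = np.array([0.3, 1.0, 2.0, 0.1])            # minimize <c, x> over the simplex
x = np.full(4, 0.25)                          # start at the uniform distribution
eta = 0.5

for _ in range(200):
    grad = c                                  # gradient of <c, x>
    x = x * np.exp(-eta * grad)               # mirror (multiplicative) update
    x = x / x.sum()                           # Bregman projection back onto the simplex

print(x)                                      # mass concentrates on the smallest entry of c (index 3)
```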