Marginal Uncertainty in Linear Models and Deep Neural Networks

A Bayesian Perspective on Training Speed and Model Selection

Taking a Bayesian perspective on training speed and model selection provides two major insights. First, a measure of a model's training speed can be used to estimate its marginal likelihood. Second, this measure, under certain conditions, predicts the relative weighting of models in linear model combinations trained to minimize a regression loss. We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks trained with stochastic gradient descent. Our results suggest a promising new direction towards explaining why neural networks trained with stochastic gradient descent are biased towards functions that generalize well.
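
One way to build intuition for the first insight is the chain rule of probability: the log marginal likelihood of a dataset equals the sum of one-step-ahead log predictive densities, log p(y_1, ..., y_n) = sum_i log p(y_i | y_1, ..., y_{i-1}). Each term scores how well the model predicts a point before seeing it, so a model that fits the data quickly accumulates a larger sum. The following is a minimal sketch of this identity for Bayesian linear regression, not the authors' implementation. It assumes a zero-mean isotropic Gaussian prior on the weights and Gaussian observation noise; the function names, `prior_var`, `noise_var`, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np


def log_evidence_direct(X, y, prior_var=1.0, noise_var=0.1):
    """log p(y | X) for y = X w + eps, w ~ N(0, prior_var I), eps ~ N(0, noise_var I)."""
    n = X.shape[0]
    # Marginally, y is Gaussian with zero mean and this covariance.
    K = prior_var * X @ X.T + noise_var * np.eye(n)
    _, logdet = np.linalg.slogdet(K)
    quad = y @ np.linalg.solve(K, y)
    return -0.5 * (quad + logdet + n * np.log(2 * np.pi))


def log_evidence_sequential(X, y, prior_var=1.0, noise_var=0.1):
    """Sum of one-step-ahead log predictive densities under exact Bayesian updating."""
    d = X.shape[1]
    mean = np.zeros(d)            # current posterior mean of w
    cov = prior_var * np.eye(d)   # current posterior covariance of w
    total = 0.0
    for x_i, y_i in zip(X, y):
        # Predictive distribution of y_i given all earlier observations.
        pred_mean = x_i @ mean
        pred_var = x_i @ cov @ x_i + noise_var
        resid = y_i - pred_mean
        total += -0.5 * (np.log(2 * np.pi * pred_var) + resid ** 2 / pred_var)
        # Rank-one posterior update after observing (x_i, y_i).
        gain = cov @ x_i / pred_var
        mean = mean + gain * resid
        cov = cov - np.outer(gain, x_i @ cov)
    return total


# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + np.sqrt(0.1) * rng.normal(size=20)

direct = log_evidence_direct(X, y)
sequential = log_evidence_sequential(X, y)
print(direct, sequential)  # the two values agree up to floating-point error
```

The sequential form makes the link to training speed visible: the negated terms of the sum are the losses incurred on each new point as the model is updated, so the total resembles an area under a training curve. This sketch covers only exact Bayesian updating in a linear model; whether the same quantity is informative for networks trained with stochastic gradient descent is the empirical question the abstract raises.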