Abstract |
---|
We show that unconverged stochastic gradient descent can be interpreted as a procedure that samples from a nonparametric variational approximate posterior distribution. This distribution is implicitly defined as the transformation of an initial distribution by a sequence of optimization updates. By tracking the change in entropy over this sequence of transformations during optimization, we form a scalable, unbiased estimate of the variational lower bound on the log marginal likelihood. We can use this bound to optimize hyperparameters instead of using cross-validation. This Bayesian interpretation of SGD suggests improved, overfitting-resistant optimization procedures, and gives a theoretical foundation for popular tricks such as early stopping and ensembling. We investigate the properties of this marginal likelihood estimator on neural network models. |
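
The abstract describes the core computation only at a high level: run SGD from a random initialization, and accumulate the change in entropy induced by each optimization update to form a single-sample estimate of the variational lower bound. The sketch below is an illustration of that idea, not the authors' code. It assumes a small Gaussian linear model so that the Hessian of the log joint can be formed exactly, uses a full-batch gradient rather than minibatches, and the names `sgd_elbo_estimate` and `log_joint` are invented for this example. The per-step entropy change is taken to be the log-determinant of the Jacobian of the update map.

```python
# Minimal sketch (assumptions noted above): interpret each gradient step
# theta <- theta + eta * grad(log p(theta, D)) as a deterministic transformation
# of an initial Gaussian, and track the entropy of the implicitly defined
# distribution via log|det(I + eta * H)| at every step.
import numpy as np

def log_joint(theta, X, y, prior_var=1.0):
    """Unnormalized log p(theta, D): Gaussian prior + Gaussian likelihood."""
    resid = y - X @ theta
    return -0.5 * np.sum(resid ** 2) - 0.5 * np.sum(theta ** 2) / prior_var

def grad_log_joint(theta, X, y, prior_var=1.0):
    return X.T @ (y - X @ theta) - theta / prior_var

def hessian_log_joint(theta, X, y, prior_var=1.0):
    return -X.T @ X - np.eye(theta.size) / prior_var

def sgd_elbo_estimate(X, y, eta=0.01, num_steps=100, init_scale=1.0, seed=0):
    """Single-sample estimate of the variational lower bound after T steps:
    log p(theta_T, D) + S_T, where S_T is the entropy of the distribution
    implicitly defined by pushing the initial Gaussian through the updates."""
    rng = np.random.default_rng(seed)
    D = X.shape[1]
    theta = init_scale * rng.standard_normal(D)

    # Entropy of the initial isotropic Gaussian q_0.
    entropy = 0.5 * D * np.log(2 * np.pi * np.e * init_scale ** 2)

    for _ in range(num_steps):
        g = grad_log_joint(theta, X, y)
        H = hessian_log_joint(theta, X, y)

        # Jacobian of theta -> theta + eta * g is I + eta * H; its
        # log|det| is the change in entropy caused by this step.
        _, logdet = np.linalg.slogdet(np.eye(D) + eta * H)
        entropy += logdet

        theta = theta + eta * g

    return log_joint(theta, X, y) + entropy

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)
    print("lower-bound estimate:", sgd_elbo_estimate(X, y))
```

The exact log-determinant above is only feasible because the toy problem is low-dimensional; the abstract's claim that the estimator is scalable implies that, for neural network models, this entropy-change term would be approximated rather than computed exactly.
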
Year | Venue | Field
---|---|---
2015 | CoRR | Early stopping, Stochastic gradient descent, Mathematical optimization, Hyperparameter, Upper and lower bounds, Marginal likelihood, Posterior probability, Nonparametric statistics, Artificial intelligence, Machine learning, Mathematics, Estimator

DocType | Volume | Citations
---|---|---
Journal | abs/1504.01344 | 15

PageRank | References | Authors
---|---|---
0.69 | 12 | 3
Name | Order | Citations | PageRank |
---|---|---|---
Dougal Maclaurin | 1 | 255 | 9.76 |
David K. Duvenaud | 2 | 629 | 32.63 |
Ryan P. Adams | 3 | 15 | 2.04 |