Abstract |
---|
We empirically show that the test error of deep networks can be estimated by simply training the same architecture on the same training set but with a different run of Stochastic Gradient Descent (SGD), and measuring the disagreement rate between the two networks on unlabeled test data. This builds on -- and is a stronger version of -- the observation in Nakkiran & Bansal '20, which requires the second run to be on an altogether fresh training set. We further theoretically show that this peculiar phenomenon arises from the *well-calibrated* nature of *ensembles* of SGD-trained models. This finding not only provides a simple empirical measure to directly predict the test error using unlabeled test data, but also establishes a new conceptual connection between generalization and calibration. |
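The estimator described in the abstract reduces to a simple computation: train the same architecture twice with different SGD seeds, collect each run's predicted labels on unlabeled test inputs, and report the fraction of points where the two runs disagree. A minimal sketch follows; the prediction arrays are synthetic stand-ins (no real models are trained here), and all names are hypothetical.

```python
import numpy as np

def disagreement_rate(preds_a, preds_b):
    """Fraction of unlabeled test points on which two independently
    trained models disagree. Per the abstract, this serves as an
    estimate of the (common) test error of the two models."""
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    return float(np.mean(preds_a != preds_b))

# Hypothetical predicted labels (10 classes) from two SGD runs of the
# same architecture on the same training set; only the seed differs.
rng = np.random.default_rng(0)
preds_run1 = rng.integers(0, 10, size=1000)

# Simulate a second run that agrees with the first on ~90% of points,
# mimicking two runs of a model with ~10% test error.
flip = rng.random(1000) < 0.1
preds_run2 = np.where(flip, (preds_run1 + 1) % 10, preds_run1)

est_error = disagreement_rate(preds_run1, preds_run2)
print(f"estimated test error: {est_error:.3f}")
```

In practice `preds_run1` and `preds_run2` would come from two checkpoints evaluated on the same unlabeled test set; no ground-truth labels are needed, which is the point of the method.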
Year | Venue | Keywords
---|---|---
2022 | International Conference on Learning Representations (ICLR) | Generalization, Deep Learning, Empirical Phenomenon, Accuracy Estimation, Stochastic Gradient Descent
DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34
References | Authors
---|---
0 | 4
Name | Order | Citations | PageRank |
---|---|---|---
Yiding Jiang | 1 | 3 | 2.41 |
Vaishnavh Nagarajan | 2 | 0 | 1.35 |
Christina Baek | 3 | 0 | 0.34 |
J. Zico Kolter | 4 | 1270 | 84.23 |