Title
Parameter Re-Initialization through Cyclical Batch Size Schedules
Abstract
Optimal parameter initialization remains a crucial problem for neural network training. With a poor weight initialization, training may take longer and/or converge to sub-optimal solutions. Here, we propose a method of weight re-initialization by repeated annealing and injection of noise during training. We implement this through a cyclical batch size schedule motivated by a Bayesian perspective on neural network training. We evaluate our method through extensive experiments on tasks in language modeling, natural language inference, and image classification. We demonstrate that our method improves language modeling performance by up to 7.91 perplexity and reduces training iterations by up to 61%, and that it is flexible enough to enable snapshot ensembling and use with adversarial training.
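The cyclical batch size schedule described in the abstract can be sketched as below. The cycle length, batch size range, and geometric growth rule are illustrative assumptions, not the paper's exact hyperparameters: the idea is that a small batch size injects gradient noise (acting as a soft re-initialization) and a growing batch size anneals that noise away within each cycle.

```python
def cyclical_batch_size(step, cycle_len=1000, b_min=128, b_max=2048):
    """Return the batch size for a given training step under a cyclical schedule.

    Within each cycle the batch size grows geometrically from b_min to b_max
    (annealing out gradient noise), then resets to b_min at the start of the
    next cycle (re-injecting noise). All values here are illustrative.
    """
    phase = (step % cycle_len) / cycle_len  # position within the cycle, in [0, 1)
    # Geometric interpolation between b_min and b_max.
    return int(round(b_min * (b_max / b_min) ** phase))
```

With these example defaults, each cycle starts at a batch size of 128 and grows by a factor of 16 over 1000 steps (e.g., 512 at the cycle midpoint), before dropping back to 128 at the next cycle boundary.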
Year
2018
Venue
arXiv: Learning
DocType
Journal
Volume
abs/1812.01216
Citations
0
PageRank
0.34
References
0
Authors
5
Name                 Order  Citations  PageRank
Norman Mu            1      0          0.34
Zhewei Yao           2      311        0.58
Amir Gholami         3      661        2.99
Kurt Keutzer         4      5040       801.67
Michael W. Mahoney   5      3297       218.10