Title
Towards a Theoretical Understanding of Batch Normalization
Abstract
Normalization techniques such as Batch Normalization have been applied very successfully to the training of deep neural networks. Yet, despite their apparent empirical benefits, the reasons behind the success of Batch Normalization remain mostly hypothetical. We thus aim to provide a more thorough theoretical understanding from an optimization perspective. Our main contribution towards this goal is the identification of various problem instances in the realm of machine learning where, under certain assumptions, Batch Normalization can provably accelerate optimization with gradient-based methods. We thereby turn Batch Normalization from an effective practical heuristic into a provably converging algorithm for these settings. Furthermore, we substantiate our analysis with empirical evidence that suggests the validity of our theoretical results in a broader context.
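For context, the Batch Normalization transform that the paper studies normalizes each feature over a mini-batch and then applies a learnable affine map. A minimal NumPy sketch of the standard forward pass (function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch Normalization forward pass over a mini-batch.

    x:           activations of shape (batch, features)
    gamma, beta: learnable per-feature scale and shift, shape (features,)
    eps:         small constant for numerical stability
    """
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # ~zero mean, unit variance
    return gamma * x_hat + beta            # learnable affine transform

# usage: a batch whose features are shifted and scaled away from (0, 1)
x = np.random.randn(64, 8) * 3.0 + 2.0
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
```

With `gamma = 1` and `beta = 0`, each output feature has (approximately) zero mean and unit variance over the batch, which is the reparameterization whose optimization behavior the paper analyzes.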
Year: 2018
Venue: arXiv: Machine Learning
Field: Normalization (statistics), Computer science, Biochemical engineering
DocType:
Volume: abs/1805.10694
Citations: 6
Journal:
PageRank: 0.41
References: 0
Authors: 6
Name                  Order  Citations  PageRank
Jonas Moritz Kohler   1      23         1.72
Hadi Daneshmand       2      10         1.16
Aurelien Lucchi       3      2419       89.45
Ming Zhou             4      4262       251.74
Klaus Neymeyr         5      11         0.98
Thomas Hofmann        6      10064      1001.83