Title
Local Geometry of One-Hidden-Layer Neural Networks for Logistic Regression.
Abstract
We study the local geometry of a one-hidden-layer fully-connected neural network where the training samples are generated from a multi-neuron logistic regression model. We prove that under Gaussian input, the empirical risk function employing quadratic loss exhibits strong convexity and smoothness uniformly in a local neighborhood of the ground truth, for a class of smooth activation functions satisfying certain properties, including sigmoid and tanh, as soon as the sample complexity is sufficiently large. This implies that if initialized in this neighborhood, gradient descent converges linearly to a critical point that is provably close to the ground truth without requiring a fresh set of samples at each iteration. This significantly improves upon prior results on learning shallow neural networks with multiple neurons. To the best of our knowledge, this is the first global convergence guarantee for one-hidden-layer neural networks using gradient descent over the empirical risk function without resampling at the near-optimal sampling and computational complexity.
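Below is a minimal, self-contained sketch (not the authors' code) of the setting described in the abstract: binary labels drawn from a multi-neuron logistic model with Gaussian inputs, and plain gradient descent on the quadratic empirical risk, initialized near the ground truth. The dimensions, sample size, step size, and iteration count are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)
d, K, n = 10, 3, 5000          # input dimension, hidden neurons, samples (illustrative)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Ground-truth weights and training data: x_i ~ N(0, I_d),
# y_i ~ Bernoulli( (1/K) * sum_k sigmoid(w_k^T x_i) ).
W_star = rng.normal(size=(K, d))
X = rng.normal(size=(n, d))
p_star = sigmoid(X @ W_star.T).mean(axis=1)
y = rng.binomial(1, p_star).astype(float)

def risk_and_grad(W):
    """Quadratic empirical risk f(W) = (1/2n) sum_i (net(x_i; W) - y_i)^2 and its gradient."""
    A = sigmoid(X @ W.T)                 # n x K hidden activations
    resid = A.mean(axis=1) - y           # residuals, length n
    risk = 0.5 * np.mean(resid ** 2)
    # d f / d w_k = (1/(nK)) * sum_i resid_i * sigma'(w_k^T x_i) * x_i, with sigma' = A*(1-A)
    grad = (resid[:, None] * (A * (1.0 - A))).T @ X / (n * K)
    return risk, grad

# Gradient descent from a point assumed to lie in the local neighborhood of W_star.
W = W_star + 0.1 * rng.normal(size=(K, d))
eta = 1.0                                # step size (illustrative)
for _ in range(500):
    risk, grad = risk_and_grad(W)
    W -= eta * grad

print(f"final risk {risk:.4f}, distance to ground truth {np.linalg.norm(W - W_star):.4f}")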
Year
2018
Venue
arXiv: Machine Learning
Field
Convergence (routing), Gradient descent, Convexity, Ground truth, Geometry, Artificial neural network, Resampling, Mathematics, Sigmoid function, Computational complexity theory
DocType
Volume
abs/1802.06463
Citations
5
Journal
PageRank
0.41
References
21
Authors
3
Name            Order   Citations   PageRank
Haoyu Fu        1       9           1.81
Yuejie Chi      2       720         56.67
Yingbin Liang   3       1646        147.64