Title
Adaptive Variance Reducing for Stochastic Gradient Descent.
Abstract
Variance Reducing (VR) stochastic methods are fast-converging alternatives to the classical Stochastic Gradient Descent (SGD) for solving large-scale regularized finite-sum problems, especially when a highly accurate solution is required. One critical step in VR is function sampling. State-of-the-art VR algorithms such as SVRG and SAGA employ either Uniform Probability (UP) or Importance Probability (IP), both of which are deficient in reducing the variance and hence lead to suboptimal convergence rates. In this paper, we propose a novel sampling scheme that explicitly computes an Adaptive Probability (AP) at each iteration. Analysis shows that, equipped with AP, both SVRG and SAGA yield provably better convergence rates than their UP and IP counterparts, which is confirmed in experiments. Additionally, to reduce the per-iteration computational load, an efficient variant that employs AP only periodically is proposed, and its performance is empirically validated.
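The abstract does not state the paper's exact AP formula. The minimal Python sketch below only illustrates the general idea of non-uniform sampling inside an SVRG-style loop, assuming (as a stand-in for AP) probabilities proportional to per-sample gradient norms recomputed at each snapshot; the names svrg_adaptive_sampling and grad_i are hypothetical.

```python
import numpy as np

def svrg_adaptive_sampling(grad_i, w0, n, step=0.1, n_outer=20, n_inner=None, seed=0):
    """SVRG-style loop with non-uniform sampling of component functions.

    grad_i(w, i) must return the gradient of the i-th component at w.
    The sampling probabilities p are recomputed at every snapshot from
    per-sample gradient norms (an assumed stand-in for the adaptive rule).
    """
    rng = np.random.default_rng(seed)
    n_inner = n_inner if n_inner is not None else 2 * n
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(n_outer):
        w_snap = w.copy()
        snap_grads = np.array([grad_i(w_snap, i) for i in range(n)])
        full_grad = snap_grads.mean(axis=0)
        # Non-uniform sampling distribution, proportional to gradient norms.
        norms = np.linalg.norm(snap_grads, axis=1) + 1e-12
        p = norms / norms.sum()
        for _ in range(n_inner):
            i = rng.choice(n, p=p)
            # Reweight by 1/(n * p[i]) so the stochastic gradient stays unbiased.
            v = (grad_i(w, i) - snap_grads[i]) / (n * p[i]) + full_grad
            w -= step * v
    return w

# Tiny usage example: least-squares on random data.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, b = rng.normal(size=(100, 5)), rng.normal(size=100)
    grad = lambda w, i: (A[i] @ w - b[i]) * A[i]
    w_hat = svrg_adaptive_sampling(grad, np.zeros(5), n=100, step=0.01)
```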
Year
2016
Venue
IJCAI
Field
Mathematical optimization, Stochastic gradient descent, Computer science, Sampling (statistics), Rate of convergence, Computation, Sampling scheme
DocType
Conference
Citations
1
PageRank
0.36
References
10
Authors
4
Name          Order  Citations  PageRank
Zebang Shen   1      17         9.36
Hui Qian      2      59         13.26
Tengfei Zhou  3      22         5.08
Tongzhou Mu   4      1          1.71