Title
Stochastic gradient descent with variance reduction technique.
Abstract
Gradient descent is prevalent in large-scale optimization problems in machine learning; in particular, it plays a major role in computing and updating the connection weights of neural networks in deep learning. However, choosing a proper learning rate for SGD can be difficult: a rate that is too small leads to painfully slow convergence, while one that is too large can hinder convergence. In this paper, we present a novel variance reduction technique, termed SMVRG, which applies a moving average of gradients. Thanks to variance reduction, SMVRG can take a large learning rate, and it only needs to store the current gradient and the previous average gradient. We apply our method to Long Short-Term Memory (LSTM) networks. Experiments on two data sets, IMDB (movie reviews) and SemEval-2016 (sentiment analysis in Twitter), show that our method improves the results significantly.
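The abstract does not give the exact SMVRG update rule, so the following is only a minimal sketch of the general idea it describes: an SGD step that keeps the current gradient and a moving average of past gradients as a lower-variance update direction, which permits a larger learning rate. The function name smvrg_like_update, the decay factor beta, and the toy noisy gradient are illustrative assumptions, not the authors' algorithm.

import numpy as np

def smvrg_like_update(w, grad_fn, lr=0.5, beta=0.9, steps=100):
    """Illustrative SGD loop that stores only the current gradient and a
    moving average of previous gradients (a hypothetical stand-in for SMVRG)."""
    avg_grad = np.zeros_like(w)          # previous average gradient: the only extra state kept
    for _ in range(steps):
        g = grad_fn(w)                                   # current stochastic gradient
        avg_grad = beta * avg_grad + (1.0 - beta) * g    # moving average of gradients
        w = w - lr * avg_grad                            # lower-variance direction allows a larger lr
    return w

# Toy usage: noisy gradient of f(w) = 0.5 * ||w||^2 (minimum at w = 0)
rng = np.random.default_rng(0)
noisy_grad = lambda w: w + 0.1 * rng.standard_normal(w.shape)
print(smvrg_like_update(np.ones(3), noisy_grad))

Averaging the stochastic gradients damps their noise, which is why a larger step size remains stable in this sketch than with plain SGD on the raw gradient.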
Year
2018
DOI
10.3233/WEB-180386
Venue
WEB INTELLIGENCE
Keywords
Learning rate, neural network, moving average, LSTM
Field
Data mining, Stochastic gradient descent, Computer science, Algorithm, Variance reduction
DocType
Journal
Volume
16
Issue
3
ISSN
2405-6456
Citations
0
PageRank
0.34
References
6
Authors
4
Name, Order, Citations, PageRank
Jinjing Zhang, 1, 0, 2.03
Fei Hu, 2, 3, 3.78
Xiaofei Xu, 3, 408, 70.26
Li Li, 4, 6, 8.90