Title
An adaptive mechanism to achieve learning rate dynamically
Abstract
Gradient descent is prevalent for large-scale optimization problems in machine learning; in particular, it now plays a major role in computing and updating the connection strengths of neural networks in deep learning. However, many gradient-based optimization methods contain sensitive hyper-parameters that require extensive configuration. In this paper, we present a novel adaptive mechanism called adaptive exponential decay rate (AEDR). AEDR uses an adaptive exponential decay rate rather than a fixed, preconfigured one, allowing us to eliminate one otherwise tuning-sensitive hyper-parameter. The exponential decay rate is calculated adaptively from the moving averages of both the gradients and the squared gradients over time. The mechanism is then applied to Adadelta and Adam, reducing the number of hyper-parameters in each to a single one to be tuned. We use long short-term memory networks and LeNet to demonstrate how the learning rate adapts dynamically. We show promising results compared with other state-of-the-art methods on four data sets: IMDB (movie reviews), SemEval-2016 (sentiment analysis in Twitter), CIFAR-10, and Pascal VOC-2012.
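The abstract describes the mechanism only at a high level, so the Python/NumPy sketch below is a minimal illustration rather than the paper's algorithm: an Adam/Adadelta-style update in which the exponential decay rate is recomputed at every step from the moving averages of the gradients and the squared gradients instead of being a fixed hyper-parameter. The specific formula used for the adaptive rate, the function name adaptive_decay_optimizer, and all parameter values are assumptions made for illustration; the exact AEDR update rule is not given in this record.

import numpy as np

def adaptive_decay_optimizer(grad_fn, theta, lr=0.001, eps=1e-8, steps=100):
    # Adam/Adadelta-style update in which the exponential decay rate 'beta'
    # is recomputed every step from the moving averages of the gradients and
    # of the squared gradients, instead of being a fixed hyper-parameter.
    # NOTE: the formula for 'beta' below is a hypothetical placeholder for
    # illustration only; it is not the AEDR rule from the paper.
    # Adam's bias correction is omitted for brevity.
    m = np.zeros_like(theta)   # moving average of gradients
    v = np.zeros_like(theta)   # moving average of squared gradients
    for _ in range(steps):
        g = grad_fn(theta)
        # Placeholder adaptive rate: near 1 when recent gradients agree in
        # direction (m^2 close to v), near 0 when they oscillate.
        beta = float(np.clip(np.mean(m ** 2) / (np.mean(v) + eps), 0.0, 0.999))
        m = beta * m + (1.0 - beta) * g
        v = beta * v + (1.0 - beta) * g ** 2
        theta = theta - lr * m / (np.sqrt(v) + eps)
    return theta

# Usage: minimize f(x) = ||x||^2, whose gradient is 2x.
x_min = adaptive_decay_optimizer(lambda x: 2.0 * x,
                                 theta=np.array([3.0, -2.0]),
                                 lr=0.05, steps=500)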
Year: 2019
DOI: 10.1007/s00521-018-3495-0
Venue: Neural Computing and Applications
Keywords: Adaptive mechanism, Learning rate, Adaptive exponential decay rates, Gradient
Field: Gradient descent, Square (algebra), Sentiment analysis, Exponential decay, Algorithm, Artificial intelligence, Deep learning, Artificial neural network, Moving average, Optimization problem, Machine learning, Mathematics
DocType: Journal
Volume: 31
Issue: 10
ISSN: 1433-3058
Citations: 0
PageRank: 0.34
References: 23
Authors: 7

Name            Order   Citations   PageRank
Jinjing Zhang   1       0           0.34
Fei Hu          2       3           3.78
Fei Hu          3       3           3.78
Li Li           4       17          7.07
Xiaofei Xu      5       408         70.26
Zhanbo Yang     6       0           1.01
Yanbin Chen     7       0           0.34