Title
Annealed gradient descent for deep learning
Abstract
In this paper, we propose a novel annealed gradient descent (AGD) algorithm for deep learning. AGD optimizes a sequence of gradually improving, smoother mosaic functions that approximate the original non-convex objective function according to an annealing schedule during the optimization process. We present a theoretical analysis of AGD's convergence properties and learning speed, and use visualization methods to illustrate its advantages. The proposed AGD algorithm is applied to learn both deep neural networks (DNNs) and convolutional neural networks (CNNs) for a variety of tasks, including image recognition and speech recognition. Experimental results on several widely used databases, such as Switchboard, CIFAR-10 and Pascal VOC 2012, show that AGD yields better classification accuracy than SGD and noticeably accelerates the training of DNNs and CNNs.
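The abstract describes AGD only at a high level, so the following is a minimal, hypothetical sketch of the general annealing-by-smoothing idea it outlines: Gaussian smoothing stands in for the paper's mosaic-function approximation (which is not specified here), and the names (smoothed_grad, annealed_gd), the sigma schedule and the toy objective are illustrative assumptions, not the authors' method or settings.

# Minimal sketch of annealing-by-smoothing gradient descent.
# Assumption: Gaussian smoothing is a generic stand-in for the paper's
# mosaic-function approximation; schedule, learning rate and objective are toy choices.
import numpy as np

def smoothed_grad(grad_f, w, sigma, n_samples=20, rng=None):
    # Monte Carlo estimate of the gradient of E[f(w + u)], u ~ N(0, sigma^2 I),
    # i.e. the gradient of a Gaussian-smoothed surrogate of f.
    rng = rng or np.random.default_rng(0)
    g = np.zeros_like(w)
    for _ in range(n_samples):
        g += grad_f(w + rng.normal(scale=sigma, size=w.shape))
    return g / n_samples

def annealed_gd(grad_f, w0, sigmas=(1.0, 0.3, 0.1, 0.0), steps_per_stage=200, lr=0.01):
    # Optimize a sequence of progressively less-smoothed surrogates:
    # larger sigma -> smoother surrogate; sigma = 0 recovers the original objective.
    w = np.asarray(w0, dtype=float)
    for sigma in sigmas:                                  # annealing schedule
        for _ in range(steps_per_stage):
            g = grad_f(w) if sigma == 0.0 else smoothed_grad(grad_f, w, sigma)
            w = w - lr * g                                # plain gradient step on the surrogate
    return w

# Toy non-convex objective f(w) = w^2 + 2 sin(5 w), which has several local minima.
grad_f = lambda w: 2.0 * w + 10.0 * np.cos(5.0 * w)
w_star = annealed_gd(grad_f, w0=np.array([3.0]))

Early stages take gradient steps on a heavily smoothed surrogate that hides shallow local minima; as sigma is annealed toward zero, optimization transitions to the original non-convex objective.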
Year
2020
DOI
10.1016/j.neucom.2019.11.021
Venue
Neurocomputing
Keywords
Gradient descent,Deep learning,DNNs,CNNs
Field
Convergence (routing),Gradient descent,Pattern recognition,Convolutional neural network,Visualization,Artificial intelligence,Deep learning,Mathematics,Deep neural networks
DocType
Journal
Volume
380
ISSN
0925-2312
Citations
1
PageRank
0.35
References
0
Authors
5
Name
Order
Citations
PageRank
Hengyue Pan183.84
Xin Niu25611.39
Rongchun Li36714.65
Yong Dou463289.67
Hui Jiang51493113.16