Title
Empirical Study of PROXTONE and PROXTONE+ for Fast Learning of Large Scale Sparse Models
Abstract
PROXTONE is a novel and fast method for the optimization of large scale non-smooth convex problems [1]. In this work, we apply PROXTONE to large scale non-smooth non-convex problems, such as training sparse deep neural networks (sparse DNNs) or sparse convolutional neural networks (sparse CNNs) for embedded or mobile devices. PROXTONE converges much faster than first order methods, while first order methods are easier to derive and make it easier to control the sparseness of the solutions. To train sparse models fast, we therefore propose to combine the merits of both: PROXTONE is used in the first several epochs to reach the neighborhood of an optimal solution, and a first order method is used in the remaining training to explore the sparsity of the solution. We call this method PROXTONE plus (PROXTONE+). Our experiments test both PROXTONE and PROXTONE+, and demonstrate that both methods at least double the convergence speed on diverse sparse model learning problems, while at the same time reducing the size of DNN models to 0.5% of the original. The source code of all the algorithms is available upon request.
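The abstract describes a two-phase schedule: a proximal Newton-type phase for fast initial progress, followed by a proximal first order phase that drives weights to exact zeros. Below is a minimal sketch of that schedule on a toy L1-regularized logistic regression problem; the problem, the damped proximal Newton step, the step sizes, and the epoch split are all illustrative assumptions, not the authors' implementation for sparse DNNs.

```python
# A minimal sketch of the PROXTONE+ two-phase idea (assumptions noted above).
import numpy as np

def soft_threshold(w, t):
    # Proximal operator of t * ||w||_1 (produces exact zeros, i.e. sparsity).
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n, d, lam = 200, 50, 0.05
X = rng.standard_normal((n, d))
w_true = np.zeros(d); w_true[:5] = 1.0
y = (X @ w_true > 0).astype(float)

w = np.zeros(d)
newton_epochs, sgd_epochs, lr = 5, 45, 0.1

# Phase 1: proximal Newton-type steps (a stand-in for PROXTONE) to reach
# the neighborhood of a solution quickly.
for _ in range(newton_epochs):
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / n
    H = (X.T * (p * (1 - p))) @ X / n + 1e-3 * np.eye(d)  # damped Hessian
    step = np.linalg.solve(H, grad)
    w = soft_threshold(w - step, lam)  # unscaled prox, a simplification

# Phase 2: proximal stochastic gradient steps to explore sparsity.
for _ in range(sgd_epochs):
    for i in rng.permutation(n):
        g = (sigmoid(X[i] @ w) - y[i]) * X[i]
        w = soft_threshold(w - lr * g, lr * lam)

print("nonzeros:", np.count_nonzero(w), "of", d)
```

Running the sketch shows the intended behavior of the hybrid schedule: the Newton-type phase brings the loss down in a handful of epochs, and the first order phase zeroes out most coordinates.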
Year
2016
DOI
10.1109/ICSP.2016.7877991
Venue
Proceedings of 2016 IEEE 13th International Conference on Signal Processing (ICSP 2016)
Field
Convergence (routing), Computer science, First order, Convolutional neural network, Sparse approximation, Theoretical computer science, Mobile device, Artificial intelligence, Artificial neural network, Convex optimization, Empirical research, Machine learning
DocType
Journal
Volume
abs/1604.05024
ISSN
2164-5221
Citations
0
PageRank
0.34
References
11
Authors
2

Name         Order  Citations  PageRank
Ziqiang Shi  1      2          2.45
Rujie Liu    2      147        15.49