Title |
---|
Empirical Study of PROXTONE and PROXTONE+ for Fast Learning of Large Scale Sparse Models |
Abstract |
---|
PROXTONE is a novel and fast method for the optimization of large scale non-smooth convex problems [1]. In this work, we apply the PROXTONE method to large scale non-smooth non-convex problems, for example the training of sparse deep neural networks (sparse DNNs) or sparse convolutional neural networks (sparse CNNs) for embedded or mobile devices. PROXTONE converges much faster than first order methods, while first order methods are easy to derive and make it easy to control the sparsity of the solutions. Thus, in order to train sparse models fast, we propose to combine the merits of both approaches: we use PROXTONE in the first several epochs to reach the neighborhood of an optimal solution, and then switch to a first order method to exploit sparsity in the remaining training. We call this method PROXTONE plus (PROXTONE+). Both PROXTONE and PROXTONE+ are evaluated in our experiments, which demonstrate that both methods at least double the convergence speed on diverse sparse model learning problems, while reducing DNN model size to 0.5% of the original. The source code of all the algorithms is available upon request. |
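The two-phase schedule described in the abstract can be sketched on a toy l1-regularized least-squares problem. This is a minimal illustration, not the paper's implementation: the first phase uses the exact Hessian as a stand-in for PROXTONE's incremental second-order approximations, the second phase is plain proximal gradient descent as the "first order method" that drives small weights to exact zeros, and all problem sizes and constants are made up for the demo.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of t * ||.||_1: shrinks entries toward zero and
    # sets entries with |x_i| <= t to exactly zero (this creates sparsity).
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# Toy problem: min_w 0.5 * ||A w - b||^2 + lam * ||w||_1,
# with a sparse ground truth so the l1 solution is sparse.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
w_true = np.zeros(10)
w_true[:3] = [2.0, -1.5, 1.0]
b = A @ w_true
lam = 0.5

w = np.zeros(10)
L = np.linalg.norm(A.T @ A, 2)  # Lipschitz constant of the smooth part

# Phase 1 (stand-in for PROXTONE): a few proximal-Newton-style steps using
# the exact Hessian A^T A -- a hypothetical simplification; the real method
# builds separable second-order approximations incrementally.
H = A.T @ A + 1e-6 * np.eye(10)
for _ in range(5):
    grad = A.T @ (A @ w - b)
    w = soft_threshold(w - np.linalg.solve(H, grad), lam / L)

# Phase 2 (first order): proximal gradient steps; the soft-threshold keeps
# pruning small coordinates, so the final model is genuinely sparse.
for _ in range(200):
    grad = A.T @ (A @ w - b)
    w = soft_threshold(w - grad / L, lam / L)

print("nonzero weights:", int(np.count_nonzero(w)))
```

The design point the abstract makes is visible here: the second-order phase reaches the neighborhood of the optimum in a handful of steps, while the cheap first-order phase is what actually zeroes out coordinates via the proximal (soft-threshold) operator.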
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/ICSP.2016.7877991 | Proceedings of 2016 IEEE 13th International Conference on Signal Processing (ICSP 2016)
Field | DocType | Volume
---|---|---|
Convergence (routing), Computer science, First order, Convolutional neural network, Sparse approximation, Theoretical computer science, Mobile device, Artificial intelligence, Artificial neural network, Convex optimization, Empirical research, Machine learning | Journal | abs/1604.05024
ISSN | Citations | PageRank
---|---|---|
2164-5221 | 0 | 0.34
References | Authors
---|---|
11 | 2
Name | Order | Citations | PageRank |
---|---|---|---|
Ziqiang Shi | 1 | 2 | 2.45 |
Rujie Liu | 2 | 147 | 15.49 |