Title | ||
---|---|---|
Improved Learning in Convolutional Neural Networks with Shifted Exponential Linear Units (ShELUs) |
Abstract | ||
---|---|---|
The Exponential Linear Unit (ELU) has been proven to speed up learning and improve the classification performance over activation functions such as ReLU and Leaky ReLU for convolutional neural networks. The reasons behind the improved behavior are that ELU reduces the bias shift, saturates for large negative inputs, and is continuously differentiable. However, it remains open whether ELU has the optimal shape, and we address the quest for a superior activation function. To investigate this question, we use a new formulation to tune a piecewise linear activation function during training and learn the shape of the locally optimal activation function. With this tuned activation function, the classification performance is improved, and the resulting learned activation function turns out to be ELU-shaped irrespective of whether it is initialized as a ReLU, LReLU or ELU. Interestingly, the learned activation function does not pass exactly through the origin, indicating that a shifted ELU-shaped activation function is preferable. This observation leads us to introduce the Shifted Exponential Linear Unit (ShELU) as a new activation function. Experiments on CIFAR-100 show that the classification performance is further improved when using the ShELU activation function in comparison with ELU. The improvement is achieved when learning an individual bias shift for each neuron. |
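
To make the idea of a shifted ELU concrete, the following is a minimal NumPy sketch. The abstract does not state the exact ShELU formula, so the form used here, an ELU applied to the input plus a per-neuron offset `delta`, is an assumption for illustration only; the names `elu`, `shifted_elu`, `delta`, and `alpha` are hypothetical and not taken from the paper.

```python
import numpy as np

def elu(x, alpha=1.0):
    """Standard ELU: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    # Clamp to <= 0 before expm1 so the unused positive branch cannot overflow.
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

def shifted_elu(x, delta, alpha=1.0):
    """ELU with a per-neuron shift (assumed form, not the paper's definition).

    x has shape (batch, neurons); delta has shape (neurons,) and is
    broadcast over the batch. In training, delta would be a learned parameter.
    """
    return elu(np.asarray(x, dtype=float) + delta, alpha)

# One sample, three neurons, each with its own shift.
x = np.array([[-1.0, 0.0, 2.0]])
delta = np.array([0.5, -0.5, 0.0])
y = shifted_elu(x, delta)
```

With `delta` set to zero the function reduces to the ordinary ELU, while a nonzero `delta` means the curve no longer passes through the origin, which is the property the abstract highlights.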
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/ICPR.2018.8545104 | 2018 24th International Conference on Pattern Recognition (ICPR) |
Keywords | Field | DocType |
locally optimal activation function,tuned activation function,classification performance,learned activation function,shifted ELU-shaped activation function,ShELU activation function,convolutional neural networks,Leaky ReLU,improved behavior,superior activation function,piecewise linear activation function,exponential linear units,improved learning,ShELUs | Histogram,Pattern recognition,Convolutional neural network,Activation function,Computer science,Algorithm,Artificial intelligence,Smoothness,Piecewise linear function,Exponential linear units,Speedup | Conference |
ISSN | ISBN | Citations |
1051-4651 | 978-1-5386-3789-0 | 0 |
PageRank | References | Authors |
0.34 | 1 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Bertil Grelsson | 1 | 3 | 1.90 |
Michael Felsberg | 2 | 2419 | 130.29 |