Abstract
Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing Efficient Unitary Neural Networks (EUNNs); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely $\mathcal{O}(1)$ per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark, as well as the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of final performance and/or wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.
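For intuition, below is a minimal NumPy sketch of one common way to build a tunable unitary matrix: as a product of layers of 2×2 Givens rotations plus a diagonal phase matrix. This is consistent with the abstract's description but is not the paper's exact construction; `givens`, `tunable_unitary`, and the even/odd pairing scheme are illustrative assumptions. Each rotation touches only one coordinate pair, which is what makes an $\mathcal{O}(1)$-per-parameter cost plausible, and the number of layers tunes how much of the full unitary group the parameterization can reach.

```python
# Sketch (assumed, not the paper's exact parameterization): a tunable
# unitary built from layers of 2x2 Givens rotations and diagonal phases.
import numpy as np

def givens(n, i, j, theta, phi):
    """Return an n x n unitary acting as a rotation on coordinates (i, j)."""
    g = np.eye(n, dtype=complex)
    g[i, i] = np.exp(1j * phi) * np.cos(theta)
    g[i, j] = -np.exp(1j * phi) * np.sin(theta)
    g[j, i] = np.sin(theta)
    g[j, j] = np.cos(theta)
    return g

def tunable_unitary(n, num_layers, rng):
    """Product of rotation layers applied to a diagonal phase matrix.

    More layers span more of U(n); fewer layers give a tunable subspace.
    """
    u = np.diag(np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, n)))
    for layer in range(num_layers):
        # Alternate between even and odd coordinate pairings per layer.
        start = layer % 2
        for i in range(start, n - 1, 2):
            theta, phi = rng.uniform(0.0, 2.0 * np.pi, 2)
            u = givens(n, i, i + 1, theta, phi) @ u
    return u

rng = np.random.default_rng(0)
U = tunable_unitary(8, num_layers=8, rng=rng)
print(np.allclose(U.conj().T @ U, np.eye(8)))  # True: U is unitary
```

Because each parameter enters through a single sparse rotation, applying or differentiating through one parameter costs a constant amount of work, matching the tunable-capacity and per-parameter-complexity claims above.
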
Year | Venue | DocType |
---|---|---|
2017 | ICML | Conference |

Volume | Citations | PageRank
---|---|---|
abs/1612.05231 | 1 | 0.34

References | Authors
---|---|
0 | 7

Name | Order | Citations | PageRank |
---|---|---|---|
Li Jing | 1 | 16 | 5.47 |
Yichen Shen | 2 | 16 | 3.09 |
Tena Dubcek | 3 | 1 | 0.34 |
John Peurifoy | 4 | 13 | 0.94 |
Scott A. Skirlo | 5 | 1 | 1.02 |
Max Tegmark | 6 | 13 | 0.94 |
Marin Soljacic | 7 | 19 | 5.16 |