Abstract |
---|
Convolutional neural networks have become the main tools for processing two-dimensional data. They work well for images, yet convolutions have a limited receptive field that prevents their application to more complex 2D tasks. We propose a new neural model, called the Matrix Shuffle-Exchange network, that can efficiently exploit long-range dependencies in 2D data and has speed comparable to a convolutional neural network. It is derived from the Neural Shuffle-Exchange network and has O(log N) layers, O(N^2 log N) total time complexity, and O(N^2) space complexity for processing an N×N data matrix. We show that the Matrix Shuffle-Exchange network is well suited for algorithmic and logical reasoning tasks on matrices and dense graphs, exceeding convolutional and graph neural network baselines. Its distinct advantage is the capability of retaining full long-range dependency modelling when generalizing to larger instances - much larger than could be processed with models equipped with a dense attention mechanism. |
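As background for the O(log N) layer count stated in the abstract: in a classic shuffle-exchange (Omega) network, log2(N) stages of a perfect-shuffle permutation followed by pairwise switch units are enough for every output position to depend on every input position. The sketch below only illustrates that connectivity argument on a 1D sequence of length N; it is not the paper's model code, and the helper names (`perfect_shuffle`, `reachable_after_layers`) are hypothetical.

```python
import math

def perfect_shuffle(items):
    """Perfect (riffle) shuffle of a length-N list, N a power of two:
    interleave the first half with the second half."""
    half = len(items) // 2
    out = [None] * len(items)
    out[0::2] = items[:half]
    out[1::2] = items[half:]
    return out

def reachable_after_layers(n, layers):
    """Track which input positions each output position depends on when
    every layer applies a perfect shuffle followed by pairwise switch
    units that mix each adjacent (even, odd) pair."""
    reach = [{i} for i in range(n)]          # position i initially sees only input i
    for _ in range(layers):
        reach = perfect_shuffle(reach)
        for i in range(0, n, 2):             # each switch unit merges its pair
            merged = reach[i] | reach[i + 1]
            reach[i] = merged
            reach[i + 1] = merged
    return reach

n = 16
reach = reachable_after_layers(n, int(math.log2(n)))
# After log2(N) shuffle-exchange layers, every output depends on all N inputs.
print(all(len(r) == n for r in reach))       # True
```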
Year | DOI | Venue
---|---|---
2021 | 10.1109/IJCNN52387.2021.9533919 | 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
DocType | ISSN | Citations
---|---|---
Conference | 2161-4393 | 0
PageRank | References | Authors
---|---|---
0.34 | 0 | 3
Name | Order | Citations | PageRank
---|---|---|---
Emils Ozolins | 1 | 0 | 0.34
Karlis Freivalds | 2 | 22 | 4.44
Agris Sostaks | 3 | 23 | 9.96