Abstract |
---|
Advanced neural network models are generally built from multiple layers to model complex functions and capture complicated linguistic structures at different levels [1]. However, only the top layers of deep networks are leveraged in subsequent processing, which misses the opportunity to exploit the useful information embedded in the other layers. In this work, we propose to expose all of these embedded signals with two types of mechanisms, namely deep connections and iterative routing. While deep connections allow better information and gradient flow across layers, iterative routing directly combines the layer representations into a final output via an iterative routing-by-agreement mechanism. Experimental results on both machine translation and language representation tasks demonstrate the effectiveness and universality of the proposed approaches, indicating the necessity of exploiting deep representations for natural language processing tasks. While each strategy boosts performance on its own, combining them yields further improvements. |
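The layer-aggregation idea the abstract describes can be illustrated with a capsule-style routing-by-agreement loop: each layer's representation votes for an aggregate vector, and layers whose outputs agree with the aggregate receive higher coupling weights on the next iteration. The sketch below is a minimal illustration under that assumption; the function names, shapes, and the `squash` nonlinearity follow the standard dynamic-routing formulation, not the authors' exact implementation.

```python
import numpy as np

def squash(s, eps=1e-8):
    # Capsule-style nonlinearity: preserves direction, maps the norm into [0, 1).
    norm2 = np.sum(s * s)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def route_layers(layer_outputs, n_iters=3):
    """Combine per-layer representations via iterative routing-by-agreement.

    layer_outputs: array of shape (L, d), one d-dim vector per layer.
    Returns a single d-dim aggregated representation.
    """
    L = layer_outputs.shape[0]
    logits = np.zeros(L)  # routing logits b_i; zeros give uniform coupling at start
    for _ in range(n_iters):
        coupling = np.exp(logits) / np.exp(logits).sum()  # softmax over layers
        s = coupling @ layer_outputs   # coupling-weighted sum of layer vectors
        v = squash(s)                  # candidate aggregate representation
        logits = logits + layer_outputs @ v  # agreement update: b_i += u_i . v
    return v

# Toy usage: aggregate the states of 4 layers, each of dimension 8.
rng = np.random.default_rng(0)
layers = rng.normal(size=(4, 8))
v = route_layers(layers)
print(v.shape)  # (8,)
```

Layers whose representations align with the emerging consensus vector `v` gain routing weight across iterations, so the final output emphasizes the most mutually consistent layers rather than only the top one.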
Year | DOI | Venue
---|---|---
2020 | 10.1016/j.neucom.2019.12.060 | Neurocomputing

Keywords | DocType | Volume
---|---|---
Natural language processing, Deep neural networks, Deep representations, Layer aggregation, Routing-by-agreement | Journal | 386

ISSN | Citations | PageRank
---|---|---
0925-2312 | 0 | 0.34

References | Authors
---|---
29 | 4
Name | Order | Citations | PageRank |
---|---|---|---
Zi-Yi Dou | 1 | 20 | 7.01 |
Xing Wang | 2 | 58 | 10.07 |
Shuming Shi | 3 | 620 | 58.27 |
Zhaopeng Tu | 4 | 518 | 39.95 |