Title
Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons
Abstract
Recent studies have shown that a hybrid of self-attention networks (SANs) and recurrent neural networks (RNNs) outperforms both individual architectures, although little is known about why such hybrid models work. Believing that modeling hierarchical structure is an essential strength that RNNs contribute to complement SANs, we propose to further enhance hybrid models with an advanced RNN variant, the Ordered Neurons LSTM (ON-LSTM), which introduces a syntax-oriented inductive bias to perform tree-like composition. Experimental results on the benchmark machine translation task show that the proposed approach outperforms both individual architectures and a standard hybrid model. Further analyses on targeted linguistic evaluation and logical inference tasks demonstrate that the proposed approach indeed benefits from better modeling of hierarchical structure.
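For readers unfamiliar with the ON-LSTM mentioned in the abstract, the sketch below illustrates its key mechanism: "master" gates built from a cumulative softmax (cumax), which impose an ordering on neurons so that high-ranking units update slowly and low-ranking units update frequently, yielding a tree-like inductive bias. This is a simplified PyTorch illustration of the ON-LSTM equations of Shen et al. (2019), not code from this paper; the class name ONLSTMCell, the single-step interface, and the omission of the original chunk-size factor for the master gates are assumptions made for brevity.

# Minimal sketch of an ON-LSTM cell (assumed interface, simplified: no
# chunking of the master-gate dimensions as in the original formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F


def cumax(x, dim=-1):
    """Cumulative softmax: a soft, monotone approximation of a 0/1 gate vector."""
    return torch.cumsum(F.softmax(x, dim=dim), dim=dim)


class ONLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One projection yields the 4 standard LSTM gates plus 2 master gates.
        self.linear = nn.Linear(input_size + hidden_size, 6 * hidden_size)
        self.hidden_size = hidden_size

    def forward(self, x, state):
        h_prev, c_prev = state
        gates = self.linear(torch.cat([x, h_prev], dim=-1))
        i, f, o, g, mf, mi = gates.chunk(6, dim=-1)

        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)

        # Master gates: monotone activations that split neurons into
        # slowly updated (high-ranking) and frequently updated (low-ranking) spans.
        master_f = cumax(mf)            # roughly [0, ..., 0, 1, ..., 1]
        master_i = 1.0 - cumax(mi)      # roughly [1, ..., 1, 0, ..., 0]
        overlap = master_f * master_i

        f_hat = f * overlap + (master_f - overlap)
        i_hat = i * overlap + (master_i - overlap)

        c = f_hat * c_prev + i_hat * g
        h = o * torch.tanh(c)
        return h, c


# Tiny usage example on random inputs.
if __name__ == "__main__":
    cell = ONLSTMCell(input_size=8, hidden_size=16)
    x = torch.randn(4, 8)               # batch of 4 input vectors
    h = c = torch.zeros(4, 16)
    h, c = cell(x, (h, c))
    print(h.shape, c.shape)             # torch.Size([4, 16]) twice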
Year
2019
DOI
10.18653/v1/D19-1135
Venue
EMNLP/IJCNLP (1)
DocType
Conference
Volume
D19-1
Citations
0
PageRank
0.34
References
0
Authors
5
Name            Order   Citations / PageRank
Jie Hao         1       17538.33
Xing Wang       2       5810.07
Shuming Shi     3       62058.27
Jinfeng Zhang   4       21.05
Zhaopeng Tu     5       51839.95