Abstract
---
Recently it was shown that linguistic structure predicted by a supervised parser can be beneficial for neural machine translation (NMT). In this work, we investigate a more challenging setup: we incorporate sentence structure as a latent variable in a standard NMT encoder-decoder and induce it in such a way as to benefit the translation task. We consider German-English and Japanese-English translation benchmarks and observe that, when using RNN encoders, the model makes no or very limited use of the structure induction apparatus. In contrast, CNN and word-embedding-based encoders rely on latent graphs and force them to encode useful, potentially long-distance, dependencies.
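To make the idea in the abstract concrete, below is a minimal sketch (not the authors' code) of one common way to induce a latent source-side graph inside an encoder: arc scores between word embeddings are relaxed into a softmax-normalized "soft adjacency" matrix, and a single graph-convolution step propagates information along the induced arcs. The class name `LatentGraphEncoder` and all hyperparameters are hypothetical illustrations, not details from the paper.

```python
# A hedged sketch of latent graph induction for an NMT encoder.
# Assumption: the discrete latent graph is relaxed to a per-word softmax
# over candidate heads, so everything trains end-to-end with the
# translation loss. Names and sizes here are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentGraphEncoder(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Projections used to score candidate head-dependent arcs.
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)
        # Graph-convolution weights: one transform for messages arriving
        # along incoming arcs, one for the node itself (self-loop).
        self.w_in = nn.Linear(d_model, d_model)
        self.w_self = nn.Linear(d_model, d_model)

    def forward(self, src_tokens: torch.Tensor):
        # src_tokens: (batch, src_len) integer token ids.
        x = self.embed(src_tokens)  # (B, T, D)
        # Arc scores: entry (i, j) scores position j as the head of i.
        scores = self.query(x) @ self.key(x).transpose(1, 2) / x.size(-1) ** 0.5
        # Relaxed latent graph: each word's head distribution is a
        # softmax over all source positions.
        adj = F.softmax(scores, dim=-1)  # (B, T, T), rows sum to 1
        # One message-passing step along the soft arcs.
        neighbours = adj @ self.w_in(x)          # aggregate from heads
        h = torch.relu(neighbours + self.w_self(x))  # (B, T, D)
        return h, adj  # adj is the induced latent graph

# Usage: encode a toy batch and inspect the induced graph.
enc = LatentGraphEncoder(vocab_size=1000)
tokens = torch.randint(0, 1000, (2, 7))
hidden, latent_graph = enc(tokens)
print(hidden.shape, latent_graph.shape)  # (2, 7, 128) and (2, 7, 7)
```

The softmax relaxation is one design choice among several; it keeps the model differentiable so the graph can be shaped purely by the translation objective, which matches the paper's goal of inducing structure "in such a way as to benefit the translation task". Inspecting `latent_graph` is how one would check whether an encoder actually uses the structure induction apparatus, e.g. whether it places mass on long-distance arcs.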
Year | Venue | DocType |
---|---|---|
2019 | arXiv: Computation and Language | Journal |
Volume | Citations | PageRank
---|---|---
abs/1901.06436 | 1 | 0.35
References | Authors
---|---
0 | 4
Name | Order | Citations | PageRank |
---|---|---|---|
Joost Bastings | 1 | 48 | 2.92 |
Wilker Aziz | 2 | 70 | 10.24 |
Ivan Titov | 3 | 1484 | 81.98 |
Khalil Sima'an | 4 | 443 | 50.32 |