Abstract
---
Transformers have achieved new heights modeling natural language as a sequence of text tokens. However, in many real-world scenarios, textual data inherently exhibits structures beyond a linear sequence, such as trees and graphs, and many tasks require reasoning over evidence scattered across multiple pieces of text. This paper presents Transformer-XH, which uses eXtra Hop attention to enable intrinsic modeling of structured texts in a fully data-driven way. Its new attention mechanism naturally “hops” across connected text sequences in addition to attending over the tokens within each sequence. Transformer-XH thus better conducts joint multi-evidence reasoning by propagating information between documents and constructing globally contextualized representations. On multi-hop question answering, Transformer-XH yields a simpler multi-hop QA system that outperforms the previous state of the art on the HotpotQA FullWiki setting. On FEVER fact verification, applying Transformer-XH provides state-of-the-art accuracy and excels on claims whose verification requires multiple pieces of evidence.
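The abstract describes the extra-hop mechanism only at a high level. Below is a minimal PyTorch sketch of the idea, not the authors' implementation: it assumes each linked text sequence (a graph node) is summarized by a single anchor-token vector (e.g., its [CLS] state from a standard Transformer layer), and that `adj` is an adjacency matrix over those nodes. All class, variable, and parameter names here are illustrative.

```python
# Hedged sketch of "extra hop" attention: each node's anchor-token vector
# attends over the anchor vectors of its linked nodes, so information can
# propagate across connected text sequences. Illustrative only.
import torch


class ExtraHopAttention(torch.nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.q = torch.nn.Linear(hidden_size, hidden_size)
        self.k = torch.nn.Linear(hidden_size, hidden_size)
        self.v = torch.nn.Linear(hidden_size, hidden_size)
        self.scale = hidden_size ** -0.5

    def forward(self, node_repr: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # node_repr: [num_nodes, hidden], one anchor-token vector per text sequence
        # adj: [num_nodes, num_nodes], nonzero where two sequences are linked
        scores = self.q(node_repr) @ self.k(node_repr).t() * self.scale
        scores = scores.masked_fill(adj == 0, float("-inf"))   # attend only along graph edges
        weights = torch.softmax(scores, dim=-1)                # hop-attention weights
        return weights @ self.v(node_repr)                     # information gathered from linked nodes


# Toy usage: three connected text sequences with 8-dimensional anchor vectors.
hop = ExtraHopAttention(hidden_size=8)
node_repr = torch.randn(3, 8)
adj = torch.tensor([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]], dtype=torch.float)  # self-loops keep every row attendable
hopped = hop(node_repr, adj)  # [3, 8]; in the full model this would be mixed back into each sequence's token states
```

In the paper's full architecture, this hop step sits inside every Transformer layer alongside the usual in-sequence attention; the sketch isolates only the cross-sequence part.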
Year | Venue | Keywords |
---|---|---|
2020 | ICLR | Transformer-XH, multi-hop QA, fact verification, extra hop attention, structured modeling |
DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34
References | Authors
---|---
0 | 6
Name | Order | Citations | PageRank |
---|---|---|---|
Chen Zhao | 1 | 14 | 4.36 |
Chen-Yan Xiong | 2 | 405 | 30.82 |
Corby Rosset | 3 | 1 | 1.03 |
Xia Song | 4 | 30 | 3.19 |
Paul N. Bennett | 5 | 1500 | 87.93 |
Saurabh Tiwary | 6 | 29 | 3.86