Abstract | ||
---|---|---|
We introduce a novel top-down end-to-end formulation of document-level discourse parsing in the Rhetorical Structure Theory (RST) framework. In this formulation, we consider discourse parsing as a sequence of splitting decisions at token boundaries and use a seq2seq network to model the splitting decisions. Our framework facilitates discourse parsing from scratch without requiring discourse segmentation as a prerequisite; rather, it yields segmentation as part of the parsing process. Our unified parsing model adopts a beam search to decode the best tree structure by searching through a space of high-scoring trees. With extensive experiments on the standard English RST discourse treebank, we demonstrate that our parser outperforms existing methods by a good margin in both end-to-end parsing and parsing with gold segmentation. More importantly, it does so without using any handcrafted features, making it faster and easily adaptable to new languages and domains. |
Year | Venue | DocType |
---|---|---|
2021 | NAACL-HLT | Conference |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Thanh-Tung Nguyen | 1 | 0 | 0.34 |
Phi Xuan Nguyen | 2 | 0 | 2.37 |
Shafiq R. Joty | 3 | 560 | 56.72 |
Minh Nhut Nguyen | 4 | 1837 | 112.04 |