Abstract | ||
---|---|---|
Previous work on paraphrase extraction using parallel or comparable corpora has generally not considered the documents' discourse structure as a useful information source. We propose a novel method for collecting paraphrases relying on the sequential event order in the discourse, using multiple sequence alignment with a semantic similarity measure. We show that adding discourse information boosts the performance of sentence-level paraphrase acquisition, which consequently gives a tremendous advantage for extracting phrase-level paraphrase fragments from matched sentences. Our system beats an informed baseline by a margin of 50%. |
Year | Venue | Keywords |
---|---|---|
2012 | EMNLP-CoNLL | useful information source, discourse information, comparable corpus, paraphrase extraction, novel method, multiple sequence alignment, informed baseline, phrase-level paraphrase fragment, discourse structure, sentence-level paraphrase acquisition |
Field | DocType | Volume |
---|---|---|
Semantic similarity, Computer science, Paraphrase, Artificial intelligence, Natural language processing, Multiple sequence alignment, Discourse structure | Conference | D12-1 |
Citations | PageRank | References |
---|---|---|
9 | 0.48 | 33 |
Authors | ||
---|---|---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Michaela Regneri | 1 | 143 | 7.44 |
Rui Wang | 2 | 20 | 2.47 |