PKU Paraphrase Bank: A Sentence-Level Paraphrase Corpus for Chinese

Paper Info

Title
PKU Paraphrase Bank: A Sentence-Level Paraphrase Corpus for Chinese

Abstract
One of the main challenges in paraphrase research is the lack of large-scale, high-quality corpora, a problem that is particularly serious for languages other than English. In this paper, we present a simple and effective unsupervised learning model that automatically extracts high-quality sentence-level paraphrases from multiple Chinese translations of the same source texts. By applying this new model, we obtain a large-scale paraphrase corpus containing 509,832 pairs of paraphrased sentences. The quality of this new corpus is manually examined. Our new model is language-independent, meaning that paraphrase corpora for other languages can be built in the same way.

Year	2019
DOI	10.1007/978-3-030-32233-5_63
Venue	Lecture Notes in Artificial Intelligence
Keywords	Paraphrase, Paraphrase extraction, Sentence embedding, Sentence similarity
DocType	Conference
Volume	11838
ISSN	0302-9743
Citations	0
PageRank	0.34
References	0
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Bowei Zhang	1	2	0.71
Weiwei Sun	2	0	0.34
Xiaojun Wan	3	1685	125.70
Zongming Guo	4	778	81.98
