Abstract |
---|
We present PEM, the first fully automatic metric to evaluate the quality of paraphrases, and consequently, that of paraphrase generation systems. Our metric is based on three criteria: adequacy, fluency, and lexical dissimilarity. The key component in our metric is a robust and shallow semantic similarity measure based on pivot language N-grams that allows us to approximate adequacy independently of lexical similarity. Human evaluation shows that PEM achieves high correlation with human judgments. |
Year | Venue | Keywords |
---|---|---|
2010 | EMNLP | parallel text, pivot language n-grams, paraphrase evaluation, lexical similarity, lexical dissimilarity, approximate adequacy, human evaluation, high correlation, paraphrase generation system, key component, shallow semantic similarity measure, human judgment |
Field | DocType | Volume |
---|---|---|
Semantic similarity, Lexical similarity, Pivot language, Computer science, Fluency, Paraphrase, Artificial intelligence, Natural language processing, Machine learning | Conference | D10-1 |
Citations | PageRank | References |
---|---|---|
18 | 0.82 | 34 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chang Liu | 1 | 87 | 6.78 |
Daniel Dahlmeier | 2 | 460 | 29.67 |
Hwee Tou Ng | 3 | 4092 | 300.40 |