Title |
---|
Machine Learning Models for Paraphrase Identification and its Applications on Plagiarism Detection |
Abstract |
---|
Paraphrase Identification, or Natural Language Sentence Matching (NLSM), is an important and challenging task in Natural Language Processing: given a pair of sentences, decide whether one is a paraphrase of the other. A paraphrase conveys the same meaning as the original sentence, but its structure and word order differ. The task is challenging because a sentence is short, which makes it difficult to infer its context, and because designing similarity metrics for the inferred context of a sentence pair is not straightforward either. Its applications, however, are numerous. This work explores various machine learning algorithms for modeling the task and applies different input encoding schemes. Specifically, we built models using Logistic Regression, Support Vector Machines, and different Neural Network architectures. Among the compared models, as expected, the Recurrent Neural Network (RNN) is best suited to our paraphrase identification task. We also propose Plagiarism Detection as one of the areas where Paraphrase Identification can be applied effectively. |
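
To illustrate why similarity metrics for a sentence pair are hard to design, the following minimal Python sketch computes two simple lexical overlap features. It is not the method used in the paper; the function names `tokenize`, `jaccard`, `cosine`, and `pair_features` are hypothetical and chosen for this example only.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def jaccard(a, b):
    """Fraction of distinct words that the two sentences share."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine similarity between the bag-of-words count vectors."""
    count_a, count_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(count_a[w] * count_b[w] for w in count_a)
    norm_a = math.sqrt(sum(v * v for v in count_a.values()))
    norm_b = math.sqrt(sum(v * v for v in count_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_features(a, b):
    """Hypothetical feature vector for one sentence pair."""
    return [jaccard(a, b), cosine(a, b)]


if __name__ == "__main__":
    print(pair_features("The cat sat", "The cat slept"))
    print(pair_features("The dog bit the man", "The man bit the dog"))
```

Features like these could serve as input to a classifier such as logistic regression, but they ignore word order: "The dog bit the man" and "The man bit the dog" receive a Jaccard score of 1.0 even though they are not paraphrases. This is one reason order-aware models such as RNNs are considered for the task.
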
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICBK.2019.00021 | 2019 IEEE International Conference on Big Knowledge (ICBK) |
Keywords | Field | DocType |
---|---|---|
Paraphrase Identification, Machine learning, Long Short Term Memory Networks, NLP | Plagiarism detection, Computer science, Support vector machine, Recurrent neural network, Paraphrase, Artificial intelligence, Natural language sentence, Artificial neural network, Sentence, Machine learning, Encoding (memory) | Conference |
ISBN | Citations | PageRank |
---|---|---|
978-1-7281-4608-9 | 1 | 0.38 |
References | Authors |
---|---|
0 | 13 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ethan Hunt | 1 | 1 | 0.38 |
Binay Dahal | 2 | 1 | 0.38 |
Justin Zhan | 3 | 1 | 0.72 |
Laxmi Gewali | 4 | 72 | 16.39 |
Paul Y. Oh | 5 | 289 | 51.08 |
Ritvik Janamsetty | 6 | 1 | 0.38 |
Chanana Kinares | 7 | 1 | 0.38 |
Chanel Koh | 8 | 1 | 0.38 |
Alexis Sanchez | 9 | 1 | 0.38 |
Felix Zhan | 10 | 1 | 0.38 |
Murat Özdemir | 11 | 1 | 0.38 |
Shabnam Waseem | 12 | 1 | 0.38 |
Osman Yolcu | 13 | 1 | 0.38 |