Title
A Globalization-Semantic Matching Neural Network for Paraphrase Identification.
Abstract
Paraphrase identification (PI) aims at determining whether two natural language sentences have roughly the same meaning. PI has conventionally been formalized as a binary classification task and is widely used in many tasks such as text summarization and plagiarism detection. The emergence of deep neural networks (DNNs) has reshaped the learning paradigm of PI and now dominates it, as DNNs, unlike traditional methods, rely on neither lexical nor syntactic knowledge of a language. State-of-the-art DNN-based approaches to PI mainly adopt multi-layer convolutional neural networks (CNNs) to model paraphrastic sentences, which can discover alignments of phrases with the same length (unigram-to-unigram, bigram-to-bigram, trigram-to-trigram, etc.) at each layer. However, paraphrasing phenomena exist globally at all levels of granularity between a pair of paraphrastic sentences, i.e., word-to-word, word-to-phrase, phrase-to-phrase, and even sentence-to-sentence. In this paper, we contribute a globalization-semantic matching neural network (GSMNN) paradigm, which has been deployed in Baidu.com to tackle practical PI problems. Established on a weight-sharing single-layer CNN, GSMNN is composed of a multi-granular matching layer with the attention mechanism and a sentence-level matching layer. These layers are designed to capture essentially all phenomena of semantic matching. Evaluations are conducted on a public large-scale dataset for PI, Quora-QP, which contains more than 400,000 paraphrasing and non-paraphrasing question pairs from Quora.com. Experimental results show that GSMNN outperforms the state-of-the-art model by a substantial margin.
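The contrast the abstract draws between same-length alignment and matching across all granularities can be illustrated with a toy sketch. This is not the GSMNN architecture: averaging word vectors stands in for the convolutional filters, the attention mechanism is omitted, and the function names (`ngram_vectors`, `best_match`, `granularity_grid`) and the 2-d "embeddings" are all hypothetical choices made for illustration.

```python
import math
from itertools import product

def ngram_vectors(word_vecs, n):
    """Average each window of n consecutive word vectors.
    Assumption: averaging is a stand-in for a learned CNN filter."""
    dim = len(word_vecs[0])
    out = []
    for i in range(len(word_vecs) - n + 1):
        window = word_vecs[i:i + n]
        out.append([sum(v[d] for v in window) / n for d in range(dim)])
    return out

def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_match(units_a, units_b):
    """Highest similarity between any unit of A and any unit of B (0.0 if either is empty)."""
    return max((cosine(u, v) for u, v in product(units_a, units_b)), default=0.0)

def granularity_grid(sent_a, sent_b, max_n=2):
    """Map (n_a, n_b) to the best match between n_a-grams of A and n_b-grams of B."""
    return {
        (n_a, n_b): best_match(ngram_vectors(sent_a, n_a), ngram_vectors(sent_b, n_b))
        for n_a, n_b in product(range(1, max_n + 1), repeat=2)
    }

# Toy data: sentence A expresses in one word what sentence B spreads over two words.
sent_a = [[1.0, 1.0]]
sent_b = [[1.0, 0.0], [0.0, 1.0]]

grid = granularity_grid(sent_a, sent_b)
# Same-length alignment only looks at the diagonal: (1, 1), (2, 2), ...
same_length = {k: v for k, v in grid.items() if k[0] == k[1]}

for key in sorted(grid):
    print(key, round(grid[key], 3))
```

In this toy case the word-to-phrase cell `(1, 2)` scores about 1.0, while the best same-length cell `(1, 1)` scores only about 0.71, which is the kind of cross-granularity match that a same-length alignment would miss.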
Year: 2018
DOI: 10.1145/3269206.3272004
Venue: CIKM
Keywords: Paraphrase identification, CNN, semantic matching
Field: Automatic summarization, Plagiarism detection, Information retrieval, Computer science, Convolutional neural network, Paraphrase, Natural language, Natural language processing, Artificial intelligence, Artificial neural network, Syntax, Semantic matching
DocType: Conference
ISBN: 978-1-4503-6014-2
Citations: 2
PageRank: 0.39
References: 21
Authors: 5
Name | Order | Citations | PageRank
Miao Fan | 1 | 140 | 16.04
Wutao Lin | 2 | 2 | 0.39
Yue Feng | 3 | 55 | 16.15
Mingming Sun | 4 | 24 | 6.27
Ping Li | 5 | 1672 | 127.72