Title
A Semantic Concept Based Unknown Words Processing Method in Neural Machine Translation.
Abstract
The problem of unknown words in neural machine translation (NMT) not only affects the semantic integrity of the source sentences but also adversely affects the generation of the target sentences. Traditional methods usually replace unknown words according to word-vector similarity, but these approaches have difficulty handling rare words and polysemous words. Therefore, this paper proposes a new method for processing unknown words in NMT based on the semantic concepts of the source language. First, we use the semantic concepts in a source-language semantic dictionary to find candidate in-vocabulary words. Second, we propose a method that calculates semantic similarity by integrating the source language model and the semantic concept network, in order to obtain the best replacement word. Experiments on an English-to-Chinese translation task demonstrate that our proposed method achieves an improvement of more than 2.6 BLEU points over the conventional NMT method. Compared with the traditional method based on word-vector similarity, our method also obtains an improvement of nearly 0.8 BLEU points.
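The two steps described in the abstract (candidate lookup through shared semantic concepts, then ranking by a score that combines concept similarity with a language-model score) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy dictionary `CONCEPTS`, the vocabulary `VOCAB`, the bigram table `BIGRAM_PROB`, the Jaccard similarity, and the interpolation weight `alpha` are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy resources (assumptions): the abstract does not describe the
# actual semantic dictionary, language model, or weighting scheme.
CONCEPTS: Dict[str, Set[str]] = {
    "terrier": {"dog", "animal"},
    "dog": {"dog", "animal"},
    "cat": {"cat", "animal"},
    "car": {"vehicle"},
}
VOCAB: Set[str] = {"the", "dog", "cat", "car", "barked"}
BIGRAM_PROB: Dict[Tuple[str, str], float] = {
    ("the", "dog"): 0.4, ("dog", "barked"): 0.5,
    ("the", "cat"): 0.4, ("cat", "barked"): 0.01,
}
FLOOR = 1e-4  # probability assigned to unseen bigrams (assumption)


def candidates(word: str) -> List[str]:
    """Step 1: in-vocabulary words sharing at least one concept with `word`."""
    concepts = CONCEPTS.get(word, set())
    return sorted(w for w in VOCAB if CONCEPTS.get(w, set()) & concepts)


def concept_similarity(a: str, b: str) -> float:
    """Jaccard overlap of concept sets (a stand-in for a concept-network measure)."""
    ca, cb = CONCEPTS.get(a, set()), CONCEPTS.get(b, set())
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0


def lm_score(left: str, cand: str, right: str) -> float:
    """Bigram score of the candidate in its source-side context."""
    return BIGRAM_PROB.get((left, cand), FLOOR) * BIGRAM_PROB.get((cand, right), FLOOR)


def replace_unknowns(tokens: List[str], alpha: float = 0.5) -> List[str]:
    """Step 2: replace each out-of-vocabulary token with its best-scoring candidate."""
    out = []
    for i, tok in enumerate(tokens):
        if tok in VOCAB:
            out.append(tok)
            continue
        left = tokens[i - 1] if i > 0 else "<s>"
        right = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
        cands = candidates(tok)
        if not cands:
            out.append("<unk>")  # no concept-level candidate: fall back to <unk>
            continue
        best = max(
            cands,
            key=lambda c: alpha * concept_similarity(tok, c)
            + (1 - alpha) * lm_score(left, c, right),
        )
        out.append(best)
    return out


print(replace_unknowns(["the", "terrier", "barked"]))
```

In this toy setting, "terrier" is out of vocabulary, and both "dog" and "cat" share a concept with it; "dog" wins because it overlaps more in concepts and fits the context better under the bigram table. A word with no dictionary entry falls back to `<unk>`.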
Year: 2017
DOI: 10.1007/978-3-319-73618-1_20
Venue: Lecture Notes in Artificial Intelligence
Keywords: NMT, Unknown words, Semantic dictionary, Semantic concept
DocType: Conference
Volume: 10619
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name
Order
Citations
PageRank
Shaotong Li101.35
Jin An Xu21524.50
Guoyi Miao301.69
Yujie Zhang425152.63
Yufeng Chen53816.55