Title
Using NMT with Grammar Information and Self-taught Mechanism in Translating Chinese Symptom and Disease Terminologies.
Abstract
Neural Machine Translation (NMT) based on the encoder-decoder architecture is an approach to machine translation that has achieved promising results comparable to those of traditional approaches such as statistical machine translation. However, an NMT system usually needs a large parallel corpus to train the model, which is difficult to obtain in some specialized domains, e.g. symptom and disease terminologies. In this paper, we propose two approaches that make full use of source-side monolingual data to compensate for the lack of parallel corpora. The first approach uses the part-of-speech tags of source-side symptom and disease terminologies to capture their grammar information. The second approach employs a self-taught learning algorithm to generate additional synthetic parallel data. The proposed NMT model obtains significant improvements in translating symptom and disease terminologies from Chinese into English, gaining up to 2.13 BLEU points over the NMT baseline system.
Year: 2017
DOI: 10.1007/978-3-319-73618-1_65
Venue: Lecture Notes in Artificial Intelligence
Keywords: Neural Machine Translation, Seq2Seq model, Source-side monolingual data, Symptom and Disease terminologies
DocType: Conference
Volume: 10619
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name            Order  Citations  PageRank
Lu Zeng         1      6          1.94
Qi Wang         2      0          0.68
Lingfei Zhang   3      0          0.34