Title
Improved Spoken Uyghur Segmentation for Neural Machine Translation
Abstract
To increase vocabulary overlap in spoken Uyghur neural machine translation (NMT), we propose a novel method that enhances the commonly used subword-unit-based segmentation method. Specifically, we use a log-linear model as the main framework and integrate several features into it, including subword units, morphological information, bilingual word alignment, and a monolingual language model. Experimental results show that segmenting spoken Uyghur with the proposed method significantly improves spoken Uyghur-Chinese NMT performance, yielding gains of up to 1.52 BLEU.
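As a rough illustration of the log-linear framework the abstract mentions, the sketch below scores candidate segmentations of one word as a weighted sum of feature values and keeps the highest-scoring one. It is a minimal sketch, not the authors' implementation: the feature functions, the weights, the toy vocabulary and log-probabilities, and the example word are all assumptions made here for illustration, and the bilingual word-alignment feature is left out because it would need parallel data.

```python
import math
from typing import Callable, Dict, List

Segmentation = List[str]
Feature = Callable[[Segmentation], float]

# Toy resources (hypothetical, for illustration only).
SUBWORD_VOCAB = {"kitab", "lar", "ni", "kit"}
KNOWN_SUFFIXES = {"lar", "ni"}
UNIGRAM_LOGPROB = {"kitab": -2.0, "lar": -1.0, "ni": -1.0, "kit": -4.0}
UNKNOWN_LOGPROB = -10.0


def subword_feature(seg: Segmentation) -> float:
    """Fraction of pieces found in the subword vocabulary."""
    return sum(piece in SUBWORD_VOCAB for piece in seg) / len(seg)


def morph_feature(seg: Segmentation) -> float:
    """Number of pieces that match a known suffix."""
    return float(sum(piece in KNOWN_SUFFIXES for piece in seg))


def lm_feature(seg: Segmentation) -> float:
    """Unigram language-model log-probability of the piece sequence."""
    return sum(UNIGRAM_LOGPROB.get(piece, UNKNOWN_LOGPROB) for piece in seg)


FEATURES: Dict[str, Feature] = {
    "subword": subword_feature,
    "morph": morph_feature,
    "lm": lm_feature,
}

# Hypothetical weights; in practice they would be tuned on held-out data.
WEIGHTS: Dict[str, float] = {"subword": 1.0, "morph": 0.5, "lm": 0.1}


def loglinear_score(seg: Segmentation) -> float:
    """Unnormalized log-linear score: sum of weight * feature value."""
    return sum(WEIGHTS[name] * feature(seg) for name, feature in FEATURES.items())


def posterior(candidates: List[Segmentation]) -> List[float]:
    """Softmax over candidate scores (max subtracted for numerical stability)."""
    scores = [loglinear_score(seg) for seg in candidates]
    top = max(scores)
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def best_segmentation(candidates: List[Segmentation]) -> Segmentation:
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=loglinear_score)


CANDIDATES: List[Segmentation] = [
    ["kitab", "lar", "ni"],
    ["kit", "ablar", "ni"],
    ["kitablarni"],
]
BEST = best_segmentation(CANDIDATES)
PROBS = posterior(CANDIDATES)
```

Because the argmax depends only on the unnormalized score, the softmax in `posterior` is needed only when a probability per candidate is wanted, for example to inspect how strongly the model prefers one segmentation over another.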
Year
2018
DOI
10.1109/ICTAI.2018.00018
Venue
2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
Keywords
spoken Uyghur segmentation, neural machine translation, BPE, morphologically-rich, low-resource language
Field
Task analysis, Segmentation, Computer science, Machine translation, Feature extraction, Natural language processing, Artificial intelligence, Vocabulary, Language model, Machine learning
DocType
Conference
ISSN
1082-3409
ISBN
978-1-5386-7450-5
Citations
0
PageRank
0.34
References
4
Authors
5
Name            Order  Citations  PageRank
Chenggang Mi    1      0          4.39
Yating Yang     2      1          5.14
Zhou Xi         3      104        5.17
Lei Wang        4      6          6.89
Tonghai Jiang   5      1          4.75