Abstract |
---|
To increase vocabulary overlap in spoken Uyghur neural machine translation (NMT), we propose a novel method that enhances the commonly used subword-unit-based segmentation method. In particular, we use a log-linear model as the main framework and integrate several features into it, including subword units, morphological information, bilingual word alignment, and a monolingual language model. Experimental results show that segmenting spoken Uyghur with the proposed method significantly improves spoken Uyghur-Chinese NMT performance, yielding improvements of up to 1.52 BLEU. |
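
The abstract names a log-linear model as the framework that combines the segmentation features. The record does not give the feature definitions, weights, or training procedure, so what follows is only a minimal sketch of the general log-linear form: each candidate segmentation gets a score equal to a weighted sum of feature values, and the highest-scoring candidate is selected. The function names, the two stand-in features, the weights, and the toy word and suffix set are all hypothetical illustrations, not the authors' implementation.

```python
import math
from typing import Callable, Dict, List, Sequence

Segmentation = Sequence[str]
FeatureFn = Callable[[str, Segmentation], float]

# Hypothetical toy suffix inventory, standing in for real morphological knowledge.
TOY_SUFFIXES = {"ler"}


def subword_feature(word: str, seg: Segmentation) -> float:
    """Stand-in subword feature: prefers segmentations with fewer pieces."""
    return -float(len(seg))


def morph_feature(word: str, seg: Segmentation) -> float:
    """Stand-in morphological feature: fires when the last piece is a known suffix."""
    return 1.0 if len(seg) > 1 and seg[-1] in TOY_SUFFIXES else 0.0


def log_linear_score(word: str, seg: Segmentation,
                     features: Dict[str, FeatureFn],
                     weights: Dict[str, float]) -> float:
    """Unnormalized log-linear score: sum over features of weight * feature value."""
    return sum(weights[name] * fn(word, seg) for name, fn in features.items())


def best_segmentation(word: str, candidates: List[Segmentation],
                      features: Dict[str, FeatureFn],
                      weights: Dict[str, float]) -> Segmentation:
    """Return the candidate with the highest log-linear score."""
    return max(candidates,
               key=lambda seg: log_linear_score(word, seg, features, weights))


def posterior(word: str, candidates: List[Segmentation],
              features: Dict[str, FeatureFn],
              weights: Dict[str, float]) -> List[float]:
    """Normalize the scores into a distribution over candidates (softmax)."""
    scores = [log_linear_score(word, seg, features, weights) for seg in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy inputs; the weights are arbitrary illustrative values, not trained ones.
FEATURES: Dict[str, FeatureFn] = {"subword": subword_feature, "morph": morph_feature}
WEIGHTS: Dict[str, float] = {"subword": 0.5, "morph": 2.0}
WORD = "mektepler"
CANDIDATES: List[Segmentation] = [["mektepler"], ["mektep", "ler"], ["mek", "tepler"]]

chosen = best_segmentation(WORD, CANDIDATES, FEATURES, WEIGHTS)
probs = posterior(WORD, CANDIDATES, FEATURES, WEIGHTS)
```

In a fuller system, the bilingual word alignment and monolingual language model features mentioned in the abstract would enter the same sum as further `FeatureFn` entries, each with its own weight.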
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/ICTAI.2018.00018 | 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI) |
Keywords | Field | DocType |
---|---|---|
spoken Uyghur segmentation, neural machine translation, BPE, morphologically-rich, low-resource language | Task analysis, Segmentation, Computer science, Machine translation, Feature extraction, Natural language processing, Artificial intelligence, Vocabulary, Language model, Machine learning | Conference |
ISSN | ISBN | Citations |
---|---|---|
1082-3409 | 978-1-5386-7450-5 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 4 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chenggang Mi | 1 | 0 | 4.39 |
Yating Yang | 2 | 1 | 5.14 |
Zhou Xi | 3 | 104 | 5.17 |
Lei Wang | 4 | 6 | 6.89 |
Tonghai Jiang | 5 | 1 | 4.75 |