Title
Learning Chinese-Japanese Bilingual Word Embedding by Using Common Characters.
Abstract
Bilingual word embedding, which maps the word embeddings of two languages into a single vector space, has been widely applied in machine translation, word sense disambiguation, and other tasks. However, no model for learning bilingual word embeddings has been universally accepted. In this work, we propose a novel model named CJ-BOC to learn Chinese-Japanese word embeddings. Because Chinese and Japanese share a large portion of common characters, we exploit these characters in the training process. We demonstrate the effectiveness of this approach through both theoretical and experimental study. To evaluate CJ-BOC, we conduct a comprehensive experiment, which shows both its speed advantage and the high quality of the learned word embeddings.
Year: 2016
DOI: 10.1007/978-3-319-47650-6_7
Venue: Lecture Notes in Artificial Intelligence
Keywords: Bilingual word embedding, Distributed representation, Common characters, Chinese-Japanese
Field: Vector space, Computer science, Machine translation, Exploit, Speech recognition, Natural language processing, Artificial intelligence, Word embedding, Distributed representation, Word-sense disambiguation
DocType: Conference
Volume: 9983
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 1
Authors: 4
Name          Order  Citations  PageRank
Wang Jilei    1      0          0.34
Shiying Luo   2      5          3.44
Li Yanning    3      0          0.34
Xia Shu-Tao   4      342        75.29