Learning Chinese-Japanese Bilingual Word Embedding by Using Common Characters. - Citegraph

Paper Info

Title
Learning Chinese-Japanese Bilingual Word Embedding by Using Common Characters.

Abstract
Bilingual word embedding, which maps the word embeddings of two languages into a single vector space, has been widely applied in machine translation, word sense disambiguation, and other tasks. However, no model for learning bilingual word embeddings has been universally accepted. In this work, we propose a novel model named CJ-BOC to learn Chinese-Japanese word embeddings. Because Chinese and Japanese share a large portion of common characters, we exploit these characters in our training process. We demonstrate the effectiveness of this approach through both theoretical and experimental study. To evaluate the performance of CJ-BOC, we conducted a comprehensive experiment, which reveals its speed advantage as well as the high quality of the acquired word embeddings.

Year	DOI	Venue
2016	10.1007/978-3-319-47650-6_7	Lecture Notes in Artificial Intelligence
Keywords	Field	DocType
Bilingual word embedding, Distributed representation, Common characters, Chinese-Japanese	Vector space, Computer science, Machine translation, Exploit, Speech recognition, Natural language processing, Artificial intelligence, Word embedding, Distributed representation, Word-sense disambiguation	Conference
Volume	ISSN	Citations
9983	0302-9743	0
PageRank	References	Authors
0.34	1	4

Authors (4 rows)

Name	Order	Citations	PageRank
Wang Jilei	1	0	0.34
Shiying Luo	2	5	3.44
Li Yanning	3	0	0.34
Xia Shu-Tao	4	342	75.29
