Title
Learning Chinese Word Embeddings from Character Structural Information
Abstract
Learning word embeddings is a fundamental task in natural language processing. Unlike English, Chinese has subword units, such as characters, radicals, and components, that carry rich semantic information which can be used to enhance word embeddings. However, existing methods neglect how much each subword unit contributes semantically to the word it belongs to. In this work, we employ an attention mechanism to capture the semantic structure of Chinese words and propose a novel framework, the Attention-based multi-Layer Word Embedding model (ALWE). We also design an asynchronous strategy for updating the embeddings and attention efficiently. Our model learns to share subword information between distinct words selectively and adaptively. Experimental results on word similarity, word analogy, and text classification tasks show that the proposed model outperforms all baselines, especially for infrequent words. Qualitative analysis further demonstrates the superiority of ALWE.
Year
2020
DOI
10.1016/j.csl.2019.101031
Venue
Computer Speech & Language
Keywords
Distributed word representation, Attention mechanism, Chinese semantic structure
Field
Asynchronous communication, Embedding, Computer science, Speech recognition, Semantic information, Word embedding, Analogy
DocType
Journal
Volume
60
ISSN
0885-2308
Citations
1
PageRank
0.34
References
0
Authors
5
Name
Order
Citations
PageRank
Bing Ma110911.11
Qi Qi221056.01
Jianxin Liao345782.08
Haifeng Sun46827.77
J. Wang547995.23