Abstract |
---|
Chinese sentences are written as sequences of characters, which are elementary units of syntax and semantics. Characters are highly polysemous in forming words. We present a position-sensitive skip-gram model to learn multi-prototype Chinese character embeddings, and explore the usefulness of such character embeddings to Chinese NLP tasks. Evaluation on character similarity shows that multi-prototype embeddings are significantly better than a single-prototype baseline. In addition, used as features in the Chinese NER task, the embeddings result in a 1.74% F-score improvement over a state-of-the-art baseline. |
Year | Venue | Keywords |
---|---|---|
2016 | LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | embedding,multi-prototype,Chinese character |
Field | DocType | Citations |
---|---|---|
Embedding,Computer science,Speech recognition,Natural language processing,Artificial intelligence | Conference | 2 |
PageRank | References | Authors |
---|---|---|
0.37 | 16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yanan Lu | 1 | 97 | 4.02 |
Yue Zhang | 2 | 1364 | 114.17 |
Donghong Ji | 3 | 892 | 120.08 |