Title
Pronounce differently, mean differently: A multi-tagging-scheme learning method for Chinese NER integrated with lexicon and phonetic features
Abstract
Named Entity Recognition (NER) aims to automatically extract specific entities from unstructured text. Compared with English NER, Chinese NER is more challenging in recognizing entity boundaries because there are no explicit delimiters between Chinese characters. However, most previous research focused on character-level semantic information of the Chinese language and ignored the importance of phonetic characteristics. To address these issues, we integrate phonetic features of Chinese characters with lexicon information to help disambiguate entity boundaries, fully exploring the potential of Chinese as a pictophonetic language. In addition, we propose a novel multi-tagging-scheme learning method, based on the multi-task learning paradigm, that separately annotates the segmentation information of entities and their corresponding entity types to alleviate the data sparsity and error propagation problems of previous tagging schemes. Extensive experiments on four Chinese NER benchmark datasets (OntoNotes4.0, MSRA, Resume, and Weibo) show that the proposed method consistently outperforms existing state-of-the-art baseline models. Ablation experiments further demonstrate that introducing the phonetic feature and the multi-tagging scheme has a significant positive effect on Chinese NER performance.
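The abstract does not list the exact tag sets, so the following is only a minimal sketch of the general idea of annotating segmentation and entity type separately. It assumes a BMES-style combined scheme (for example `B-LOC`); the helper names `split_tags` and `merge_tags` and the example sentence are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def split_tags(tags: List[str]) -> Tuple[List[str], List[str]]:
    """Split combined tags such as 'B-LOC' into two parallel sequences.

    Assumption: tags are either 'O' or '<segment>-<type>' (BMES-style).
    """
    seg_tags, type_tags = [], []
    for tag in tags:
        if tag == "O":
            # Non-entity characters are 'O' in both schemes.
            seg_tags.append("O")
            type_tags.append("O")
        else:
            seg, _, ent_type = tag.partition("-")
            seg_tags.append(seg)        # boundary information only
            type_tags.append(ent_type)  # entity category only
    return seg_tags, type_tags


def merge_tags(seg_tags: List[str], type_tags: List[str]) -> List[str]:
    """Recombine the two sequences into combined tags."""
    return [
        "O" if s == "O" or t == "O" else f"{s}-{t}"
        for s, t in zip(seg_tags, type_tags)
    ]


# Hypothetical example: two adjacent location entities, one tag per character.
chars = list("南京市长江大桥")
combined = ["B-LOC", "M-LOC", "E-LOC", "B-LOC", "M-LOC", "M-LOC", "E-LOC"]
seg, typ = split_tags(combined)
```

With separate sequences, the segmentation label set stays small regardless of how many entity types a dataset defines, which is one plausible reading of how such a split could reduce label sparsity; the paper itself should be consulted for the actual formulation.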
Year: 2022
DOI: 10.1016/j.ipm.2022.103041
Venue: Information Processing & Management
Keywords: Named entity recognition, Phonetic feature, Lexicon feature, Multiple tagging schemes, Natural language processing, Information extraction
DocType: Journal
Volume: 59
Issue: 5
ISSN: 0306-4573
Citations: 0
PageRank: 0.34
References: 0
Authors: 7
Name             Order  Citations  PageRank
Chengcheng Mai   1      0          0.68
Jian Liu         2      0          0.34
Mengchuan Qiu    3      0          1.35
Kaiwen Luo       4      0          0.68
Ziyan Peng       5      0          0.68
Chunfeng Yuan    6      418        30.84
Huang, Yihua     7      167        22.07