Title
Pretraining Multi-modal Representations for Chinese NER Task with Cross-Modality Attention
Abstract
Named Entity Recognition (NER) aims to identify pre-defined entities in unstructured text. Compared with English NER, Chinese NER faces additional challenges: ambiguity in entity boundary recognition, because there are no explicit delimiters between Chinese characters, and the out-of-vocabulary (OOV) problem caused by rare Chinese characters. However, previous studies ignore two important features specific to the Chinese language: glyphs and phonetics, which carry rich semantic information. To overcome these issues by exploiting the linguistic potential of Chinese as a logographic language, we present MPM-CNER (short for Multi-modal Pretraining Model for Chinese NER), a model for learning multi-modal representations of Chinese semantics, glyphs, and phonetics via four pretraining tasks: Radical Consistency Identification (RCI), Glyph Image Classification (GIC), Phonetic Consistency Identification (PCI), and Phonetic Classification Modeling (PCM). In addition, a novel cross-modality attention mechanism is proposed to fuse these multi-modal features for further improvement. Experimental results show that our method outperforms state-of-the-art baseline methods on four benchmark datasets, and an ablation study verifies the effectiveness of the pre-trained multi-modal representations.
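The abstract does not give architectural details of the cross-modality attention, so the following is only a minimal illustrative sketch in PyTorch of how semantic, glyph, and phonetic representations could be fused with attention. The module name, dimensions, and fusion strategy (semantic states as queries, glyph and phonetic states as keys/values) are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of cross-modality attention fusion (not the paper's code).
import torch
import torch.nn as nn


class CrossModalityAttention(nn.Module):
    """Fuse semantic, glyph, and phonetic character representations.

    Semantic states act as queries; glyph and phonetic states act as
    keys/values, so each character can selectively absorb visual and
    phonetic cues (assumed design for illustration only).
    """

    def __init__(self, d_model: int = 768, n_heads: int = 8):
        super().__init__()
        self.glyph_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.phon_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.fuse = nn.Linear(3 * d_model, d_model)

    def forward(self, semantic, glyph, phonetic):
        # semantic / glyph / phonetic: (batch, seq_len, d_model)
        g, _ = self.glyph_attn(semantic, glyph, glyph)       # attend to glyph features
        p, _ = self.phon_attn(semantic, phonetic, phonetic)  # attend to phonetic features
        fused = self.fuse(torch.cat([semantic, g, p], dim=-1))
        return fused  # would feed a downstream NER tagging layer


if __name__ == "__main__":
    batch, seq_len, d = 2, 16, 768
    layer = CrossModalityAttention(d_model=d)
    sem, gly, pho = (torch.randn(batch, seq_len, d) for _ in range(3))
    print(layer(sem, gly, pho).shape)  # torch.Size([2, 16, 768])
```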
Year
2022
DOI
10.1145/3488560.3498450
Venue
WSDM
Keywords
Chinese named entity recognition, multi-modal representations, pre-training model, cross-modality attention
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
7
Name              Order  Citations  PageRank
Chengcheng Mai    1      0          0.68
Mengchuan Qiu     2      0          1.35
Kaiwen Luo        3      0          0.68
Ziyan Peng        4      0          0.68
J. Liu            5      64         15.00
Chunfeng Yuan     6      5          6.90
Yihua Huang       7      8          6.61