Title: Multi-task Hierarchical Cross-Attention Network for Multi-label Text Classification
Abstract
As the quantity of scientific publications grows rapidly, manual indexing of the literature becomes increasingly complex, and researchers have turned to Hierarchical Multi-label Text Classification (HMTC) to classify scientific literature automatically. Despite many advances, several problems in HMTC remain unsolved, such as the difficulty of capturing the dependencies among hierarchical labels and the correlations between labels and text, and the limited adaptability of models to specialized text. In this paper, we propose a novel framework called Multi-task Hierarchical Cross-Attention Network (MHCAN) for multi-label text classification. Specifically, we introduce a cross-attention mechanism to fully incorporate text representations and hierarchical labels organized in a directed acyclic graph (DAG) structure, and design an iterative hierarchical-attention module to capture the dependencies between levels. Afterwards, our framework jointly optimizes a weighted sum of the losses at each level. To improve the model's adaptability to domain data, we also continue pre-training SciBERT on unlabeled data and introduce adversarial training. Our framework ranks 2nd in NLPCC 2022 Shared Task 5 Track 1 (Multi-label Classification Model for English Scientific Literature). The experimental results show the effectiveness of the modules applied in this framework.
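The abstract describes cross-attention between text representations and per-level label embeddings, an iterative module passing information between hierarchy levels, and a weighted joint loss. A minimal sketch of that idea is shown below; all module and parameter names (e.g. `MHCASketch`, `labels_per_level`, `level_weights`) are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class MHCASketch(nn.Module):
    """Illustrative sketch (not the authors' code): label embeddings at each
    hierarchy level attend over encoder token states via cross-attention; a
    summary of each level conditions the next level's queries; training
    minimizes a weighted sum of per-level multi-label losses."""

    def __init__(self, hidden_dim, labels_per_level, num_heads=4, level_weights=None):
        super().__init__()
        # one learnable embedding matrix per hierarchy level
        self.label_embeds = nn.ParameterList(
            [nn.Parameter(torch.randn(n, hidden_dim) * 0.02) for n in labels_per_level]
        )
        # cross-attention: label embeddings are queries, token states are keys/values
        self.cross_attn = nn.ModuleList(
            [nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
             for _ in labels_per_level]
        )
        # one scalar score per label from its attended representation
        self.scorers = nn.ModuleList(
            [nn.Linear(hidden_dim, 1) for _ in labels_per_level]
        )
        self.level_weights = (level_weights if level_weights is not None
                              else [1.0] * len(labels_per_level))
        self.bce = nn.BCEWithLogitsLoss()

    def forward(self, token_states):
        # token_states: (batch, seq_len, hidden) from an encoder such as SciBERT
        batch = token_states.size(0)
        prev_summary = None
        logits = []
        for emb, attn, scorer in zip(self.label_embeds, self.cross_attn, self.scorers):
            q = emb.unsqueeze(0).expand(batch, -1, -1)    # (batch, n_labels, hidden)
            if prev_summary is not None:
                q = q + prev_summary                      # pass info down the hierarchy
            out, _ = attn(q, token_states, token_states)  # label-aware text features
            logits.append(scorer(out).squeeze(-1))        # (batch, n_labels)
            prev_summary = out.mean(dim=1, keepdim=True)  # summary for the next level
        return logits

    def loss(self, logits, targets):
        # weighted sum of per-level binary cross-entropy losses
        return sum(w * self.bce(l, t)
                   for w, l, t in zip(self.level_weights, logits, targets))
```

This keeps the multi-task flavor of the abstract: each level is its own multi-label task, and the level weights let the joint objective emphasize coarser or finer layers of the hierarchy.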
Year: 2022
DOI: 10.1007/978-3-031-17189-5_13
Venue: NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT II
Keywords: Hierarchical multi-label text classification, Multi-task learning, Attention mechanism
DocType: Conference
Volume: 13552
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 0
Authors: 8
Name            Order  Citations  PageRank
Junyu Lu        1      0          0.34
Hao Zhang       2      0          0.34
Zhexu Shen      3      0          0.34
Kaiyuan Shi     4      0          0.68
Liang Yang      5      0          0.34
Bo Xu           6      0          0.34
Shaowu Zhang    7      1          1.36
Hongfei Lin     8      0          2.37