Title
Concept-Based Label Distribution Learning for Text Classification
Abstract
Text classification is a crucial task in data mining and artificial intelligence. In recent years, deep learning-based text classification methods have made great progress. These methods supervise model training by representing each label as a one-hot vector. However, the one-hot label representation cannot adequately reflect the relation between an instance and the labels, as labels are often not completely independent, and an instance may be associated with multiple labels in practice. Simply representing the labels as one-hot vectors makes the model overconfident and makes it difficult to distinguish easily confused labels. In this paper, we propose a simulated label distribution method based on concepts (SLDC) to tackle this problem. This method captures the overlap between the labels by computing the similarity between an instance and the labels, and generates a new simulated label distribution to assist model training. In particular, we incorporate conceptual information from a knowledge base into the representations of instances and labels to address the surface mismatching problem that arises when instances and labels are compared for similarity. Moreover, to make full use of both the simulated label distribution and the original label vector, we set up a multi-loss function to supervise the training process. Extensive experiments demonstrate the effectiveness of SLDC on five complex text classification datasets. Further experiments also verify that SLDC is especially helpful for datasets with easily confused labels.
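The general idea in the abstract (compare an instance with every label, turn the similarities into a soft label distribution, then train against both that distribution and the one-hot label) can be illustrated with a minimal NumPy sketch. This is not the authors' SLDC implementation: the cosine similarity, the softmax temperature, the mixing weight `mix`, the cross-entropy plus KL form of the loss, and the weight `alpha` are all assumptions chosen for illustration, and the concept knowledge base and graph attention components are omitted entirely.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def simulated_label_distribution(instance_vecs, label_vecs, one_hot,
                                 temperature=1.0, mix=0.5):
    """Build a soft label distribution from instance-label similarity.

    Assumptions (not taken from the paper): cosine similarity, a softmax
    with a temperature, and a convex mix with the one-hot label so that
    the true label keeps the largest probability.
    """
    inst = instance_vecs / np.linalg.norm(instance_vecs, axis=1, keepdims=True)
    lab = label_vecs / np.linalg.norm(label_vecs, axis=1, keepdims=True)
    sim = inst @ lab.T                       # shape: (n_instances, n_labels)
    sim_dist = softmax(sim / temperature)    # each row sums to 1
    return mix * one_hot + (1.0 - mix) * sim_dist


def multi_loss(pred_probs, one_hot, sld, alpha=0.5, eps=1e-12):
    """Hypothetical two-term loss: cross-entropy against the one-hot label
    plus alpha times KL(sld || pred) against the simulated distribution."""
    p = np.clip(pred_probs, eps, 1.0)
    q = np.clip(sld, eps, 1.0)
    ce = -np.sum(one_hot * np.log(p), axis=1)
    kl = np.sum(q * (np.log(q) - np.log(p)), axis=1)
    return float(np.mean(ce + alpha * kl))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    instances = rng.normal(size=(4, 8))      # 4 instance embeddings
    labels = rng.normal(size=(3, 8))         # 3 label embeddings
    y = np.eye(3)[[0, 2, 1, 0]]              # one-hot targets
    sld = simulated_label_distribution(instances, labels, y)
    preds = softmax(rng.normal(size=(4, 3)))
    print(sld.round(3))
    print(multi_loss(preds, y, sld))
```

With `mix=0.5` the true label always keeps the largest share of the simulated distribution, while the remaining probability mass is spread over the other labels according to how similar they are to the instance.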
Year: 2022
DOI: 10.1007/s44196-022-00144-y
Venue: International Journal of Computational Intelligence Systems
Keywords: Text classification, Label distribution learning, Concept knowledge base, Graph attention network
DocType: Journal
Volume: 15
Issue: 1
ISSN: 1875-6883
Citations: 0
PageRank: 0.34
References: 4
Authors: 5
Name
Order
Citations
PageRank
Hui Li181492.33
Guimin Huang269.26
Li Yiqun300.34
Zhang Xiaowei400.34
Wang Yabing500.34