Title
Weakly-supervised Text Classification with Wasserstein Barycenters Regularization
Abstract
Weakly-supervised text classification aims to train predictive models with unlabeled texts and a few representative words per class, referred to as category words, rather than labeled texts. Such weak supervision is much cheaper and easier to collect in real-world scenarios. To address this task, we propose a novel deep classification model, namely Weakly-supervised Text Classification with Wasserstein Barycenter Regularization (WTC-WBR). Specifically, we initialize the pseudo-labels of texts using category word occurrences, and formulate a weakly self-training framework that iteratively updates the weakly-supervised targets by combining the pseudo-labels with sharpened predictions. Most importantly, we suggest a Wasserstein barycenter regularization with the weakly-supervised targets on the deep feature space. The intuition is that each text tends to lie close to the Wasserstein barycenter of its class as indicated by the weakly-supervised targets. A further benefit is that the regularization captures the geometric information of the deep feature space, boosting the discriminative power of deep features. Experimental results demonstrate that WTC-WBR outperforms existing weakly-supervised baselines and achieves performance comparable to semi-supervised and supervised baselines.
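The pseudo-labeling step described in the abstract (initializing label distributions from category-word occurrences, then sharpening predictions during self-training) can be sketched roughly as follows. All function names, the smoothing constant, and the temperature parameter are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def init_pseudo_labels(texts, category_words):
    """Assign each text a pseudo-label distribution proportional to how
    often each class's category words occur in it (illustrative sketch)."""
    counts = np.array(
        [[sum(tok in words for tok in text.split()) for words in category_words]
         for text in texts],
        dtype=float,
    )
    counts += 1e-8  # avoid all-zero rows for texts with no category words
    return counts / counts.sum(axis=1, keepdims=True)

def sharpen(probs, temperature=0.5):
    """Sharpen predicted distributions with a temperature < 1, a common
    self-training trick (assumed here, not the paper's exact update rule)."""
    p = probs ** (1.0 / temperature)
    return p / p.sum(axis=1, keepdims=True)

# Toy example with two classes (sports vs. finance) and hypothetical words.
texts = ["the match ended with a late goal", "shares fell as markets opened"]
category_words = [{"goal", "match", "team"}, {"shares", "markets", "stocks"}]
pseudo = init_pseudo_labels(texts, category_words)
targets = sharpen(pseudo)
```

In the paper's framework these targets would then be combined with model predictions at each self-training iteration and fed into the Wasserstein barycenter regularizer over the deep feature space.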
Year
2022
DOI
10.24963/ijcai.2022/468
Venue
International Joint Conference on Artificial Intelligence
Keywords
Machine Learning: Classification, Machine Learning: Weakly Supervised Learning, Natural Language Processing: Text Classification
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
4
Name            Order  Citations  PageRank
Jihong OuYang   1      94         15.66
Yiming Wang     2      0          0.34
Ximing Li       3      0          0.34
Changchun Li    4      0          1.01