Short Text Clustering with a Deep Multi-embedded Self-supervised Model - Citegraph

Paper Info

Title
Short Text Clustering with a Deep Multi-embedded Self-supervised Model

Abstract
Short text clustering is challenging in Natural Language Processing (NLP) because it is hard to learn discriminative representations from limited information. In this paper, fused multi-embedded features are employed to enhance the representations of short texts. A denoising autoencoder with an attention layer is then adopted to extract low-dimensional features from the multi-embeddings while resisting the disturbance of noisy texts. Furthermore, we propose a novel distribution estimation that jointly utilizes the soft cluster assignment and the prior target distribution transition to better fine-tune the encoder. Combining the above, we propose a deep multi-embedded self-supervised model (DMESSM) for short text clustering. In head-to-head comparisons on benchmark datasets, DMESSM outperforms state-of-the-art methods.
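
The abstract names a soft cluster assignment and a target distribution but gives no formulas. As a rough illustration only, the sketch below uses the widely known Deep Embedded Clustering (DEC) formulation: a Student's t kernel for the soft assignment and a squared, frequency-normalized target. This is an assumption, not the paper's novel distribution estimation, and the function names, array sizes, and `alpha` parameter are hypothetical.

```python
import numpy as np


def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment q[i, j] of embedding i to cluster j (DEC-style, assumed)."""
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Student's t kernel, then normalize each row into a distribution.
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Target p: square q, divide by soft cluster frequency, renormalize rows."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) summed over all samples; the usual self-training loss."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


# Toy data standing in for encoder outputs and cluster centers (hypothetical sizes).
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 4))          # 6 short-text embeddings, 4 dimensions
centroids = rng.normal(size=(3, 4))  # 3 cluster centers

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

In a DEC-style training loop, `p` would be held fixed for a number of steps while the encoder and centroids are updated to reduce `loss`; how DMESSM modifies this step is not stated in the abstract.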

Field	Value
Year	2021
DOI	10.1007/978-3-030-86383-8_12
Venue	ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2021, PT V
Keywords	Short text clustering, Autoencoder, Self-supervised clustering, Attention, Distribution estimation
DocType	Conference
Volume	12895
ISSN	0302-9743
Citations	0
PageRank	0.34
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Kai Zhang	1	0	0.68
Zheng Lian	2	12	8.33
Jiangmeng Li	3	0	1.69
Haichang Li	4	0	2.03
Xiaohui Hu	5	17	8.10
