Title
Pairwise clustering with t-PLSI
Abstract
Over the past decade, Probabilistic Latent Semantic Indexing (PLSI) has become an important modeling technique, widely used in clustering and graph partitioning. However, the original PLSI is designed for multinomial data and may not handle other data types. To overcome this restriction, we generalize PLSI to the t-exponential family using a recently proposed information criterion called the t-divergence. The t-divergence is more flexible than the KL-divergence used in PLSI, so it can accommodate more types of noise in the data. To optimize the generalized learning objective, we propose a Majorization-Minimization algorithm that multiplicatively updates the factorizing matrices. The new method is evaluated on pairwise clustering tasks. Experimental results on real-world datasets show that PLSI with t-divergence can improve clustering purity on certain datasets.
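For orientation, the sketch below illustrates two ingredients the abstract mentions: multiplicative updates of non-negative factor matrices, and purity as the clustering score. It is not the t-divergence algorithm of the paper. It assumes the classical generalized KL objective with the standard Lee-Seung multiplicative updates, which correspond to the KL-based PLSI setting the paper generalizes. The function names `kl_nmf`, `generalized_kl`, and `purity` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def generalized_kl(X, Y, eps=1e-12):
    """Generalized KL divergence D(X || Y) for non-negative arrays."""
    return float(np.sum(X * np.log((X + eps) / (Y + eps)) - X + Y))


def kl_nmf(X, k, n_iter=200, eps=1e-12, seed=0):
    """Factorize X ~= W @ H with Lee-Seung multiplicative updates.

    Assumption: this minimizes the classical generalized KL divergence,
    NOT the t-divergence objective proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        R = X / (W @ H + eps)                      # element-wise ratio X / (WH)
        W *= (R @ H.T) / (H.sum(axis=1) + eps)     # update W, H held fixed
        R = X / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)  # update H, W held fixed
    return W, H


def purity(labels_true, labels_pred):
    """Fraction of samples that belong to the majority class of their cluster."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    total = 0
    for c in np.unique(labels_pred):
        members = labels_true[labels_pred == c]
        _, counts = np.unique(members, return_counts=True)
        total += counts.max()
    return total / labels_true.size


if __name__ == "__main__":
    # Toy pairwise similarity matrix with two blocks plus small noise
    # (illustrative data only, not taken from the paper).
    rng = np.random.default_rng(1)
    S = np.kron(np.eye(2), np.ones((3, 3))) + 0.01 * rng.random((6, 6))
    W, H = kl_nmf(S, k=2)
    clusters = W.argmax(axis=1)  # assign each item to its dominant factor
    print("divergence:", generalized_kl(S, W @ H))
    print("purity:", purity([0, 0, 0, 1, 1, 1], clusters))
```

Because the updates only multiply by non-negative ratios, `W` and `H` stay non-negative without any projection step, which is the usual reason multiplicative rules are preferred for this family of objectives.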
Year
2012
DOI
10.1007/978-3-642-33266-1_51
Venue
ICANN (2)
Keywords
real-world datasets, data type, multinomial data, pairwise clustering, certain datasets, original plsi, clustering performance, probabilistic latent semantic indexing, pairwise clustering task, majorization-minimization algorithm, approximation, divergence, clustering
Field
Pairwise comparison, Data mining, Pattern recognition, Computer science, Matrix (mathematics), Multinomial distribution, Data type, Artificial intelligence, Probabilistic latent semantic analysis, Graph partition, Cluster analysis, Machine learning
DocType
Conference
Citations
0
PageRank
0.34
References
8
Authors
4
Name            Order   Citations   PageRank
He Zhang        1       67          6.58
Tele Hao        2       49          3.25
Zhirong Yang    3       289         17.27
Erkki Oja       4       6701        797.08