Title
A novel word clustering algorithm based on latent semantic analysis.
Abstract
A new approach is proposed for the clustering of words in a given vocabulary. The method is based on a paradigm first formulated in the context of information retrieval, called latent semantic analysis. This paradigm leads to a parsimonious vector representation of each word in a suitable vector space, where familiar clustering techniques can be applied. The distance measure selected in this space arises naturally from the problem formulation. Preliminary experiments indicate that the clusters produced are intuitively satisfactory. Because these clusters are semantic in nature, this approach may prove useful as a complement to conventional class-based statistical language modeling techniques.
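The pipeline outlined in the abstract (word vectors from latent semantic analysis, a distance in that space, then a standard clustering step) might be sketched as follows. The abstract does not state the matrix weighting, the exact distance measure, or the clustering algorithm, so everything here is an assumption made for illustration: a raw word-by-document count matrix, a truncated SVD, cosine distance between singular-value-scaled word vectors, and a simple threshold-based single-link grouping. The toy vocabulary, the counts, and the function names are hypothetical.

```python
import numpy as np


def lsa_word_vectors(counts, rank):
    """Map each word (row of counts) to a rank-dimensional vector via truncated SVD."""
    U, s, _ = np.linalg.svd(counts, full_matrices=False)
    # Assumption: scale left singular vectors by their singular values.
    return U[:, :rank] * s[:rank]


def cosine_distance_matrix(vectors):
    """Pairwise cosine distance (1 - cosine similarity) between row vectors."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.maximum(norms, 1e-12)
    return 1.0 - unit @ unit.T


def cluster_by_threshold(dist, threshold):
    """Single-link grouping: words joined by a chain of distances below threshold share a label."""
    n = dist.shape[0]
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i, j] < threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Hypothetical toy data: rows are words, columns are documents.
words = ["stock", "bond", "goal", "team"]
counts = np.array([
    [3.0, 2.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
])

vectors = lsa_word_vectors(counts, rank=2)
dist = cosine_distance_matrix(vectors)
labels = cluster_by_threshold(dist, threshold=0.5)

for word, label in zip(words, labels):
    print(word, label)
```

On this toy matrix, "stock" and "bond" fall into one group and "goal" and "team" into another, because each pair occurs in the same documents. A real vocabulary would need a much larger matrix, a chosen rank, and a clustering method suited to its size.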
Year
1996
DOI
10.1109/ICASSP.1996.540318
Venue
ICASSP
Keywords
preliminary experiment, novel word, suitable vector space, information retrieval, modeling technique, distance measure, latent semantic analysis, conventional class-based statistical language, familiar clustering technique, parsimonious vector representation, new approach, algorithm design and analysis, databases, computer science, stochastic processes, speech recognition, vector space, natural languages, clustering algorithms, probability
Field
Clustering high-dimensional data, Pattern recognition, Correlation clustering, Computer science, Explicit semantic analysis, Document-term matrix, Probabilistic latent semantic analysis, Artificial intelligence, Natural language processing, Cluster analysis, Latent semantic analysis, Semantic computing
DocType
Conference
ISBN
0-7803-3192-3
Citations
54
PageRank
10.32
References
5
Authors
5
Name              Order  Citations  PageRank
J. R. Bellegarda  1      266        47.41
J. W. Butzberger  2      54         10.32
Yen-Lu Chow       3      148        70.09
N. B. Coccaro     4      54         10.32
D. Naik           5      54         10.32