Title
A new vector space model exploiting semantic correlations of social annotations for web page clustering
Abstract
Text clustering can effectively improve the search results and user experience of information retrieval systems. Traditional text clustering approaches are based on the vector space model, in which a document is represented as a vector using a term-frequency-based weighting scheme. The main disadvantage of this model is that it cannot fully exploit semantic correlations between social annotations and document contents, because a term-frequency-based weighting scheme captures only the number of occurrences of terms in the document. However, the social annotations of web pages carry fundamental and valuable semantic information and can therefore be used to improve information retrieval systems. In this paper, we investigate and evaluate several extended vector space models that combine social annotations and web page text. In particular, we propose a novel vector space model that computes the semantic correlations between social annotations and web page words. Our experiments show that, compared with other vector space models, using semantic correlations between social tags and web page words improves clustering accuracy, increasing the RI score by 4% to 7%.
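The abstract does not say how the semantic correlation is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical correlation (the fraction of pages carrying a tag that also contain a word), adds that value to the plain term-frequency weight, and assumes that "RI" denotes the Rand index. All function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def tf_vector(tokens, vocab):
    """Plain term-frequency vector: one raw count per vocabulary word."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def tag_word_correlation(docs, tags_per_doc, vocab, tag_vocab):
    """Hypothetical correlation (an assumption, not taken from the paper):
    the fraction of pages annotated with tag t that also contain word w."""
    corr = {}
    for t in tag_vocab:
        tagged = [set(d) for d, tags in zip(docs, tags_per_doc) if t in tags]
        for w in vocab:
            hits = sum(1 for d in tagged if w in d)
            corr[(t, w)] = hits / len(tagged) if tagged else 0.0
    return corr


def extended_vector(tokens, tags, vocab, corr):
    """Term frequency plus the correlation contributed by the page's tags."""
    tf = tf_vector(tokens, vocab)
    return [
        tf[i] + sum(corr.get((t, w), 0.0) for t in tags)
        for i, w in enumerate(vocab)
    ]


def rand_index(labels_true, labels_pred):
    """Rand index: share of item pairs on which two clusterings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Toy data, invented for illustration only.
docs = [["python", "code", "code"], ["python", "snake"]]
tags = [["programming"], ["animal"]]
vocab = ["code", "python", "snake"]
corr = tag_word_correlation(docs, tags, vocab, ["animal", "programming"])
vec = extended_vector(docs[0], tags[0], vocab, corr)
```

In this toy setup the ambiguous word "python" receives extra weight from whichever tag the page carries, which is the kind of signal a count-only weighting scheme cannot provide.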
Year
2011
DOI
10.1007/978-3-642-23535-1_11
Venue
WAIM
Keywords
new vector space model, information retrieval system, semantic correlation, extended vector space model, web page word, vector space model, social annotation, novel vector space model, web page clustering, weighting scheme, term frequency, social tag
Field
Data mining, Annotation, Information retrieval, Web page, Computer science, Document clustering, Image retrieval, Explicit semantic analysis, Social Semantic Web, Vector space model, Cluster analysis
DocType
Conference
Volume
6897
ISSN
0302-9743
Citations
0
PageRank
0.34
References
19
Authors
6
Name            Order  Citations  PageRank
Xiwu Gu         1      59         14.39
Xianbing Wang   2      90         9.98
Ruixuan Li      3      405        69.47
Kunmei Wen      4      70         11.67
Yu-Fei Yang     5      123        12.15
Weijun Xiao     6      175        20.53